Thursday, March 2, 2017

A PowerShell function to rapidly gather system events for sysadmin eyes only with some tips

I suppose that we, sysadmins, have all been through that moment when an application developer bursts in behind your back just to tell you that his perfectly-coded application is getting stuck and he goes on stating that for sure something has happened at system level and you should already be checking.

This is the moment when being good at PowerShell comes to the rescue. Because with PowerShell you can quickly write a tool that allows you to check your event logs and extract just the right kind of information to show that the system has no issues (as it's often the case) and that he better be reviewing his software configuration.

PowerShell basically is your Splunk, but without the price tag.




That's the topic I am going to talk about in this post: I am going to show you how you can use PowerShell to gather event logs quickly from one or more computers, no matter the Windows version, and build a report of recent system issues, excluding each and every event coming from the upper application layers.

The first step here is to understand that there are two worlds: servers running Windows versions till Windows 2003 R2, and servers running Windows 2008 and above. These two types of servers have different engines for event logging and therefore Microsoft has provided two different cmdlets.

AN HISTORY OF TWO CMDLETS

The first cmdlet is Get-EventLog and is used with servers running Windows 2003 R2. I am sure you still have a few of those running. I do, so I have to take them into consideration when developing my function.

The second cmdlet is Get-WinEvent, which is used on newer system and, despite the fact that it runs much faster than Get-EventLog, it can't be used to check older systems.

So you have to write this function in a way that it first checks for the possibility to query the remote server using Get-WinEvent, and if it fails, it has to fail back to Get-EventLog.

LOGTYPE AND LOGISOLATION FOR THE WIN

Before you write the Get-WinEvent part, there are a few things to understand. As I said the aim of the function is to allow the system administrator to extract only the events that are related to the operating system. To do so, you have first to build a query to retrieve all of your logs:


After researching a bit, I have come to understanding that I have to rely mainly on two properties which I have highlighted in the screenshot above: LogType and LogIsolation.

LogType tells you the type of events that are logged in each log and in our case we want to just stick to Administrative events. This includes events from classic event logs, like System, Security or Application, and other interesting logs such as 'Microsoft-Windows-Hyper-V-Worker-Admin' or 'Microsoft-Windows-Kernel-EventTracing/Admin'.

Now unfortunately there are dozens of event logs that record administrative events, and we need to refine that list more if we want our function to include only events that actually may indicate a system issue.

This is where enters the game the LogIsolation property. This property indicates which ACL and which ETW sessions each event log is using: in our case we want to filter out all the logs that share the access control list with the Application event log as well as all the event logs that share the ETW session with other logs that have Application isolation.

Setting Get-WinEvent to filter on LogIsolation -eq 'system' will guarantee that we are not checking events that have been written by the applications.

Once we filter on those two properties, with the following line of code

Get-WinEvent -ListLog *| ? {($_.logtype -eq 'administrative') -and ($_.logisolation -eq 'system')}
we get a much shorter list of logs, and all of them are clearly under the responsibility of the system administrator:


No we have to find a way to perform this very same operation with Get-EventLog. That's easily accomplished, since we just have to choose one between three classic event logs: System, Application or Security.

For our purpose, System is the log we need.

Now there is a well-known issue with Get-EventLog: it is dramatically slow since it's been designed in a way that it retrieves the whole event log starting from the oldest record each time it's queried. That can lead to a painfully slow checkup of your environment if you are running your function against a large number of servers.

BOOSTING GET-EVENTLOG PERFORMANCE

The hint I can give you here is make it work the other way around by adding the -Newest parameter followed by the number of records you want to get: this forces Get-EventLog to start from the most recent event and to access the most recent ones by their unique index until it gets to fetch the asked number of items:

As you can test, with the -Newest parameter the query will be executed very fast. In my function I set its value to 1000 so I am pretty sure no recent critical events are being left over.

FILTER RIGHT (Yes, you read well)

But there is a problem: as you have understood, we are trying to build a tool that allows the system administrator to check if there is any kind of system issue, so we have to be able to limit the search window to the most recent hours. In Get-EventLog this is normally achieved with the -After parameter but the drawback of adding it is that it overrides the functioning of -Newest by bringing back the older-to-newer query mechanism. The performance impact is impressive:

"Newest With -after"
(Measure-Command { Get-EventLog -LogName System -Newest 1000 -After (Get-Date).AddHours(-24) }).TotalSeconds

"Newest with a Where-Object filtering"
(Measure-Command { Get-EventLog -LogName System -Newest 1000 | ? TimeGenerated -gt (Get-Date).AddHours(-24) }).TotalSeconds

Newest With -after
30.50407
Newest with a Where-Object filtering
0.8548345
That's one of the rare cases where right-filtering is faster than left-filtering.

Now we have all the required knowledge around Get-EventLog and Get-WinEvent to retrieve administrative events pretty quickly.

Before we continue, we have to understand one more thing: these two cmdlets bring back different object types which have different properties. Since our tool must be able to consolidate events from systems running possibly different versions of Windows, we need a way to match the property names brought back from those cmdlets.

CALCULATED PROPERTIES

A mechanism known as 'Calculated properties' is our ally here. We can use Select-Object to translate property names for objects returned from Get-EventLog to property names returned by Get-WinEvent:


Here we are basically translating TimeGenerated to TimeCreated, Source to ProviderName, EventId to Id and EntryType to LevelDisplayName. This way all the objects coming through our function will have the same property set and filtering will be done in a breeze.

EXCEPTIONS HANDLING

Any tool must be able to handle exceptions. This is achieved in PowerShell with Try/Catch. My experience is that the use of Get-WinEvent can raise three main exceptions. Here's the first two:

A target server that can't be reached:

Get-WinEvent -LogName System -ComputerName nobody
Get-WinEvent : The RPC server is unavailable
At line:1 char:1
+ Get-WinEvent -LogName System -ComputerName nobody
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Get-WinEvent], EventLogException
A target server that runs Windows 2003:

Get-WinEvent -LogName System -ComputerName IamWindows2003
Get-WinEvent : There are no more endpoints available from the endpoint mapper
At line:1 char:1
+ Get-WinEvent -LogName System -ComputerName IamWIndows2003
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Get-WinEvent], EventLogException
Both these exceptions are of type [System.Diagnostics.Eventing.Reader.EventLogException].

A third exception is raised by Get-WinEvent when no events are found for the given criteria:

Get-WinEvent -FilterHashtable @{LoGName='System';StartTime=(Get-Date).AddHours(-1)}
Get-WinEvent : No events were found that match the specified selection criteria.
At line:1 char:1
+ Get-WinEvent -FilterHashtable @{LoGName='System';StartTime=(Get-Date) ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (:) [Get-WinEvent], Exception
This looks to me like something that should evolve in PowerShell, since I don't see why having no events returned deserves being considered as an error. Here's you can see the source code of Get-WinEvent, which is where the exception is returned:


CONDITIONAL GET-EVENTLOG

As I said at the beginning of this post, we must run Get-EventLog only if Get-WinEvent fails with the exception whose message is 'There are no more endpoints available from the endpoint mapper'. This is how I accomplished this, with Catch then a pattern matching performed with Switch on the exception message:


GET-UNIQUE

There is one thing we really don't want our tool to do: returning three thousands times the same event happening over and over. It would make our report completely useless because no lazy sysadmin is going to scroll through an endless list of repeated events. Ever.

So the last part of our function is in charge of consolidating all identical events for a server into one and return just the date of the most recent occurrence.

The logic is to group by event id, then to sort by generation time in descending order, then select only the first item.

Here's a screenshot of that part of code. It's pretty easy so I won't go deeper in explaining it:


RUNSPACES, AGAIN

I have already showed the power of runspaces and of the PoshRSJob module by Microsoft MVP Boe Prox in a recent blog post. Even in the case of a tool for event retrieval, this is the module to have on your administration box. Nesting my function, which in the end I named Get-AdministrativeEvent into a Runspacepool makes the generation of a report of all my systems run so fast I can't even finish drinking my cup of coffee:

$Report = Start-RSJob -Throttle 20 -Verbose -InputObject ((Get-ADComputer -server dc01 -filter {(name -notlike 'win7*') -AND (OperatingSystem -Like "*Server*")} -searchbase "OU=SRV,DC=Domain,DC=Com").name) -FunctionsToLoad Get-AdministrativeEvent -ScriptBlock {Get-AdministrativeEvent $_ -HoursBack 3 -Credential $using:cred -Verbose} | Wait-RSJob -Verbose -ShowProgress | Receive-RSJob -Verbose

$Report | sort timecreated -descending | Out-GridView

JUST GET ME THE CODE

Here's the whole code for the function (which you can also find on my Github).

function Get-AdministrativeEvent {

<#
.Synopsis
The Get-AdministrativeEvent function retrieves the last critical administrative events on a local or remote computer
.EXAMPLE
Get-AdministrativeEvent -cred (get-credential domain\admin) -ComputerName srv01 -HoursBack 1
.EXAMPLE
$cred = get-credential
Get-AdministrativeEvent -cred $cred -ComputerName srv01 -HoursBack 24 | Sort-Object timecreated -Descending | Out-Gridview
.EXAMPLE
'srv01','srv02' | % { Get-AdministrativeEvent -HoursBack 1 -cred $cred -ComputerName $_ } | Sort-Object timecreated -Descending | ft * -AutoSize
.EXAMPLE
Get-AdministrativeEvent -HoursBack 36 -ComputerName (Get-ADComputer -filter *).name | sort timecreated -Descending | Out-GridView
.EXAMPLE
Get-AdministrativeEvent -cred $cred -ComputerName 'srv01','srv02' -HoursBack 12 | Out-Gridview
.EXAMPLE
$Report = Start-RSJob -Throttle 20 -Verbose -InputObject ((Get-ADComputer -server dc01 -filter {(name -notlike 'win7*') -AND (OperatingSystem -Like "*Server*")} -searchbase "OU=SRV,DC=Domain,DC=Com").name) -FunctionsToLoad Get-AdministrativeEvent -ScriptBlock {Get-AdministrativeEvent $_ -HoursBack 3 -Credential $using:cred -Verbose} | Wait-RSJob -Verbose -ShowProgress | Receive-RSJob -Verbose
$Report | sort timecreated -descending | Out-GridView
.EXAMPLE
$Servers = ((New-Object -typename ADSISearcher -ArgumentList @([ADSI]"LDAP://domain.com/dc=domain,dc=com","(&(&(sAMAccountType=805306369)(objectCategory=computer)(operatingSystem=*Server*)))")).FindAll()).properties.name
$Report = Start-RSJob -Throttle 20 -Verbose -InputObject $Servers -FunctionsToLoad Get-AdministrativeEvent -ScriptBlock {Get-AdministrativeEvent $_ -Credential $using:cred -HoursBack 48 -Verbose} | Wait-RSJob -Verbose -ShowProgress | Receive-RSJob -Verbose
$Report | format-table * -AutoSize
.NOTES
happysysadm.com
@sysadm2010
#>

    [CmdletBinding()]
    Param
    (
        # List of computers
        [Parameter(Mandatory=$true,
                   ValueFromPipeline=$true,
                   ValueFromPipelineByPropertyName=$true,
                   Position=0)]
        [Alias('Name','CN')] 
        [string[]]$ComputerName,

        # Specifies a user account that has permission to perform this action
        [Parameter(Mandatory=$false)]
        [System.Management.Automation.PSCredential]
        [System.Management.Automation.Credential()]
        $Credential = [System.Management.Automation.PSCredential]::Empty,

        #Number of hours to go back to when retrieving events
        [int]$HoursBack = 1

    )

    Begin

        {

        Write-Verbose "$(Get-Date) - Started."

        $AllResults = @()

        }
    
    Process
        {

        foreach($Computer in $ComputerName) {
    
            $Result = $Null

            write-verbose "$(Get-Date) - Working on $Computer - Eventlog"

            $starttime = (Get-Date).AddHours(-$HoursBack)
    
            try {

                write-verbose "$(Get-Date) - Trying with Get-WinEvent"
    
                $result = Get-WinEvent -ErrorAction stop -Credential $credential -ComputerName $Computer -filterh @{LogName=(Get-WinEvent -Computername $Computer -ListLog *| ? {($_.logtype -eq 'administrative') -and ($_.logisolation -eq 'system')} | ? recordcount).logname;StartTime=$starttime;Level=1,2} | select machinename,timecreated,providername,logname,id,leveldisplayname,message

                }

            catch [System.Diagnostics.Eventing.Reader.EventLogException] {
        
                switch -regex ($_.Exception.Message) {

                    "RPC" { 
            
                        Write-Warning "$(Get-Date) - RPC error while communicating with $Computer"
                
                        $Result = 'RPC error'
                
                        }
        
                    "Endpoint" { 
            
                        write-verbose "$(Get-Date) - Trying with Get-EventLog for systems older than Windows 2008"

                        try { 
                
                            $sysevents = Get-EventLog -ComputerName $Computer -LogName system -Newest 1000 -EntryType Error -ErrorAction Stop | `

                                            ? TimeGenerated -gt $starttime | `

                                            select MachineName,
                                            
                                                   @{Name='TimeCreated';Expression={$_.TimeGenerated}},
                                                   
                                                   @{Name='ProviderName';Expression={$_.Source}},
                                                   
                                                   LogName,
                                                   
                                                   @{Name='Id';Expression={$_.EventId}},
                                                   
                                                   @{Name='LevelDisplayName';Expression={$_.EntryType}},
                                                   
                                                   Message

                            if($sysevents) {

                                $result = $sysevents

                                }

                            else {

                                Write-Warning "$(Get-Date) - No events found on $Computer"
                        
                                $result = 'none'

                                }
                    
                            }

                        catch { $Result = 'error' }
                
                        }

                    Default { Write-Warning "$(Get-Date) - Error retrieving events from $Computer" }
                
                    }

                }
        
            catch [Exception] {
        
                Write-Warning "$(Get-Date) - No events found on $Computer"
        
                $result = 'none'

                }

        if(($result -ne 'error') -and ($result -ne 'RPC error') -and ($result -ne 'none')) {

            Write-Verbose "$(Get-Date) - Consolidating events for $Computer"
            
            $lastuniqueevents = $null

            $lastuniqueevents = @()
            
            $ids = ($result | select id -unique).id

            foreach($id in $ids){

                $machineevents = $result | ? id -eq $id

                $lastuniqueevents += $machineevents | sort timecreated -Descending | select -first 1

                }

            $AllResults += $lastuniqueevents

            }
    
        }

    }

    End {
        
        Write-Verbose "$(Get-Date) - Finished."
    
        $AllResults
        
        }

}
Let me go if you have any idea on how to improve it, or if you have any question, do not hesitate to ask.

1 comment:

  1. Created this for non-2003 boxes: https://gallery.technet.microsoft.com/scriptcenter/Query-administrative-f530fb24

    ReplyDelete

Related Posts Plugin for WordPress, Blogger...