Enumerating process, thread, and image load notification callback routines in Windows

Windows exposes a wide variety of kernel-mode callback routines that driver developers can register to receive event notifications. This blog post explains how some of them work under the hood. In particular, we'll investigate how the process creation and termination callbacks (nt!PsSetCreateProcessNotifyRoutine, nt!PsSetCreateProcessNotifyRoutineEx, and nt!PsSetCreateProcessNotifyRoutineEx2), thread creation and termination callbacks (nt!PsSetCreateThreadNotifyRoutine and nt!PsSetCreateThreadNotifyRoutineEx), and image load notification callbacks (nt!PsSetLoadImageNotifyRoutine) work internally. We'll also release a handy WinDbg script that lets you enumerate these different types of callbacks.

If you'd like to follow along, I'll be using system files from Windows x64 10.0.15063 (Creators Update). All pseudo-source and disassembly is reconstructed from that specific release.

Don't have a kernel debugging environment set up? Don't fret. You can follow our tutorial, "Setting up kernel debugging using WinDbg and VMware", to get a basic environment running.

Without further ado, let's begin.

What do these callbacks do?

Driver developers use these callbacks to receive notifications when certain events happen. For example, the basic process creation callback, nt!PsSetCreateProcessNotifyRoutine, registers a user-defined function pointer ("NotifyRoutine") that Windows invokes each time a process is created or deleted. As part of the event notification, the supplied handler gets a wealth of information. In our example, this includes the parent process's PID (if one exists), the PID of the process itself, and a boolean value that tells us whether the process is being created or is terminating.

Security software relies on these callbacks to inspect processes, threads, and images as they appear on the machine.

Divin' deep

The documented APIs

Our investigation has to begin somewhere. What better place than at the start of a documented function? We turn to nt!PsSetCreateProcessNotifyRoutine. MSDN claims that this routine has been around since Windows 2000. Even our friends at ReactOS seem to have implemented this functionality a long time ago. We'll see exactly how (if at all) things have changed in the 17 years from Windows 2000 until now.

This function appears to simply call an implementation routine, nt!PspSetCreateProcessNotifyRoutine. In fact, the same routine is invoked by the other variations, nt!PsSetCreateProcessNotifyRoutineEx and nt!PsSetCreateProcessNotifyRoutineEx2.

The only difference is the second parameter passed to nt!PspSetCreateProcessNotifyRoutine, which is effectively a set of flags. In the base case (nt!PsSetCreateProcessNotifyRoutine), the flags are either 1 or 0 depending on the "Remove" parameter: if "Remove" is TRUE, Flags=1; if "Remove" is FALSE, Flags=0. In the extended case (nt!PsSetCreateProcessNotifyRoutineEx), the flags take on the value 2 or 3.

Finally, for nt!PsSetCreateProcessNotifyRoutineEx2, the flags take on the value 6 or 7.

Therefore, one can infer that the flags passed to nt!PspSetCreateProcessNotifyRoutine form a bitmask: bit 0 (0x1) means "Remove", bit 1 (0x2) marks a registration made through one of the extended routines, and bit 2 (0x4) is additionally set for nt!PsSetCreateProcessNotifyRoutineEx2.
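As a rough illustration, the inferred flag values can be written down in Python (the language of the script later in this post). The names NotifyFlags and describe are my own inventions for this sketch; only the numeric values come from the observations above.

```python
from enum import IntFlag


class NotifyFlags(IntFlag):
    """Hypothetical names; only the numeric values come from the observed calls."""
    REMOVE = 0x1  # set when the caller passes Remove=TRUE
    EX = 0x2      # set by both PsSetCreateProcessNotifyRoutineEx and ...Ex2
    EX2 = 0x4     # additionally set by PsSetCreateProcessNotifyRoutineEx2


def describe(flags):
    """Map a raw flags value to (registering API, action)."""
    flags = NotifyFlags(flags)
    if NotifyFlags.EX2 in flags:
        api = "PsSetCreateProcessNotifyRoutineEx2"
    elif NotifyFlags.EX in flags:
        api = "PsSetCreateProcessNotifyRoutineEx"
    else:
        api = "PsSetCreateProcessNotifyRoutine"
    action = "remove" if NotifyFlags.REMOVE in flags else "add"
    return api, action


# Walk through the six values seen in the three documented wrappers.
for value in (0, 1, 2, 3, 6, 7):
    print(value, describe(value))
```

Treat this as a reading aid rather than a header definition: Microsoft doesn't publish names for these bits.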

The undocumented world

nt!PspSetCreateProcessNotifyRoutine is slightly complicated. I've defined it below, but I strongly recommend opening it in another window and following the text to ease understanding.

Luckily for us, a lot of the internal data structures related to callback routines haven't changed since Windows 2000. The trailblazers at ReactOS have been spot-on with their structure definitions so we'll use them, when possible, to avoid duplicating work.

For each callback type, there's a global array that can contain up to 64 entries. In our case, the start of this array for process creation callbacks is located at nt!PspCreateProcessNotifyRoutine. Each entry in this array is of type _EX_CALLBACK, which in the ReactOS definitions wraps a single _EX_FAST_REF member (RoutineBlock).

To avoid synchronization problems, nt!ExReferenceCallBackBlock is used to safely acquire a reference to the underlying callback object, _EX_CALLBACK_ROUTINE_BLOCK (documented below). We can reproduce the same lookup in a non-thread-safe way by masking off the low reference-count bits of the _EX_FAST_REF value, which leaves the pointer to the block.
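Here's a minimal sketch of that non-thread-safe equivalent in Python. It assumes the _EX_CALLBACK slot holds an _EX_FAST_REF whose low bits are a reference count (4 bits on x64, 3 bits on x86); decode_fast_ref is a hypothetical helper name and the sample value is made up.

```python
def decode_fast_ref(value, is_64bit=True):
    """Split a raw _EX_FAST_REF value into (block pointer, reference count).

    Assumption: the low 4 bits (x64) or 3 bits (x86) hold the count, which
    works because the pointed-to block is aligned to 16 or 8 bytes.
    """
    ref_bits = 4 if is_64bit else 3
    mask = (1 << ref_bits) - 1
    return value & ~mask, value & mask


# Made-up slot value: a 16-byte-aligned kernel address with a count of 0xF.
block, ref_count = decode_fast_ref(0xFFFF9F8A1C2B3A4F)
print(hex(block), ref_count)
```

Unlike nt!ExReferenceCallBackBlock, this takes no reference at all, so it's only appropriate from a debugger where the target is frozen.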

If we're deleting a callback object ("Remove" is TRUE), we need to find the matching _EX_CALLBACK_ROUTINE_BLOCK in the array. This is done by first checking whether the target "NotifyRoutine" matches that of the current _EX_CALLBACK_ROUTINE_BLOCK, using nt!ExGetCallBackBlockRoutine.

Then, we check that it's the right type (that is, created with the matching variant of nt!PsSetCreateProcessNotifyRoutine/Ex/Ex2) by using nt!ExGetCallBackBlockContext.

At this point, we've found the entry in the array. We erase it by setting the _EX_CALLBACK value to NULL via nt!ExCompareExchangeCallBack, decrementing the appropriate global counter (nt!PspCreateProcessNotifyRoutineExCount or nt!PspCreateProcessNotifyRoutineCount), dereferencing the _EX_CALLBACK_ROUTINE_BLOCK with nt!ExDereferenceCallBackBlock, waiting for any other code still using the _EX_CALLBACK (nt!ExWaitForCallBacks), and finally freeing the memory (nt!ExFreePoolWithTag). As you can see, Microsoft takes great care not to free a callback object that is in use.

If we can't find the entry to remove in the nt!PspCreateProcessNotifyRoutine array after checking all 64 slots, the STATUS_PROCEDURE_NOT_FOUND status code is returned.
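The removal path can be modelled in a few lines of Python. This is a simplified, single-threaded sketch, not the kernel's code: each slot is either None or a (routine, context) tuple, the synchronization and freeing steps are reduced to comments, and I'm assuming a non-zero context means the block was registered through an extended routine. The function name and the counters dictionary are hypothetical.

```python
STATUS_SUCCESS = 0x00000000
STATUS_PROCEDURE_NOT_FOUND = 0xC000007A
MAX_CALLBACKS = 64


def remove_callback(slots, counters, routine, context):
    """Model of the "Remove" path over a 64-entry callback array."""
    for index in range(MAX_CALLBACKS):
        block = slots[index]
        if block is None:
            continue
        block_routine, block_context = block
        # Stand-ins for ExGetCallBackBlockRoutine / ExGetCallBackBlockContext.
        if block_routine != routine or block_context != context:
            continue
        # Stand-in for ExCompareExchangeCallBack(slot, NULL, block).
        slots[index] = None
        # Assumption: non-zero context selects the "Ex" counter.
        counters["ex" if context else "plain"] -= 1
        # The real code now dereferences the block, waits for in-flight
        # callers (ExWaitForCallBacks), and frees the allocation.
        return STATUS_SUCCESS
    return STATUS_PROCEDURE_NOT_FOUND


slots = [None] * MAX_CALLBACKS
slots[3] = (0x1000, 0)  # made-up routine address registered via the base API
counters = {"plain": 1, "ex": 0}
print(hex(remove_callback(slots, counters, 0x1000, 0)))
```

Note that a routine registered through one API variant isn't found when removal is requested through another, because the context comparison fails.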

On the other hand, if we're adding a new entry to the callback array, things are a little easier. A sanity check is performed by nt!MmVerifyCallbackFunctionCheckFlags to ensure that the "NotifyRoutine" resides in a loaded module. This helps prevent unlinked drivers (or shellcode) from receiving callback events.

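After we pass the sanity check, an _EX_CALLBACK_ROUTINE_BLOCK is allocated via nt!ExAllocateCallBack. This routine confirms the size and layout of the _EX_CALLBACK_ROUTINE_BLOCK structure, which in the ReactOS definitions consists of a rundown reference, the callback function pointer, and a context value.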
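For reference, the x64 layout can be mirrored with Python's ctypes. This is a sketch based on the ReactOS definition (RundownProtect, then Function, then Context); the class name and the sample bytes are hypothetical, and the field sizes assume 8-byte pointers.

```python
import ctypes
import struct


class ExCallbackRoutineBlock64(ctypes.LittleEndianStructure):
    """x64 view of _EX_CALLBACK_ROUTINE_BLOCK, following ReactOS field names."""
    _fields_ = [
        ("RundownProtect", ctypes.c_uint64),  # EX_RUNDOWN_REF (pointer-sized)
        ("Function", ctypes.c_uint64),        # the registered NotifyRoutine
        ("Context", ctypes.c_uint64),         # type information for the block
    ]


# Made-up bytes standing in for memory read from a debugger.
raw = struct.pack("<QQQ", 0, 0xFFFFF80012345678, 2)
block = ExCallbackRoutineBlock64.from_buffer_copy(raw)

print(ctypes.sizeof(ExCallbackRoutineBlock64))  # 24 bytes on x64
print(hex(block.Function), block.Context)
```

The detail that matters for enumeration is that the function pointer sits one pointer-width into the block.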

To wrap up, the newly allocated _EX_CALLBACK_ROUTINE_BLOCK is placed in a free (NULL) slot of the nt!PspCreateProcessNotifyRoutine array using nt!ExCompareExchangeCallBack, which also ensures that the 64-entry limit isn't exceeded. Finally, the appropriate global counter is incremented and a global flag is set in nt!PspNotifyEnableMask denoting that callbacks of the user-specified type are registered on the system.
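The add path can be modelled the same way. Again this is a single-threaded sketch under stated assumptions: the bit values MASK_PLAIN and MASK_EX are placeholders and not the real nt!PspNotifyEnableMask layout, the state dictionary and function names are hypothetical, and only the steps described above are covered. STATUS_INVALID_PARAMETER is the status Microsoft documents for hitting the registration limit.

```python
STATUS_SUCCESS = 0x00000000
STATUS_INVALID_PARAMETER = 0xC000000D
MAX_CALLBACKS = 64

# Placeholder bit values, NOT the real nt!PspNotifyEnableMask layout.
MASK_PLAIN = 0x1
MASK_EX = 0x2


def new_state():
    """Hypothetical container for the array, counters, and enable mask."""
    return {"slots": [None] * MAX_CALLBACKS, "count": 0,
            "ex_count": 0, "enable_mask": 0}


def add_callback(state, routine, context):
    """Model of the add path: claim the first free slot, then update globals."""
    block = (routine, context)  # stand-in for ExAllocateCallBack
    for index in range(MAX_CALLBACKS):
        if state["slots"][index] is None:
            # Stand-in for ExCompareExchangeCallBack(slot, block, NULL).
            state["slots"][index] = block
            break
    else:
        # All 64 slots are taken; the real code frees the block here.
        return STATUS_INVALID_PARAMETER
    is_ex = bool(context)  # assumption: non-zero context means "Ex"
    state["ex_count" if is_ex else "count"] += 1
    state["enable_mask"] |= MASK_EX if is_ex else MASK_PLAIN
    return STATUS_SUCCESS


state = new_state()
print(hex(add_callback(state, 0x1000, 0)), state["count"])
```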

The other callbacks

Thankfully, thread and image load callbacks are very similar to process callbacks. They utilize the same underlying data structures. The only differences are that thread creation/termination callbacks are stored in the nt!PspCreateThreadNotifyRoutine array and image load notification callbacks are stored in the nt!PspLoadImageNotifyRoutine array.

The script

It's finally time to put what we know to good use. Using WinDbg, we can create a simple script to automagically enumerate process, thread, and image callback routines.

Instead of leveraging WinDbg's built-in scripting engine, I've elected to use something a little less disgusting. There's a great third-party extension for WinDbg called PyKd that enables Python scripting in WinDbg. Installing it is very straightforward. You'll need a copy of Python with the matching bitness (e.g. 64-bit Python for a 64-bit install of WinDbg) for this to work.

The script should be easy to follow. I tried to document it as best I could. It should also be compatible, at a minimum, with all forms of Windows from XP and up (both 32-bit and 64-bit flavors).
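To show the core idea in isolation, here's a minimal sketch of the enumeration loop. It isn't the full script: it takes the memory reader as a parameter so it can run outside a debugger, and it assumes the slot and block layouts discussed earlier (one pointer-sized fast reference per slot, function pointer one pointer-width into the block). The function name, the fake memory dictionary, and all addresses are made up.

```python
def enumerate_callbacks(read_ptr, array_base, ptr_size=8, max_entries=64):
    """Return (slot index, callback function address) for each used slot.

    read_ptr: callable taking an address and returning the pointer-sized
              value stored there.
    """
    ref_mask = 0xF if ptr_size == 8 else 0x7  # low bits hold a ref count
    found = []
    for index in range(max_entries):
        slot = read_ptr(array_base + index * ptr_size)
        block = slot & ~ref_mask
        if block == 0:
            continue  # empty slot
        # Skip the rundown reference to reach the function pointer.
        function = read_ptr(block + ptr_size)
        found.append((index, function))
    return found


# Fake memory standing in for the target: two registered callbacks.
memory = {
    0x1000: 0x2000 | 0xF,          # slot 0 -> block at 0x2000
    0x2008: 0xFFFFF80000001000,    # block 0's function pointer
    0x1000 + 5 * 8: 0x3000 | 0x3,  # slot 5 -> block at 0x3000
    0x3008: 0xFFFFF80000002000,    # block 5's function pointer
}
results = enumerate_callbacks(lambda address: memory.get(address, 0), 0x1000)
for index, function in results:
    print(index, hex(function))
```

Inside WinDbg, my assumption is that read_ptr could be PyKd's pykd.ptrPtr and array_base the result of pykd.getOffset("nt!PspCreateProcessNotifyRoutine"), with the same loop reused for nt!PspCreateThreadNotifyRoutine and nt!PspLoadImageNotifyRoutine. Check the PyKd documentation for your version before relying on those calls.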

After running the script using the "!py" command, you should see output similar to this:

Final thoughts

Knowing how the callback system functions in Windows allows us to do very interesting things. As seen above, we're able to programmatically iterate through each callback array and discover all registered callbacks. This is very useful for forensic purposes.

Furthermore, these underlying arrays aren't under the protection of PatchGuard. Since registering callbacks is more or less a requirement for any anti-virus driver that wants to be useful while still playing nicely with PatchGuard on x64 systems, malware could dynamically disable (or replace) these registered callbacks to thwart security products. The possibilities are endless.

Special thanks to the folks at ReactOS for their meticulous documentation. In particular, most of the structures I used were identified by Alex Ionescu for ReactOS a long time ago. Additionally, kudos to the folks that make PyKd. It's a much better alternative to the native scripting interface for WinDbg, in my opinion!

As always, if y'all have any questions or comments, please feel free to comment below. Suggestions are greatly appreciated too! 

