Loading kernel symbols - VMM debugging using VMware's GDB stub and IDA Pro - Part 2
This article assumes you've read the first part of the series. In particular, at this point you should have successfully setup VMware's GDB stub and IDA Pro's GDB debugger. You should now be in a connected state and broken into IDA Pro's debugger GUI.
Furthermore, the focus of this post is going to be exclusively on loading kernel symbols for 64-bit editions of Windows (AMD64). Different operating systems (and different architectures of Windows) require slight modifications to the article's logic.
Where's
We are then able to force IDA Pro to load symbol data (PDBs) at ntoskrnl's base address to have useful debugging information. From there, we can enumerate the linked list, nt!PsLoadedModuleList, to figure out where other kernel mode components are located. However, this isn't trivial. When you break in to IDA Pro's GDB debugger, it's difficult to know what state you'll be in on any given processor. You might be executing code in a usermode process, or you might be busy servicing a system call. Additionally, you're further restricted to the functionality the GDB stub exposes.
It seems very relevant to us. Let's give it a try:
Looks like it's working. This is the same output we received from the GDB stub when we issued the help command.
Easy!
To get to the handler (e.g. where the processor moves control to when an interrupt occurs), the target address is built from the OffsetHigh, OffsetMiddle, and OffsetLow fields of this structure using the following algorithm: HandlerAddress = ((OffsetHigh << 32) + (OffsetMiddle << 16) + OffsetLow).
We'll leverage the Dbg* commands to read virtual memory from IDAPython. Since we're extracting the first IDT entry, we can just read directly from the start of our idt_base:
This shows us that the handler for the first IDT entry (nt!KiDivideErrorFault) is loaded at 0xfffff802c27f4300. If we wanted to read the N'th IDT entry, we'd have to index into the array by adding 0x10, the size of a _KIDTENTRY64, times the location in the array (in this case N). So, to index into the 3rd IDT entry, we'd apply the following math: idt_entry = idt_base + (0x10 * 2).
Voila! The base address of ntoskrnl is discovered at 0xfffff802c2680000.
The important line appears on the bottom; the base address of ntoskrnl is displayed. It checks out with the work we did by hand too.
First, we'll need to grab a copy of ntoskrnl on the VM. Don't use the version on your host as this may not match with what's on the VM. This'll be found in your guest's system directory:
You might need to resume your VM if you're currently active in IDA's GDB debugger by selecting "Debugger" and then "Continue process" (or by hitting F9) from the menu bar.
After you've pulled ntoskrnl from your VM, break into IDA's GDB debugger by selecting "Suspend". Now, we must load it by selecting "File" then "Load file" and finally "PDB file..."
Find where you copied ntoskrnl to on your host and use the address that the script found:
It'll take IDA at least a couple of minutes to fully finish the loading process. You can see IDA's progress in the bottom left corner:
You'll know IDA's finished when the status changes to "AU: idle".
Furthermore, the focus of this post is going to be exclusively on loading kernel symbols for 64-bit editions of Windows (AMD64). Different operating systems (and different architectures of Windows) require slight modifications to the article's logic.
Where's Waldo ntoskrnl?
The end goal
The first and most important thing is to discover where the NT Kernel (ntoskrnl.exe) is loaded in memory since it's not at any fixed (static) address thanks to address space layout randomization (ASLR).We are then able to force IDA Pro to load symbol data (PDBs) at ntoskrnl's base address to have useful debugging information. From there, we can enumerate the linked list, nt!PsLoadedModuleList, to figure out where other kernel mode components are located. However, this isn't trivial. When you break in to IDA Pro's GDB debugger, it's difficult to know what state you'll be in on any given processor. You might be executing code in a usermode process, or you might be busy servicing a system call. Additionally, you're further restricted to the functionality the GDB stub exposes.
Enter the _KPCR
On all architectures and versions of Windows, each processor maintains a control structure dubbed as the _KPCR (Kernel Processor Control Region). This structure is massive and it can be used to infer exactly what the processor is doing. On Windows 10 (15063.0.amd64fre.rs2_release.170317-1834), the _KPCR is 0x6bc0 bytes large. It contains many kernel pointers that we can leverage to figure out exactly where the base of ntoskrnl is in memory. A link detailing the members of the _KPCR can be found here.
This structure can be accessed through its virtual address or through the fs segment on x86 and the gs segment on x64. In fact, if you've done any reverse engineering of the Windows kernel, you should have seen many examples of Windows itself accessing members of the _KPCR through the segment selector.
For example, when an int 3 (a software breakpoint; 0xCC) is executed by the processor, control is redirected by the CPU to a handler registered in the appropriate position of the IDT (Interrupt Descriptor Table). We'll touch more on this process later. In Windows, the handler for software breakpoints is nt!KiBreakpointTrap. Here is a snippet of the assembly code of the handler under AMD64:
In particular, at address 0x00000001401749FD we see a swapgs instruction. Since the gs selector means different things in user-mode (_TEB) and the kernel (_KPCR), this instruction is utilized to ensure that we're operating on the kernel-mode construct (_KPCR). Immediately following that instruction at address 0x0000000140174A00, we have an access of the gs segment with a mov r10, gs:188h. The astute reader will realize that upon execution of this instruction, r10 will contain the pointer from the _KPCR.Prcb.CurrentThread. This is discerned from the definition of the structure's members posted above. A breakdown of this process can be illustrated below:
We don't know the _KPCR's exact linear address (it too isn't allocated at a fixed location), but we should be able to access it through the segment selector, though, just like the Windows kernel does. This approach might seem like the ideal one, but, unfortunately, we're further restricted by the functionality of the GDB stub. Let's see what the GDB stub exposes by issuing help:
There are only three major commands available: help, r, and linuxoffsets. We've just executed help, and linuxoffsets isn't relevant to us since we're debugging a Windows kernel. The only other command is r. At first, r looks very useful to us. However, on closer examination, we can see that the GDB stub is unable to read arbitrary offsets off of the gs selector, e.g. the _KPCR.Prcb.CurrentThread from gs:188h by executing r gs:188h.
At least executing r gs without an offset produces data:
This command should get us the base of the gs selector. We then should be able to define a _KPCR structure at that location using IDA Pro. According to the GDB stub, though, our base is 0. If we go to that memory location in the "IDA View - RIP" tab by pressing 'G' and entering 0 in the "Jump to address" window, we don't see anything there:
Unfortunately, you're a sucker for pain and are following this tutorial down to the T. In 64-bit (long) mode on x64, the cs, ss, ds, and es segment selectors have a zero-forced base address. gs and fs are the exceptions and have a non-zero base address. So, how is it possible that the base of the gs selector is 0 when Windows itself uses the segment selector to retrieve processor state?
The answer is in the model-specific registers, MSRs. MSRs are per-processor registers that can be read via rdmsr and written via wrmsr instructions. On x64, the IA32_GS_BASE (0xC0000101) and IA32_KERNEL_GS_BASE (0xC0000102) MSRs are used for storage of the base address of the gs selector. swapgs was introduced to exchange the address of the current gs base register with the value contained in the IA32_KERNEL_GS_BASE MSR.
This means that we could, theoretically, read the IA32_GS_BASE MSR if we're executing code in CPL0 (ring0/kernel-mode). This would get us the base address of the gs segment. However, that's not directly possible through the VMware GDB stub. There is no support for reading or writing to MSRs directly.
This structure can be accessed through its virtual address or through the fs segment on x86 and the gs segment on x64. In fact, if you've done any reverse engineering of the Windows kernel, you should have seen many examples of Windows itself accessing members of the _KPCR through the segment selector.
For example, when an int 3 (a software breakpoint; 0xCC) is executed by the processor, control is redirected by the CPU to a handler registered in the appropriate position of the IDT (Interrupt Descriptor Table). We'll touch more on this process later. In Windows, the handler for software breakpoints is nt!KiBreakpointTrap. Here is a snippet of the assembly code of the handler under AMD64:
In particular, at address 0x00000001401749FD we see a swapgs instruction. Since the gs selector means different things in user-mode (_TEB) and the kernel (_KPCR), this instruction is utilized to ensure that we're operating on the kernel-mode construct (_KPCR). Immediately following that instruction at address 0x0000000140174A00, we have an access of the gs segment with a mov r10, gs:188h. The astute reader will realize that upon execution of this instruction, r10 will contain the pointer from the _KPCR.Prcb.CurrentThread. This is discerned from the definition of the structure's members posted above. A breakdown of this process can be illustrated below:
We don't know the _KPCR's exact linear address (it too isn't allocated at a fixed location), but we should be able to access it through the segment selector, though, just like the Windows kernel does. This approach might seem like the ideal one, but, unfortunately, we're further restricted by the functionality of the GDB stub. Let's see what the GDB stub exposes by issuing help:
There are only three major commands available: help, r, and linuxoffsets. We've just executed help, and linuxoffsets isn't relevant to us since we're debugging a Windows kernel. The only other command is r. At first, r looks very useful to us. However, on closer examination, we can see that the GDB stub is unable to read arbitrary offsets off of the gs selector, e.g. the _KPCR.Prcb.CurrentThread from gs:188h by executing r gs:188h.
At least executing r gs without an offset produces data:
This command should get us the base of the gs selector. We then should be able to define a _KPCR structure at that location using IDA Pro. According to the GDB stub, though, our base is 0. If we go to that memory location in the "IDA View - RIP" tab by pressing 'G' and entering 0 in the "Jump to address" window, we don't see anything there:
What changed from x86 to x64?
If you ran this test on a VM running on an x86 (32-bit) version of Windows and substituted fs for gs, the base of the fs selector would not be 0. It would be a valid memory location. You would then have the address of the _KPCR and could continue on your merry way.
The answer is in the model-specific registers, MSRs. MSRs are per-processor registers that can be read via rdmsr and written via wrmsr instructions. On x64, the IA32_GS_BASE (0xC0000101) and IA32_KERNEL_GS_BASE (0xC0000102) MSRs are used for storage of the base address of the gs selector. swapgs was introduced to exchange the address of the current gs base register with the value contained in the IA32_KERNEL_GS_BASE MSR.
This means that we could, theoretically, read the IA32_GS_BASE MSR if we're executing code in CPL0 (ring0/kernel-mode). This would get us the base address of the gs segment. However, that's not directly possible through the VMware GDB stub. There is no support for reading or writing to MSRs directly.
A shimmer in the shadows
Nevertheless, through persistence, we come up with an approach that plays nicely given our constraints. There are multiple ways to skin a cat and this approach may not be the most elegant solution, but it should work nicely for all x64 Windows kernels.
The basic idea is to leverage the IDT, the interrupt descriptor table, to find a symbol that's in the address space of ntoskrnl. We can access the idtr, a register that houses the IDT, through the GDB stub:
Once we have the base of the IDT, in our case 0xfffff802c4850000, we can access the first entry of the IDT. This should resolve to a symbol within ntoskrnl (nt!KiDivideErrorFault):
From there, we can walk kernel memory backwards until we get to a valid PE header. Since the symbol is contained within ntoskrnl's address space, the first valid PE header should belong to ntoskrnl:
The basic idea is to leverage the IDT, the interrupt descriptor table, to find a symbol that's in the address space of ntoskrnl. We can access the idtr, a register that houses the IDT, through the GDB stub:
Once we have the base of the IDT, in our case 0xfffff802c4850000, we can access the first entry of the IDT. This should resolve to a symbol within ntoskrnl (nt!KiDivideErrorFault):
From there, we can walk kernel memory backwards until we get to a valid PE header. Since the symbol is contained within ntoskrnl's address space, the first valid PE header should belong to ntoskrnl:
Figure 1: Layout of kernel memory.
Writing an IDA script using IDAPython
It'd be nice to programmatically implement the algorithm described above so we don't need to manually go through it each time we're trying to discover the base address of ntoskrnl. We'll do this by writing a script for IDA Pro to run. I chose to do this with IDAPython instead of IDC (IDA's C-like bindings) because of the niceties that Python provides (like string manipulation).
The basics
We'll start by switching the input from "GDB" to "Python" in the "Output window". If your "Output window" is missing, you can restore it by selecting "Windows" and then "Output window" from the menu bar:
We can see all the functionality exposed by IDAPython by executing the Python command dir() in the text box. If you try to do this, you'll see lots of output. It's easy to feel overwhelmed. Luckily, there exists amble documentation on the Hex-Rays website that can help us navigate these murky waters.
I try to find useful things by searching for it first in the dir() listing. You can position your cursor in the "Output window" and press Alt+T to search for a keyword. To find the next occurrence, you can hit Ctrl+T. If this fails, I move on to the documentation.
Sending a command to the GDB stub
Our first task is to figure out how to send a command to the GDB stub. If you search for the "command" keyword in the "Output window" you'll find something labeled "SendDbgCommand". Let's see what this function does by executing help(SendDbgCommand):
It seems very relevant to us. Let's give it a try:
Looks like it's working. This is the same output we received from the GDB stub when we issued the help command.
Parsing the response from the GDB stub
Now that we know how to send a command to the GDB stub, we need to issue a command to retrieve the contents of the idtr. We then parse and extract the base address from the resulting string.
It's important to tell Python that we're working with an integer object by "casting" the string to an integer-type:Easy!
Getting the first IDT entry's handler
We have the base of the IDT in idt_base. Our next task is to retrieve the first entry in the IDT. The IDT is effectively an array that contains 256 IDT entries (0-0xFF) on x64. The format of the IDT is dictated by the architecture of the processor (e.g. Intel x64). Each IDT entry on x64 takes the following form:To get to the handler (e.g. where the processor moves control to when an interrupt occurs), the target address is built from the OffsetHigh, OffsetMiddle, and OffsetLow fields of this structure using the following algorithm: HandlerAddress = ((OffsetHigh << 32) + (OffsetMiddle << 16) + OffsetLow).
We'll leverage the Dbg* commands to read virtual memory from IDAPython. Since we're extracting the first IDT entry, we can just read directly from the start of our idt_base:
This shows us that the handler for the first IDT entry (nt!KiDivideErrorFault) is loaded at 0xfffff802c27f4300. If we wanted to read the N'th IDT entry, we'd have to index into the array by adding 0x10, the size of a _KIDTENTRY64, times the location in the array (in this case N). So, to index into the 3rd IDT entry, we'd apply the following math: idt_entry = idt_base + (0x10 * 2).
Finding the base address from a symbol within ntoskrnl
First, we'll define a simple helper function that will align addresses to their page boundaries. This will help speed up our lookup because we know that the base address of ntoskrnl will be on a page boundary:
We'll then create a very simple loop to walk memory backwards (on a page-aligned boundary) searching for the magical value 0x5A4D, commonly known as 'MZ' (IMAGE_DOS_SIGNATURE). This value signifies the start of the IMAGE_DOS_HEADER which is also the base address of an image:Voila! The base address of ntoskrnl is discovered at 0xfffff802c2680000.
Creating the final version of the script
After some refactoring and code tidying (including error checking), we produce a much better version of the script. This does the same thing as the commands we inserted in the IDAPython "Output window":
Save a copy of the script to your local drive. We are then able to run it at any time by going to "File" and then "Script file..." in the IDA Pro GUI. A sample of the output is listed below:The important line appears on the bottom; the base address of ntoskrnl is displayed. It checks out with the work we did by hand too.
Loading ntoskrnl at its base address
We mustn't forget the final objective: loading kernel symbols. We're almost at the finish line. Let's tell IDA to load ntoskrnl at the base address our script found.First, we'll need to grab a copy of ntoskrnl on the VM. Don't use the version on your host as this may not match with what's on the VM. This'll be found in your guest's system directory:
You might need to resume your VM if you're currently active in IDA's GDB debugger by selecting "Debugger" and then "Continue process" (or by hitting F9) from the menu bar.
After you've pulled ntoskrnl from your VM, break into IDA's GDB debugger by selecting "Suspend". Now, we must load it by selecting "File" then "Load file" and finally "PDB file..."
Find where you copied ntoskrnl to on your host and use the address that the script found:
You'll know IDA's finished when the status changes to "AU: idle".
Quick validation
We should make sure that the symbols are loaded correctly. Navigate to "Jump" and then "Jump to address" (or press "G"). Enter PsLoadedModuleList (case sensitive) and hit "OK".
From there, double click the address immediately to the right of the PsLoadedModuleList symbol. This takes you to the first entry in the list.
Each entry in this list is of type _LDR_DATA_TABLE_ENTRY. You might be familiar with this structure from usermode programming. It's also used in the kernel.
We'll need to add the definition of the _LDR_DATA_TABLE_ENTRY to IDA's structures. Luckily, we have symbols loaded and this is a pretty straightforward process.
After the structure was added, you'll see a window similar to this.
Go back to the "Debug View". Impose the _LDR_DATA_TABLE_ENTRY structure on that memory region:
Let's follow the FullName.Buffer field:
And now let's convert this to a readable string:
You should see the characters \SystemRoot\system32\ntoskrnl.exe. We did it!
Final thoughts
Now that symbols are loaded for ntoskrnl, it would be wise to iterate through the PsLoadedModuleList and load symbols for all the other kernel mode components. This can be scripted using IDAPython too, however, it's beyond the scope of this article.
Cheers!
powerful article
ReplyDeleteVery powerful , useful , and detailed tutorial !!!
ReplyDeleteLove you guys. Brilliant article
ReplyDeleteI'am noob . tutorial is amazing and awsome!!! love you.
ReplyDelete