A First Example Deconstructed
Here we will use the Windows native debugger windbg to explore the internals of our first example as it executes on real hardware. This is where the value of understanding assembly will begin to pay off. The goal of these articles is to equip you with enough understanding of assembly to assist you as you debug code down at the lowest levels. I assume some familiarity with the native debuggers but even if you have never used them you can still follow along. I will explain each command as I use it.
If you have not already, build the project by following the instructions in the README.txt file. You will also need windbg. Follow the installation instructions for whichever toolkit you decide to download.
We’ll start by launching the application using windbg. If windbg is not in your PATH variable you will need to specify the full path to it. Remember to use the 32 bit (x86) version of windbg since we built explore_masm for 32 bit processors. This will simplify the debugging process.
- windbg "<full path to explore_masm.exe>"
- Once windbg has launched, set the breakpoints so that execution stops in the functions we are interested in. We can’t simply step from startup because there is a significant amount of Windows-specific application setup that happens before your main function is ever called. Enter the following in the command line at the bottom of windbg. (In the figure below the command line is highlighted by a yellow box)
bp wmain
bp means set a breakpoint. Here we set a breakpoint for our wmain function, which is our application’s entry point. An important side note: to get help in any of the Windows native debuggers, simply type .hh <name of command you want help on here> in the command text box. For example, .hh bp. This will launch the help file viewer and route it to the command if it can find it. The Windows debugger help is very well written, informative, and well worth your time to read.
With our breakpoint set we can go at this point but first let’s examine the disassembly for the functions we are interested in. Examining the disassembly is something you will do a lot of as a developer debugging down at the lowest levels. To examine the disassembly for a function use the uf command (un-assemble function).
0:000> uf wmain
explore_masm!wmain [c:\users\briant\downloads\explore_masm1\explore_masm\i386\explore_masm.asm @ 49]:
49 007f11c7 55 push ebp
49 007f11c8 8bec mov ebp,esp
49 007f11ca 83c4f8 add esp,0FFFFFFF8h
52 007f11cd 8d45f8 lea eax,[ebp-8]
52 007f11d0 50 push eax
52 007f11d1 e8daffffff call explore_masm!setvals (007f11b0)
52 007f11d6 83c404 add esp,4
57 007f11d9 ff75f8 push dword ptr [ebp-8]
57 007f11dc 6800207f00 push offset explore_masm!szFmtStr (007f2000)
57 007f11e1 e8be020000 call explore_masm!printf (007f14a4)
57 007f11e6 83c408 add esp,8
63 007f11e9 ff75fc push dword ptr [ebp-4]
64 007f11ec 6800207f00 push offset explore_masm!szFmtStr (007f2000)
70 007f11f1 e8ae020000 call explore_masm!printf (007f14a4)
71 007f11f6 83c408 add esp,8
75 007f11f9 c9 leave
75 007f11fa c3 ret
- Don’t worry if this looks like meaningless debugger spew; we’ll cover it one line at a time to clarify what each instruction does. First let us examine the anatomy of the output. The output is broken up into 4 or 5 columns depending on the instruction.
The first column is the line number that the instruction relates to in the source file. This only applies if you have the full program database (PDB) file. Because you built this project locally, you will have the full PDB files for this example. Code that exists on customer servers or code you don’t own (and therefore do not build) will probably not have the full PDB files available. Vendors typically only expose public symbol files. Public-symbol-only PDB files lack source line number information and information about function local variables.
The second column is the memory address for the instruction. This is the value that the instruction pointer will have as this line of code executes. You can use this value to set the instruction pointer when moving it around to either jump over or back to code.
The third column is the raw bytes for the instruction; these are what the processor sees when loading this instruction. To make sense of them you need to reference the Intel x86 Software Developer’s Manual. These raw bytes can be useful when instructions have multiple formats or special modifiers and you need to know the specifics.
The fourth column is the mnemonic for the instruction. This is the name of the instruction you use when coding assembly by hand. You can use the x86 Software Developer’s Manual to look the mnemonic up and learn the specifics of what each instruction does.
The fifth column is the argument(s) for the instruction. The arguments depend on the instruction being executed, and some instructions (such as leave and ret above) take none.
Here’s an example with a breakdown of each column:
49 007f11ca 83c4f8 add esp,0FFFFFFF8h
49 -> This is the line number
007f11ca -> Instruction pointer address for this line of code
83c4f8 -> Raw bytes for this instruction (what the processor actually sees for the instruction in question)
add -> Mnemonic for this instruction
esp,0FFFFFFF8h -> Arguments for the instruction
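To make the column layout concrete, here is a minimal Python sketch that splits one line of uf output into the columns described above. The helper name parse_uf_line and the regular expression are my own illustration, not part of windbg, and the sketch assumes the line carries source line information (that is, a full PDB was available).

```python
import re

# Hypothetical pattern for one line of `uf` output that includes source line info.
UF_LINE = re.compile(
    r"^\s*(?P<line>\d+)\s+"          # column 1: source line number
    r"(?P<addr>[0-9a-fA-F]{8})\s+"   # column 2: 32-bit instruction address
    r"(?P<raw>[0-9a-fA-F]+)\s+"      # column 3: raw instruction bytes
    r"(?P<mnemonic>\S+)"             # column 4: mnemonic
    r"(?:\s+(?P<args>.*))?$"         # column 5: arguments (optional)
)


def parse_uf_line(text):
    """Split a disassembly line into its columns (illustrative helper)."""
    match = UF_LINE.match(text)
    if match is None:
        raise ValueError("not a recognised disassembly line: %r" % text)
    return {
        "line": int(match.group("line")),
        "address": int(match.group("addr"), 16),
        "raw": bytes.fromhex(match.group("raw")),
        "mnemonic": match.group("mnemonic"),
        "args": match.group("args") or "",
    }


add_line = parse_uf_line("49 007f11ca 83c4f8 add esp,0FFFFFFF8h")
ret_line = parse_uf_line("75 007f11fa c3 ret")
```

Note that ret_line ends up with an empty args entry, which is the "4 columns" case mentioned earlier.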
Now that we know the format of the disassembly output, let’s examine the wmain function line by line to see what each instruction does. In the following lines I will show only the mnemonic and arguments for clarity. Your line numbers and instruction pointers will vary, and the instruction bytes, while occasionally useful during debugging, serve no purpose in this exposition.
WMAIN line by line
Every function has a prologue which sets up the frame pointer and local variables on the stack (if any). Here is the prologue for our wmain function:
- push ebp
Here we push the current value of the base pointer on to the stack. This is the caller’s stack frame base pointer and we want to preserve it before we set up the callee’s frame pointer.
- mov ebp,esp
This instruction sets up the called function’s frame pointer by copying the current stack pointer into the ebp register. Remember that ebp is a purpose-specific register used to hold the current function’s frame pointer (topmost memory address of the function’s stack frame).
- add esp,0FFFFFFF8h
Next we add -8 to the stack pointer. Why? First we need to remember that functions always store their local variables at the topmost area of their stack frame (just below where the base pointer points). How do we know that 0FFFFFFF8h equals -8? Certainly we could use our hex and binary skills to divine this fact, but a much quicker way is to enter .formats 0FFFFFFF8h in the debugger command prompt. This will output the number in a variety of formats including the decimal format (with sign accounted for). So what are these 8 bytes for? Let’s open the explore_masm.asm file and look at the local variables for the wmain function. The first (and in this case only) local variable is LOCAL mystrct:MYSTRUCTTYPE. Examining this structure we see that it has two 4 byte elements, so its total size is 8 bytes. The -8 makes sense now (remember the stack grows down from high to low addresses): it makes space on the stack to hold our structure.
We are done with the function prologue and move on to the meat of the wmain function.
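If you want to check the arithmetic outside the debugger, the following Python sketch reinterprets a 32-bit value as a signed two's complement number and shows that adding 0FFFFFFF8h to a 32-bit register is the same as subtracting 8. The function name to_signed32 and the starting esp value are assumptions chosen purely for illustration.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a signed two's complement integer."""
    value &= MASK32
    # If the top bit is set the number is negative: subtract 2**32.
    return value - 0x100000000 if value & 0x80000000 else value


signed = to_signed32(0xFFFFFFF8)

# Hypothetical stack pointer value; yours will differ on every run.
esp_before = 0x0012FF70
# add esp,0FFFFFFF8h wraps around at 32 bits, just like the real register.
esp_after = (esp_before + 0xFFFFFFF8) & MASK32
```

Here signed is -8 and esp_after is 8 bytes below esp_before, which is the space reserved for the structure.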
- lea eax,[ebp-8]
In our source code the first thing we do in wmain is to set up some values in our structure by passing a pointer to the structure to the function setvals. lea means “load effective address” and here we are loading the effective address of ebp – 8 bytes. We just reserved space for our 8 byte structure on the stack by moving the stack pointer down 8 bytes. We now take the address that points to the start of those eight bytes and put that value in the eax register, which, remember, is a scratch register whose value does not need to be preserved during a function’s execution.
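The relationship between the structure and the frame pointer can be sketched in Python with ctypes. This is only a model under stated assumptions: the element types are assumed to be 32-bit unsigned integers, field1 is the name used in the source, field2 is a hypothetical name for the second element, and the ebp value is made up.

```python
import ctypes


class MYSTRUCTTYPE(ctypes.Structure):
    # Two 4 byte elements; field2 is a hypothetical name for the second one.
    _fields_ = [
        ("field1", ctypes.c_uint32),
        ("field2", ctypes.c_uint32),
    ]


# Hypothetical frame pointer value for wmain.
ebp = 0x0012FF78

# lea eax,[ebp-8]: the structure starts 8 bytes below the frame pointer.
struct_base = ebp - ctypes.sizeof(MYSTRUCTTYPE)

# Each element lives at the base address plus its offset in the structure.
field1_addr = struct_base + MYSTRUCTTYPE.field1.offset
field2_addr = struct_base + MYSTRUCTTYPE.field2.offset
```

With this layout the first element sits at ebp-8 and the second at ebp-4, which matches the two push dword ptr instructions in the disassembly.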
- push eax
Before we call setvals, we first have to push its argument on to the stack. The argument is the address we just calculated and put in eax above, so we push the value in eax on to the stack with the push instruction. In the x86 stdcall and cdecl calling conventions, all arguments are pushed on to the stack from right to left; none are passed in registers. We only have one argument here so we just push that argument.
- call explore_masm!setvals (007f11b0)
We are now ready to call setvals so that it can set the values in our structure. Here the call instruction is issued. The disassembler in windbg includes the actual memory address, in parentheses on the far right, with the symbol so that you can see at a glance where the processor is about to jump to. I won’t trace execution into setvals for now; we’ll stay focused on the code stream in wmain for this lesson.
- add esp,4
All of the functions in our example use the cdecl calling convention. The Windows API normally uses stdcall and we’ll see an example of that in a later lesson. The key difference between cdecl and stdcall is that in cdecl the caller is responsible for cleaning up the stack after the function call, instead of the called function doing the cleanup. This is what we are doing here in the caller of setvals (wmain called setvals): we add 4 bytes back to the stack pointer to wipe off the 4 byte wide address we pushed on to the stack just before the call to setvals.
- push dword ptr [ebp-8]
We start setting up the stack for our first call to printf. This time we need to push two arguments on to the stack, so we must think about order. Arguments are always pushed from right to left on to the stack before the function is called. This makes sense when you remember that the stack grows from high to low memory addresses. It means that the leftmost argument ends up at the lowest memory address in the range of addresses that hold a function’s arguments. This is the expected order when you read the source code, since we read from left to right (low to high addresses as we read the arguments for a function). This particular push is pushing the value of the first element in our structure because it is the second argument in our first call to printf:
INVOKE printf, ADDR szFmtStr,mystrct.field1
This is also a good time to discuss why we call printf twice. Macro assembler offers a wide variety of macros that reduce the drudgery of doing things like loops, function calls, conditional branching etc. Typing out the actual instruction stream for commonly used constructs becomes tedious and error prone. Using a macro like INVOKE makes life more pleasant and lets you focus on solving your problem and not typing boiler plate code over and over again. Here by examining the raw assembly we can see how the INVOKE macro expands into actual code that the processor runs. Knowing how a macro is converted will guide your choices on when to use and not use them for highly optimized situations.
- push offset explore_masm!szFmtStr (007f2000)
The first argument to printf is the format string and this is the second argument we push on to the stack. Note that the offset keyword is used when referencing static variables that are defined in the source file at compile time.
- call explore_masm!printf (007f14a4)
With its arguments pushed on to the stack, it is now time to call printf and here we issue the call instruction to do just that. The call instruction pushes the address of the instruction right after it on to the stack (this will be the value that the ret instruction puts into the EIP register when the function returns). The processor then jumps to the address that is the argument to the call instruction.
- add esp,8
When printf returns we clean up the stack. Regardless of which default calling convention we use, printf is always called as a cdecl function. Any function that takes a variable number of arguments must be a cdecl function because only the caller knows how many arguments were passed, so the caller must clean up the stack.
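The whole cdecl sequence for this first printf call can be modelled with a tiny stack simulation in Python. It is a sketch, not a description of how the processor is implemented: the Stack class, the starting esp, and the value stored in field1 are all hypothetical, while the format string address and return address are taken from the disassembly listing above.

```python
class Stack:
    """Minimal model of a 32-bit stack that grows toward lower addresses."""

    def __init__(self, esp):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4                 # the stack grows down
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value


FIELD1_VALUE = 1234                   # hypothetical contents of mystrct.field1
SZFMTSTR_ADDR = 0x007F2000            # offset explore_masm!szFmtStr
RETURN_ADDR = 0x007F11E6              # address of the add esp,8 after the call

stack = Stack(esp=0x0012FF68)         # hypothetical esp before the call sequence
esp_start = stack.esp

stack.push(FIELD1_VALUE)              # push dword ptr [ebp-8]  (rightmost argument)
stack.push(SZFMTSTR_ADDR)             # push offset szFmtStr    (leftmost argument)
stack.push(RETURN_ADDR)               # call printf pushes the return address

# What the callee sees on entry: return address at [esp], arguments above it.
first_arg = stack.mem[stack.esp + 4]
second_arg = stack.mem[stack.esp + 8]

returned_to = stack.pop()             # ret pops the return address into EIP
stack.esp += 8                        # add esp,8: the cdecl caller removes both arguments
```

After the final adjustment esp is back where it started, and the leftmost argument was found at the lowest address, just above the return address.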
- push dword ptr [ebp-4]
In the source code we call printf using both INVOKE and by directly setting up the call ourselves. Here we start the second call (the one where we manually set up the stack ourselves). For this call we are using the second element in our structure and we push that value here.
- push offset explore_masm!szFmtStr (007f2000)
As before, we push the format string address on to the stack since it is our first argument to printf.
- call explore_masm!printf (007f14a4)
Again we call printf.
- add esp,8
Finally we clean up the stack from our printf call and we are now ready to exit our wmain function.
The function is done executing. We need to execute the function epilogue and return to our caller (which in this case will cause the program to terminate since we are in the main function for the program).
Before we can return we have to execute the standard function epilogue. This amounts to copying the current stack base pointer into the stack pointer register then popping the value this points to into the base pointer register. This sets up the stack frame back to the caller’s stack frame. The hard way to do this is to issue two instructions:
mov esp, ebp
pop ebp
Or we can simply issue the leave instruction, which does both for us.
The last thing we need to do is to tell the processor to start executing the instructions right after the call to the current function. Applications cannot write to the EIP register directly, but they can issue ret, which pops the topmost value off the stack into the EIP register and causes the processor to continue execution at that address. When the leave instruction above completes, the value pointed at by ESP is the return address for the call to the function that is now returning. This brings us to the end of our wmain, and since wmain is the main function, returning from it will exit the program.
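To see how the prologue and epilogue mirror each other, here is a small Python model of the registers involved. The Cpu class and all register and address values are hypothetical; the only behaviour assumed is what the text describes: leave acts like mov esp, ebp followed by pop ebp, and ret pops the return address into EIP.

```python
class Cpu:
    """Toy model of esp, ebp and eip with a dictionary standing in for stack memory."""

    def __init__(self, esp, ebp):
        self.esp = esp
        self.ebp = ebp
        self.eip = 0
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def leave(self):
        self.esp = self.ebp           # mov esp, ebp
        self.ebp = self.pop()         # pop ebp

    def ret(self):
        self.eip = self.pop()         # continue at the saved return address


RETURN_ADDR = 0x00401234              # hypothetical address in wmain's caller

cpu = Cpu(esp=0x0012FF80, ebp=0x0012FFC0)
caller_ebp = cpu.ebp
cpu.push(RETURN_ADDR)                 # done by the caller's call instruction
esp_at_entry = cpu.esp

# Prologue
cpu.push(cpu.ebp)                                 # push ebp
cpu.ebp = cpu.esp                                 # mov ebp,esp
cpu.esp = (cpu.esp + 0xFFFFFFF8) & 0xFFFFFFFF     # add esp,0FFFFFFF8h

# Epilogue
cpu.leave()
cpu.ret()
```

After ret, ebp again holds the caller's frame pointer, eip holds the return address, and esp sits just above the slot that held the return address.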