Assembly

You will probably never need to write a line of assembly code in your professional career. Today only a select few low-level device driver developers still craft and optimize assembly code by hand. You might conclude that learning assembly is a waste of time. Nothing could be further from the truth.

No matter what language you develop in, when all is said and done your code ends up as machine instructions executed by the processor. Some languages are compiled ahead of time, some are compiled just in time, and some are run by an interpreter that is itself machine code. Whether it’s Java, Smalltalk, C, C#, C++, Objective-C, Perl, PHP, JavaScript, or anything else, it all boils down to machine instructions.

You will certainly find yourself, at times, debugging a problem without access to source or symbols. You might even find yourself engaged in low-level security profiling or forensic analysis. A solid foundation in assembly will set you up for success. The deeper your knowledge of the entire computing stack, the greater your advantage.

Assembly is, at its core, a very simple programming language. Control structures all boil down to GOTO statements, and data types are all byte strings of various lengths. Processors spend most of their lives doing simple math combined with loading from and storing to memory. Program flow is controlled by conditional and unconditional jumps (GOTOs).

Because of this simplicity, assembly occupies a place in the programming universe that is far away from most problem domains. The point of any programming language is to make it easier for humans to express abstract problems in concrete terms that computers can then execute to produce a solution. Assembly is certainly easier than writing out the raw byte values that a processor understands, but only just. Languages like Java and C# make it relatively easy for developers to use convenient complex data structures and sophisticated flow control techniques to quickly express their problems in a way that computers can solve. Compilers take this human-friendly language and convert it into either an intermediate form like byte code or a final form like machine code. Byte code itself is ultimately converted into machine code by either an interpreter or a just-in-time compiler.

Understanding how complex data structures, scoping, and flow control constructs map from high-level languages down to machine language increases your design and debugging efficiency immensely. Having a good sense of how things actually reduce to practice will help guide your decision making and make you a more valuable and efficient developer. The goal here isn’t so much knowing how to craft systems using assembly as knowing how to read disassembly output and gain meaningful insight into how code executes in the real world on the bare metal. To learn, though, you must do, so I will focus on writing small assembly programs as well as reading the assembly output of compiled C programs.

Assembly is best learned by example. In the following pages I illustrate core concepts using Microsoft’s macro assembler, which comes installed with either the Windows Driver Development Kit or Visual Studio (including the free editions). I prefer the Windows Driver Development Kit and its build system, so I will illustrate the examples with those tools. The assembler is the same in both toolsets, but the build process is somewhat different. You should use whichever toolset you feel most comfortable with. You will also need the Windows native debuggers (both the x86 and x64 versions if you are on a 64-bit computer), which can be downloaded as part of the WinDDK installation.
