A while ago someone asked how to encode and execute x86 machine code. I replied with a small tutorial on the subject, which I thought I should expand and share here.
x86 instructions range from 1 to 15 bytes long. An example of a 1-byte instruction is NOP (no operation), encoded as 10010000b or 0x90. From a bird's-eye view, the x86 instruction format looks like this:
    byte:  0,1,2,3   4,5      6         7                    8,9,10,11      12,13,14,15
    func:  prefix    opcode   reg/mem   scaled indexed mem   displacement   imm data
*Although this adds up to 16 bytes, an actual instruction cannot exceed 15 bytes because some bytes are mutually exclusive.
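Now let's encode a simple x86 instruction, "mov $0x8888, %eax". The encoding of MOV varies with the operands, the memory addressing scheme and so on. Here we want to move an immediate operand (0x8888) into the register EAX. Because EAX is a 32-bit register, the immediate occupies four bytes, stored least significant byte first, so the instruction takes the following format:

    opcode  w  reg (eax)  byte0     byte1     byte2     byte3
    1011    1  000        10001000  10001000  00000000  00000000

or 0xB8 0x88 0x88 0x00 0x00. The w bit selects the full operand size (32 bits here) instead of 8 bits.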
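The bit layout above can be turned into a small encoder. This is only a sketch: the names `encode_mov_imm32` and `enum reg32` are made up for illustration, and it covers just the "move a 32-bit immediate into a 32-bit register" form discussed here, assuming 32-bit operand size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers as they appear in the low three bits of the opcode byte. */
enum reg32 { EAX = 0, ECX = 1, EDX = 2, EBX = 3, ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

/* Hypothetical helper: writes "mov $imm, %reg" (32-bit form) into out
   and returns the number of bytes written. */
size_t encode_mov_imm32(uint8_t out[5], enum reg32 reg, uint32_t imm)
{
    assert((unsigned)reg <= 7u);

    /* Opcode byte: 1011 w reg, with w = 1 for the full operand size. */
    out[0] = (uint8_t)(0xB0u | (1u << 3) | (unsigned)reg);

    /* The immediate follows, least significant byte first. */
    out[1] = (uint8_t)(imm & 0xFFu);
    out[2] = (uint8_t)((imm >> 8) & 0xFFu);
    out[3] = (uint8_t)((imm >> 16) & 0xFFu);
    out[4] = (uint8_t)((imm >> 24) & 0xFFu);
    return 5;
}
```

Calling `encode_mov_imm32(buf, EAX, 0x8888)` fills `buf` with B8 88 88 00 00, and passing ECX instead of EAX only changes the first byte to B9.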
Now comes the fun part: executing the instruction :) I will use a simple C program to call the code and watch the results with gdb. There are two ways to call this code. The first and easy way is to just call it, by casting the buffer to a function pointer and calling it.
The second way, which is harder but more educational, involves poking around the stack a little. I won't fully explain it, but basically we call a function that changes its own return address on the stack to the address of the opcode, so that execution continues at the buffer (think buffer overflow):
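For the first way, here is a rough sketch of how the cast-and-call could look on a current Linux system, where data buffers are usually not executable. It is not the program used in this tutorial: the function name `run_opcode` is invented, the page is obtained with `mmap` and made executable with `mprotect`, and a `ret` byte (0xC3) is appended so that the call can return. It assumes an x86 or x86-64 target and returns -1 anywhere else.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* mov $0x8888, %eax ; ret
   The trailing 0xC3 (ret) is an addition so control comes back to the caller. */
static const unsigned char code[] = { 0xB8, 0x88, 0x88, 0x00, 0x00, 0xC3 };

/* Hypothetical helper: copies the bytes into a fresh page, makes the page
   executable, calls it through a function pointer and returns what was
   left in EAX. Returns -1 if the page cannot be set up or the target is
   not x86. */
long run_opcode(void)
{
#if defined(__i386__) || defined(__x86_64__)
    size_t len = 4096;
    assert(sizeof code <= len);

    void *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memcpy(page, code, sizeof code);

    /* Switch the page from writable to executable before calling it. */
    if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, len);
        return -1;
    }

    /* The cast: treat the buffer as a function returning int (EAX). */
    int (*fn)(void) = (int (*)(void))page;
    int result = fn();

    munmap(page, len);
    return result;
#else
    return -1;
#endif
}
```

On an x86 machine, `run_opcode()` should hand back 0x8888, since the integer return value of a C function is passed in EAX.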
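    /* mov $0x8888, %eax: opcode 0xB8 plus a 4-byte little-endian immediate */
    char opcode[] = "\xB8\x88\x88\x00\x00";

    void run(void)
    {
        long *ret;
        ret = (long *)&ret + 2;  /* return address on the stack */
        *ret = (long)opcode;     /* now run() will return to opcode */
    }

    int main(void)
    {
        // ((void (*)(void))opcode)();   /* the first way: call the buffer directly */
        run();
        return 0;
    }

The &ret + 2 offset and the executable data buffer assume an older 32-bit Linux system and an unoptimized build, like the one used here. A 64-bit system needs at least -m32, and current compilers and kernels may block the trick altogether.

Compile with:

    gcc mov.c -o mov -ggdb

This is the complete gdb session: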
    $ gdb mov                          // run with gdb
    (gdb) break run                    // set a breakpoint at the run() function
    (gdb) display /i $pc               // add a display to see the instruction mnemonic
    (gdb) run                          // run the program
    (gdb) nexti                        // step instructions until you see ret
    (gdb) nexti
    0x0804835e in run () at mov.c:7
    0x804835e <run+26>: ret            // return to the opcode address
    (gdb) nexti
    0x0804954c in opcode ()
    1: x/i $pc 0x804954c <opcode>: mov $0x8888,%eax   // finally, our hand-coded instruction
    (gdb) nexti                        // one more nexti to execute the instruction
    (gdb) info registers               // dump registers
    eax 0x8888 34952                   // and eax now holds 0x8888!
    ecx 0xbf877640 -1081641408
    edx 0xbf877620 -1081641440
That's it for today. I hope this small tutorial has inspired you to start experimenting yourself.