A while ago someone asked how to encode and execute x86 machine code. I replied with a small tutorial on the subject, which I thought I should expand and share here.
x86 instructions range from 1 to 15 bytes long. An example of a 1-byte instruction is NOP (no operation), encoded as 10010000b or 0x90. From a bird's-eye view, the x86 instruction format looks like this:
byte:  0,1,2,3   4,5      6         7                    8,9,10,11      12,13,14,15
func:  prefix    opcode   reg/mem   scaled indexed mem   displacement   imm data
*Although the fields add up to 16 bytes, an actual instruction cannot exceed 15 bytes, because some bytes are mutually exclusive.
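Now let's encode a simple x86 instruction, "mov $0x8888, %eax". The encoding of MOV varies with the operands, the memory addressing scheme and so on. Here we want to move an immediate operand (0x8888) into register EAX. Because EAX is a 32-bit register, the immediate occupies 4 bytes, stored least significant byte first, so 0x8888 becomes 88 88 00 00. The instruction takes the following format:

opcode   w (8 or 32 bit)   reg (eax)   byte0      byte1      byte2      byte3
1011     1                 000         10001000   10001000   00000000   00000000

or 0xB8 0x88 0x88 0x00 0x00.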
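The bit layout above can be sketched as a small C helper. This is only an illustration: the name `encode_mov_imm32` and the `reg32` enum are made up for this example, and it assumes 32-bit operand size, where the opcode byte is followed by a 4-byte little-endian immediate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers as they appear in the low three bits of the opcode byte. */
enum reg32 { EAX = 0, ECX = 1, EDX = 2, EBX = 3, ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

/*
 * encode_mov_imm32 is a hypothetical helper, not part of the original post.
 * It writes "mov $imm, %reg" into out (which must hold at least 5 bytes)
 * and returns the number of bytes written.
 */
size_t encode_mov_imm32(uint8_t *out, enum reg32 reg, uint32_t imm)
{
    assert(out != NULL);

    /* 1011 w reg: w = 1 selects the full-size register, reg picks which one. */
    out[0] = (uint8_t)(0xB0u | (1u << 3) | ((unsigned)reg & 7u));

    /* The immediate follows, least significant byte first. */
    for (size_t i = 0; i < 4; i++)
        out[1 + i] = (uint8_t)(imm >> (8 * i));

    return 5;
}
```

Calling `encode_mov_imm32(buf, EAX, 0x8888)` fills `buf` with B8 88 88 00 00, the same bytes worked out by hand above.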
Now comes the fun part: executing the instruction :) I will use a small C program to call the code and watch the results with gdb. There are two ways to call this code. The first and easiest way is to simply call it: cast the buffer to a function pointer and call it.
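Here is a minimal sketch of that first way, under a few assumptions that are mine rather than the post's: a POSIX system with `mmap`, an x86 CPU, and a `ret` byte (0xC3) appended to the code so the call returns cleanly. It copies the bytes into memory mapped as executable, since many current systems refuse to execute code stored in an ordinary data buffer. The function name `run_machine_code` is hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: copies len bytes of machine code into an executable
 * mapping, calls it as int (*)(void), and stores the returned value (EAX on
 * x86) in *result. Returns 0 on success, -1 if the mapping was refused.
 * The code must end with a ret instruction.
 */
int run_machine_code(const uint8_t *code, size_t len, int *result)
{
    assert(code != NULL && result != NULL);

    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    memcpy(mem, code, len);

    /* Cast the buffer to a function pointer and call it. */
    int (*fn)(void) = (int (*)(void))mem;
    *result = fn();

    munmap(mem, len);
    return 0;
}
```

On an x86 machine that permits such a mapping, passing the bytes B8 88 88 00 00 C3 should leave 0x8888 in `*result`, because the C calling convention returns an `int` in EAX.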
The second way, which is harder but more educational, involves poking around the stack a little. I won't fully explain it, but basically we call a function that changes its own return address on the stack to the address of the opcode, so that execution continues at the buffer (think buffer overflow):
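    /* mov $0x8888,%eax: opcode B8 followed by the 32-bit immediate, low byte first */
    char opcode[] = "\xB8\x88\x88\x00\x00";

    void run()
    {
        long *ret;
        ret = (long *)&ret + 2;  /* return address on stack (depends on the frame layout) */
        *ret = (long)opcode;     /* now run() will return to opcode */
    }

    int main()
    {
        //((void(*)(void))opcode)();
        run();
    }

Compile with: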
    gcc mov.c -o mov -ggdb

This is the complete gdb session:
    $ gdb mov                 //run with gdb
    (gdb) break run           //set break point at run() function
    (gdb) display /i $pc      //add a display to see the inst mnemonic
    (gdb) run                 //run the program
    (gdb) nexti               //skip instructions until you see ret
    (gdb) nexti
    0x0804835e in run () at mov.c:7
    0x804835e <run+26>: ret   //return to the opcode address
    (gdb) nexti
    0x0804954c in opcode ()
    1: x/i $pc 0x804954c <opcode>: mov $0x8888,%eax   //Finally our hand coded instruction
    (gdb) nexti               //one more nexti to execute the instruction
    (gdb) info registers      //dump registers
    eax 0x8888 34952          //and eax now holds 0x8888 !
    ecx 0xbf877640 -1081641408
    edx 0xbf877620 -1081641440
That's it for today, I hope this small tutorial has inspired you to start experimenting yourself.
Let me raise all the hats I've for you. Rabena yekremak and please keep up the impressive work. My name is Ahmed Abdalla and I am impressed :)
Ahmed,
I'm glad to see that someone finds my ranting useful :)
Thanks for the nice comment.
Nice! Your blog is a gem; glad I stumbled upon it. Just curious - is your background in Electrical/Electronics Engineering?
Thank you! And no, computer science.
Ah, I see. I asked because you have quite a few articles on electronics. :)