I have written a program that reads a compiled binary executable for the x86 (Intel) architecture and interprets the machine code it contains by executing it instruction by instruction. The part that reads the executable, extracts the sections, and builds a virtual memory containing the executable code works without problems, and it can already run some very simple programs (for example: int main() {return 0;}).
To decode the instructions I am relying on the Intel manual (in English). I am also using the objdump -d utility to view the disassembly of the executable and compare it with my results.
My problem is decoding the following sequence of bytes (in hexadecimal):
67 89 04 18
objdump correctly states that this means:
mov %eax, (%eax, %ebx, 1)
The trouble starts when I decode it by hand from the manual:
67: address-size override prefix
89: opcode for a mov from a register to a register/memory operand
04: ModR/M byte indicating that the first argument is %eax, that a SIB byte follows, and that there is no displacement
18: SIB byte indicating that the last argument is %eax+%ebx
The catch is that I interpreted both ModR/M and SIB using the 32-bit addressing rules, which means that at this stage the operand size and the address size are both 32 bits. However, the address-size override prefix had to be used, which would mean that the original instruction (without the prefix) has a 32-bit operand and a 16-bit address. Is this correct?
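To make the manual decoding above easy to check, here is a minimal Python sketch that splits the two bytes into their bit fields and prints the operand in the same AT&T form objdump uses. It is only an illustration of the 32-bit addressing interpretation described in the question, not my actual decoder; the names split_modrm, split_sib and REG32 are made up for this example, and the special cases (index field 100 meaning "no index", base field 101 with mod 00 meaning "disp32, no base") are deliberately left out.

```python
# Hypothetical helper names; this only covers the simple case from the question.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def split_sib(byte):
    """Split a SIB byte into (scale factor, index, base)."""
    return 1 << ((byte >> 6) & 0b11), (byte >> 3) & 0b111, byte & 0b111


mod, reg, rm = split_modrm(0x04)      # mod=0, reg=0 (eax), rm=4
scale, index, base = split_sib(0x18)  # scale=1, index=3 (ebx), base=0 (eax)

# Under 32-bit addressing rules, mod=00 with rm=100 means
# "SIB byte follows, no displacement".
assert mod == 0b00 and rm == 0b100

text = f"mov %{REG32[reg]}, (%{REG32[base]}, %{REG32[index]}, {scale})"
print(text)
```

Under these assumptions the fields line up with the objdump output, which is why the presence of the 67 prefix is what puzzles me.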
How can an instruction with a 32-bit operand and a 16-bit address exist? I tried assembling an instruction like this with gas (GNU Assembler), and it returns an error stating that this combination is impossible. Why, then, does this byte pattern exist?