How to extract the hexadecimal code from an executable assembled with NASM?

I have an executable, written in assembly language and assembled with NASM.

Is there a way to get the hexadecimal values of the bytes produced by the assembler, so that I can feed them to a disassembler (i.e. discover the generated opcodes)?

Code:

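#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *file;
    unsigned char *buffer;   /* unsigned, so each byte prints as 00..ff */
    long fileLen;

    file = fopen("teste.o", "rb");
    if (!file) {
        fprintf(stderr, "error: cannot open teste.o\n");
        return 1;            /* do not continue with a NULL FILE* */
    }

    /* Find the file size by seeking to the end, then rewind. */
    fseek(file, 0, SEEK_END);
    fileLen = ftell(file);
    fseek(file, 0, SEEK_SET);
    if (fileLen < 0) {
        fprintf(stderr, "error: cannot determine file size\n");
        fclose(file);
        return 1;
    }

    buffer = malloc((size_t)fileLen + 1);
    if (!buffer) {
        fprintf(stderr, "Memory error!\n");
        fclose(file);
        return 1;
    }

    /* Read the whole file; fail if fewer bytes arrive than expected. */
    if (fread(buffer, 1, (size_t)fileLen, file) != (size_t)fileLen) {
        fprintf(stderr, "error: short read\n");
        free(buffer);
        fclose(file);
        return 1;
    }
    fclose(file);

    /* Hex dump: extra gap every 4 bytes, new line every 16 bytes. */
    for (long c = 0; c < fileLen; c++) {
        printf("%.2hhx ", buffer[c]);
        if (c % 4 == 3) {
            printf(" ");
        }
        if (c % 16 == 15) {
            printf("\n");
        }
    }
    printf("\n");
    free(buffer);
    return 0;
}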
    
asked by anonymous 08.05.2014 / 08:24

1 answer

The language or compiler you used has little influence on the format of the final executable. If you're on Linux, it is most likely ELF (Executable and Linkable Format); on Windows, it will be PE (Portable Executable). Once you know the format of your executable, you need to extract its sections. (You can also write code that handles both formats, or others: just check the magic bytes at the start of the file to tell them apart.)
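As a minimal sketch of the magic-byte check, assuming the whole file is already in memory: the function name detect_format is hypothetical. An ELF file starts with the bytes 0x7F 'E' 'L' 'F'; a PE file starts with the DOS signature 'M' 'Z', and the 32-bit value at offset 0x3C points to the "PE\0\0" signature.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: classify a file image by its magic bytes.
 * Returns "ELF", "PE", or "unknown". */
const char *detect_format(const unsigned char *buf, size_t len) {
    /* ELF: 0x7F 'E' 'L' 'F' at offset 0. */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0) {
        return "ELF";
    }
    /* PE: 'M' 'Z' at offset 0, then a little-endian 32-bit offset
     * at 0x3C that locates the "PE\0\0" signature. */
    if (len >= 0x40 && buf[0] == 'M' && buf[1] == 'Z') {
        size_t off = (size_t)buf[0x3C]
                   | ((size_t)buf[0x3D] << 8)
                   | ((size_t)buf[0x3E] << 16)
                   | ((size_t)buf[0x3F] << 24);
        if (off <= len - 4 && memcmp(buf + off, "PE\0\0", 4) == 0) {
            return "PE";
        }
    }
    return "unknown";
}
```

Note that an object file such as teste.o is not a linked executable: on Linux it is still an ELF file, but a Windows object file is a bare COFF file without the MZ header, so this check would report it as unknown.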

How the sections are stored differs between formats, but each format has a header with general information such as the architecture and the locations of the symbol table and the section table. Walk the section table and check the flags of each entry. Compilers usually produce some sections that are neither code nor data, such as .comment. From the flags associated with each section you can identify those that contain code (there can be more than one).
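The walk over the section table could look roughly like this for one specific case, a little-endian 64-bit ELF image held in memory. The names find_code_sections, struct code_section and rd_le are hypothetical; the field offsets follow the ELF64 layout, and 0x4 is the value of the SHF_EXECINSTR flag.

```c
#include <stddef.h>
#include <stdint.h>

#define EXEC_FLAG 0x4u  /* value of SHF_EXECINSTR: section holds instructions */

/* Read an nbytes-wide little-endian integer from p. */
static uint64_t rd_le(const unsigned char *p, int nbytes) {
    uint64_t v = 0;
    for (int i = nbytes - 1; i >= 0; i--) v = (v << 8) | p[i];
    return v;
}

/* Hypothetical record for one executable section. */
struct code_section {
    uint64_t addr;    /* sh_addr: virtual address (0 in object files) */
    uint64_t offset;  /* sh_offset: position of the bytes in the file */
    uint64_t size;    /* sh_size: number of bytes */
};

/* Stores up to max executable sections in out.
 * Returns how many were stored, or -1 if the image is not a
 * well-formed little-endian ELF64 file. */
int find_code_sections(const unsigned char *img, size_t len,
                       struct code_section *out, int max) {
    if (len < 64 || img[0] != 0x7F || img[1] != 'E' ||
        img[2] != 'L' || img[3] != 'F') return -1;
    if (img[4] != 2 || img[5] != 1) return -1;  /* 64-bit, little-endian */

    uint64_t shoff     = rd_le(img + 0x28, 8);  /* e_shoff */
    uint64_t shentsize = rd_le(img + 0x3A, 2);  /* e_shentsize */
    uint64_t shnum     = rd_le(img + 0x3C, 2);  /* e_shnum */
    if (shentsize < 64) return -1;
    if (shoff > len || shnum * shentsize > len - shoff) return -1;

    int found = 0;
    for (uint64_t i = 0; i < shnum && found < max; i++) {
        const unsigned char *sh = img + shoff + i * shentsize;
        uint64_t flags = rd_le(sh + 0x08, 8);   /* sh_flags */
        if (flags & EXEC_FLAG) {
            out[found].addr   = rd_le(sh + 0x10, 8);
            out[found].offset = rd_le(sh + 0x18, 8);
            out[found].size   = rd_le(sh + 0x20, 8);
            found++;
        }
    }
    return found;
}
```

A 32-bit ELF file uses narrower fields at different offsets, and PE uses its own section header with flags such as IMAGE_SCN_MEM_EXECUTE, so each would need its own variant of this loop.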

You will then have a list of code sections, and for each one three pieces of information matter: the size of the section in bytes, its location in virtual memory (this affects some instructions, such as CALL, when different sections are involved), and its offset in the file. The compiled machine code can be read directly from the executable file by reading size bytes starting at offset.
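Once a section's offset and size are known, getting its bytes in hexadecimal is a bounds-checked copy. The sketch below assumes the file is already loaded into a buffer, as in the program from the question; hex_range is a hypothetical name.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats image[offset .. offset+size) as space-separated hex into out.
 * Returns 0 on success, or -1 if the range lies outside the image or
 * out is too small (3 characters per byte are needed). */
int hex_range(const unsigned char *image, size_t image_len,
              size_t offset, size_t size, char *out, size_t out_cap) {
    if (offset > image_len || size > image_len - offset) return -1;
    if (out_cap == 0 || size > (out_cap - 1) / 3) return -1;

    char *p = out;
    for (size_t i = 0; i < size; i++) {
        p += sprintf(p, "%02x ", (unsigned)image[offset + i]);
    }
    if (size > 0) p--;   /* drop the trailing space */
    *p = '\0';
    return 0;
}
```

For example, with the image bytes 90 90 b8 01 00 00 00 c3, an offset of 2 and a size of 6, the output string is "b8 01 00 00 00 c3", which is the form most disassemblers accept as raw input.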

If you want to know the names of functions or variables, you will also need the symbol table. This should help, since the code alone does not clearly separate one function from the next. Each symbol is associated with a section and points to the memory address where the function or variable begins.
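To illustrate what a symbol carries, here is a sketch that decodes one entry of a little-endian ELF64 symbol table (24 bytes per entry) and looks up its name in the associated string table. The names decode_sym, sym_name and struct sym_info are hypothetical, and locating the two tables through the section headers is assumed to have been done already.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SYM_ENTRY_SIZE 24  /* size of one ELF64 symbol table entry */

/* Read an nbytes-wide little-endian integer from p. */
static uint64_t rd_le(const unsigned char *p, int nbytes) {
    uint64_t v = 0;
    for (int i = nbytes - 1; i >= 0; i--) v = (v << 8) | p[i];
    return v;
}

/* Hypothetical decoded view of one symbol. */
struct sym_info {
    uint32_t name_off;  /* st_name: offset into the string table */
    unsigned type;      /* low 4 bits of st_info; 2 means function */
    uint16_t shndx;     /* st_shndx: index of the owning section */
    uint64_t value;     /* st_value: address where the symbol begins */
    uint64_t size;      /* st_size: size in bytes, 0 if unknown */
};

/* Decodes entry number index. Returns 0, or -1 if index is out of range. */
int decode_sym(const unsigned char *symtab, size_t symtab_len,
               size_t index, struct sym_info *out) {
    if (index >= symtab_len / SYM_ENTRY_SIZE) return -1;
    const unsigned char *e = symtab + index * SYM_ENTRY_SIZE;
    out->name_off = (uint32_t)rd_le(e + 0, 4);
    out->type     = e[4] & 0x0F;
    out->shndx    = (uint16_t)rd_le(e + 6, 2);
    out->value    = rd_le(e + 8, 8);
    out->size     = rd_le(e + 16, 8);
    return 0;
}

/* Returns the NUL-terminated name at off, or NULL if off is out of
 * range or the name is not terminated inside the string table. */
const char *sym_name(const char *strtab, size_t strtab_len, uint32_t off) {
    if (off >= strtab_len) return NULL;
    if (memchr(strtab + off, '\0', strtab_len - off) == NULL) return NULL;
    return strtab + off;
}
```

Sorting the function symbols of a code section by value gives the boundaries needed to split the raw bytes into functions before disassembling them.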

    
09.05.2014 / 12:56