In the end, it is the compiler that decides, and you cannot conclude anything just by looking at the source code. If you want to write a tool that deduces this automatically, you need to analyze the compiled program. GCC, for example, aligns every stack frame to a multiple of 16 bytes by default, adding padding where needed. A more reliable way is to compile the code to assembly (using -S) and analyze it from there.
First, I modified your code like this:
int f(int x);

int main(int y) {
    int j = 0;
    f(j + 1);
}

int f(int x) {
    int i = 0;
    asm("#BREAK POINT"); // Inserts a comment into the assembly output. It does nothing else.
    return i + 1;
}
The asm line only marks the exact point at which we want to analyze the stack. I also initialized the variables; otherwise the program would have undefined behavior and any analysis would be meaningless. Compiling it gives:
main:
    pushl %ebp
    movl %esp, %ebp
    andl $-16, %esp
    subl $32, %esp
    movl $0, 28(%esp)
    movl 28(%esp), %eax
    addl $1, %eax
    movl %eax, (%esp)
    call f
    leave
    ret
f:
    pushl %ebp
    movl %esp, %ebp
    subl $16, %esp
    movl $0, -4(%ebp)
    #BREAK POINT
    movl -4(%ebp), %eax
    addl $1, %eax
    leave
    ret
Let's go through it step by step. The stack grows toward lower addresses, drawn here from the bottom up, so let's assume that it starts at esp=320. It is currently empty:
+-----------------------+ <- esp=320
The first instruction saves on the stack the base of the frame that precedes main: pushl %ebp.
+-----------------------+ <- 312 (esp)
| 4 bytes: ebp          |
+-----------------------+ <- 320
Next, the current top of the stack (esp) is saved in ebp, and 32 bytes are subtracted from esp. The and with -16 that happens in between aligns the stack to a multiple of 16. It now looks like this:
+-----------------------+ <- 272 (esp)
| 32 bytes: unused      |
|                       |
|                       |
|                       |
+-----------------------+ <- 304
| 8 bytes: alignment    |
|                       |
+-----------------------+ <- 312 (ebp)
| 4 bytes: old ebp      |
+-----------------------+ <- 320
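The effect of andl $-16, %esp can be sketched in C as a round-down to a multiple of 16. This is only an illustration: align_down is a hypothetical helper name, not something GCC emits, and the sample values are simply addresses taken from the diagram above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr down to a multiple of alignment.
 * alignment must be a power of two. For alignment == 16 the mask is
 * 0xFFFFFFF0, which is the 32-bit pattern of -16 used by andl $-16, %esp. */
uint32_t align_down(uint32_t addr, uint32_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}
```

For instance, align_down(312, 16) yields 304, which is where the 8 bytes of alignment padding in the diagram come from. An address that is already a multiple of 16 is left unchanged.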
The next instruction is movl $0, 28(%esp). It stores 0 at address esp+28:
+-----------------------+ <- 272 (esp)
| 28 bytes: unused      |
|                       |
|                       |
| 4 bytes: j=0          |
+-----------------------+ <- 304
| 8 bytes: alignment    |
|                       |
+-----------------------+ <- 312 (ebp)
| 4 bytes: old ebp      |
+-----------------------+ <- 320
Next, movl 28(%esp), %eax, addl $1, %eax and movl %eax, (%esp) read the value at esp+28 (j), add one, and store the result at esp+0:
+-----------------------+ <- 272 (esp)
| 4 bytes: arg0=j+1     |
| 24 bytes: unused      |
|                       |
| 4 bytes: j=0          |
+-----------------------+ <- 304
| 8 bytes: alignment    |
|                       |
+-----------------------+ <- 312 (ebp)
| 4 bytes: old ebp      |
+-----------------------+ <- 320
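The operands 28(%esp) and (%esp) use AT&T syntax, where disp(base) means "the memory at base + disp". A minimal sketch of that address arithmetic follows; effective_address is a hypothetical name chosen for illustration, and the base value 272 is the esp shown in the diagram.

```c
#include <stdint.h>

/* Hypothetical helper: address referenced by the AT&T operand disp(base).
 * disp may be negative, as in -4(%ebp). */
uint32_t effective_address(uint32_t base, int32_t disp) {
    return base + (uint32_t)disp;
}
```

With esp=272, effective_address(272, 28) is 300, so j occupies the 4 bytes from 300 up to 304, and effective_address(272, 0) is 272, the slot that holds the argument for f.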
Then comes the function call, call f. Note that call pushes the current code address (the return address) so that ret can work. The function starts by pushing the old ebp onto the stack, setting the new ebp, and subtracting 16 from esp. Then 0 is stored at ebp-4 (i):
+-----------------------+ <- 248 (esp)
| 12 bytes: unused      |
| 4 bytes: i=0          |
+-----------------------+ <- 264 (ebp)
| 4 bytes: 312          |
+-----------------------+ <- 268
| 4 bytes: return addr  |
+-----------------------+ <- 272
| 4 bytes: arg0=j+1     |
| 24 bytes: unused      |
|                       |
| 4 bytes: j=0          |
+-----------------------+ <- 304
| 8 bytes: alignment    |
|                       |
+-----------------------+ <- 312
| 4 bytes: old ebp      |
+-----------------------+ <- 320
And so we reach the break point. If you continue from here, you will see the stack unwind as the functions return.
As you can see, there is a lot going on under the hood. This simple example consumed 72 bytes, of which 44 were unused. On the other hand, turn on optimizations and you will see that the same code consumes 0 bytes.
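The 72-byte total can be reproduced by replaying, in C, what each instruction in the listing does to esp up to the break point. This is a rough sketch, not a real analysis tool: the function names are hypothetical, it models only the instructions that move esp, and the starting value 320 is the assumption used in the walkthrough.

```c
#include <stdint.h>

/* Hypothetical model: value of esp at "#BREAK POINT", given esp on entry to main. */
uint32_t esp_at_breakpoint(uint32_t esp) {
    esp -= 4;               /* main: pushl %ebp                  */
    esp &= ~(uint32_t)15;   /* main: andl $-16, %esp             */
    esp -= 32;              /* main: subl $32, %esp              */
    esp -= 4;               /* call f pushes the return address  */
    esp -= 4;               /* f: pushl %ebp                     */
    esp -= 16;              /* f: subl $16, %esp                 */
    return esp;
}

/* Bytes of stack in use at the break point. */
uint32_t stack_bytes_used(uint32_t start) {
    return start - esp_at_breakpoint(start);
}
```

Calling stack_bytes_used(320) gives 72, and esp_at_breakpoint(320) gives 248, the top of the last diagram. Because of the alignment step, the total depends on the starting address, so a different entry value of esp can give a different figure.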
Your mistake: the return value is not always placed on the stack. That happens only when it is a large object, such as a struct. An int is simply returned directly in a register, eax.
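To make the difference concrete, here is a small sketch with one function of each kind. The names return_small, return_big and struct Big are made up for illustration. On 32-bit x86 with the usual calling convention, the int typically comes back in eax, while for the struct the caller typically reserves the space and passes a hidden pointer to it. Compiling with -S, as above, is the way to confirm what your compiler actually does.

```c
#include <string.h>

/* Hypothetical large object: too big to fit in registers. */
struct Big {
    int data[64];
};

/* Small scalar result: normally returned in a register. */
int return_small(void) {
    int i = 0;
    return i + 1;
}

/* Large result: normally written into memory provided by the caller. */
struct Big return_big(void) {
    struct Big b;
    memset(&b, 0, sizeof b);
    b.data[0] = 1;
    return b;
}
```

At the C level both calls look the same (int r = return_small(); struct Big b = return_big();). The difference shows up only in the generated assembly.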