strcpy is merging a numeric string with other characters

6

I am not sure the title makes the problem clear. When I use strcpy() to copy one char* to another, it works normally with a string such as "teste". But when the string has 4 characters (digits, in this case), for example "2000", that value ends up merged in the destination with the value copied by the next strcpy() call. Here is the code:

#include <string.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    char nome[80];
    char ano[4];
    char diretor[80];
} Filme;

void analise(Filme *filme, const char *arg1, const char *arg2, const char *arg3)
{
    Filme _filme = {
        .nome    = malloc(2),
        .ano     = malloc(2),
        .diretor = malloc(2)
    };

    strcpy(_filme.nome, arg1);
    strcpy(_filme.ano, arg2);
    strcpy(_filme.diretor, arg3);

    memcpy(filme, &_filme, sizeof _filme);
}


void carregar(Filme filmes[])
{
    analise(&filmes[0], "E o Vento Levou", "1939", "Victor");
    analise(&filmes[1], "teste", "998", "bar");
    analise(&filmes[2], "Os Passaros", "1963", "Alfred Hitchcock");
}

int main()
{
    Filme filmes[1000];

    carregar(filmes);

    printf("\nmain:\n");

    printf("- Nome:    %s\n", filmes[0].nome);
    printf("- Ano:     %s\n", filmes[0].ano);
    printf("- Diretor: %s\n", filmes[0].diretor);

    printf("----------------\n");

    printf("- Nome:    %s\n", filmes[1].nome);
    printf("- Ano:     %s\n", filmes[1].ano);
    printf("- Diretor: %s\n", filmes[1].diretor);

    printf("----------------\n");

    printf("- Nome:    %s\n", filmes[2].nome);
    printf("- Ano:     %s\n", filmes[2].ano);
    printf("- Diretor: %s\n", filmes[2].diretor);

    return 0;
}

Please note that I ran this:

analise(&filmes[0], "E o Vento Levou", "1939", "Victor");
analise(&filmes[1], "teste", "998", "bar");
analise(&filmes[2], "Os Passaros", "1963", "Alfred Hitchcock");

When I run the program, the output is this:

main:
- Nome:    E o Vento Levou
- Ano:     1939Victor
- Diretor: Victor
----------------
- Nome:    teste
- Ano:     998
- Diretor: bar
----------------
- Nome:    Os Passaros
- Ano:     1963Alfred Hitchcock
- Diretor: Alfred Hitchcock

Note that for "E o Vento Levou" and "Os Passaros" the year was merged with the director's name (1939Victor and 1963Alfred Hitchcock), whereas in the case of:

analise(&filmes[1], "teste", "998", "bar");

the output is correct. I understand that I should store the year as an int, but I'm learning C and would like to better understand this memory behavior. I assume it was a typing mistake on my part.

    
asked by anonymous 02.06.2018 / 23:47

2 answers

5

This code has some problems.

  • You do not need malloc() when the storage for the string is already reserved inside the structure. You could allocate it dynamically if you wanted, and that might even make sense for the names, but then the member has to be declared as a pointer (char *) and not as an array ([tamanho]). You have to be much more careful when you do this. As written, the code also leaks memory, which would be tragic at high volume. And allocating only 2 bytes where you need many more is not right either; it only appears to work here by coincidence.
  • Not that it is a problem, but I see no reason to create a local structure, initialize its members, and then copy it to the array. Write directly into the array, with no intermediate structure, initialization, or copy.
  • Strings have a terminator, so you need to make room for it. The specific problem you are hitting is that only 4 bytes are reserved for the year, so the 4 characters of the year fill them and the terminator, a 5th byte, is written just past the end of the field. When the director's name is copied, its first character overwrites the year's terminator. When you then read the year, it has no end of its own; it only ends at the end of the director's name, so the two come out together. With 5 bytes, the terminator would be preserved and everything would work fine.

A version that works:

#include <string.h>
#include <stdio.h>

typedef struct {
    char nome[81];
    char ano[5];
    char diretor[81];
} Filme;

void analise(Filme *filme, const char *arg1, const char *arg2, const char *arg3) {
    strcpy(filme->nome, arg1);
    strcpy(filme->ano, arg2);
    strcpy(filme->diretor, arg3);
}

void carregar(Filme filmes[]) {
    analise(&filmes[0], "E o Vento Levou", "1939", "Victor");
    analise(&filmes[1], "teste", "998", "bar");
    analise(&filmes[2], "Os Passaros", "1963", "Alfred Hitchcock");
}

int main() {
    Filme filmes[1000];
    carregar(filmes);
    printf("\nmain:\n");
    for (int i = 0; i < 3; i++) {
        printf("- Nome:    %s\n", filmes[i].nome);
        printf("- Ano:     %s\n", filmes[i].ano);
        printf("- Diretor: %s\n", filmes[i].diretor);
        printf("----------------\n");
    }
}
    
03.06.2018 / 00:02
3

What happens is that your struct occupies 164 bytes in memory:

  • 80 bytes for the name, including the null terminator;
  • 4 bytes for the year (you are not accounting for the null terminator here);
  • 80 bytes for the director, including the null terminator.

The compiler will assign the following offsets to each of these fields:

  • nome: offset of 0 bytes from the beginning of the struct in memory.
  • ano: offset of 80 bytes from the beginning of the struct in memory.
  • diretor: offset of 84 bytes from the beginning of the struct in memory.

So, if we look at the first case, for example, the layout looks like this (where ∅ is the null terminator and □ is garbage, which can have any value):

E o Vento Levou∅□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□1939Victor∅□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□

When you do this:

printf("- Ano:     %s\n", filmes[0].ano);

At run time, the generated code loads the memory address of filmes, adds 164 × 0 to reach filmes[0] (164 is sizeof(Filme) and 0 is the index), and then adds 80 to reach filmes[0].ano (80 is the offset of ano within the struct). Starting at that address, printf prints a string (because of %s) that ends at the first null terminator it finds. Because the year has no null terminator of its own, the output runs on into the adjacent memory of the diretor field.

Furthermore, this code does not make any sense:

Filme _filme = {
    .nome    = malloc(2),
    .ano     = malloc(2),
    .diretor = malloc(2)
};

What you wanted was probably this:

void analise(Filme *filme, const char *arg1, const char *arg2, const char *arg3) {
    strcpy(filme->nome, arg1);
    strcpy(filme->ano, arg2);
    strcpy(filme->diretor, arg3);
}

To solve your problem without using int for the year, one possibility is to change the size of the ano field to 5. If the other fields must hold a maximum of 80 characters instead of 79, then change their size to 81 to make sure there is room for the null terminator.

Another alternative, which avoids having to resize the fields, is to specify a maximum length for the strings in printf:

printf("- Nome:    %.80s\n", filmes[0].nome);
printf("- Ano:     %.4s\n", filmes[0].ano);
printf("- Diretor: %.80s\n", filmes[0].diretor);
    
03.06.2018 / 00:19