What are unions? Why use them inside structs?

13

I would like to understand the differences between union and struct. I noticed that members of both are accessed in the same way:

u.membro
uPrt->membro

s.membro
sPrt->membro

In practice I've seen a lot of code that uses unions inside structs. What is the advantage of doing this? Does it improve performance or memory usage?

A sample snippet (correct me if I'm wrong):

struct pessoa {
 char name[50];
 union {
  int idade;
  float peso;
 };
};
    
asked by anonymous 13.01.2015 / 12:53

3 answers

17

The main advantage lies in how memory is organized and reused.

Variables in a struct are laid out at sequential addresses, so the members that make up the struct sit side by side in memory (possibly with padding between them for alignment).

Your example is not a good fit for a union, so I will not use it.

Imagine that we have a grocery item. This item has a name, a price, and a size. The size can be given either as a volume (1 liter) or as a weight (1 kg). So we could create the following struct:

struct item {
    char nome[50];
    float preco;
    float volume; // in liters.
    unsigned peso; // in grams.
};

In this struct item, memory would be allocated as follows (I'm assuming byte alignment, with no padding, for simplicity):

0-------------49-50-------53-54--------57-58-------61
      nome          preco       volume       peso

Note, however, that we do not buy milk by weight, but by volume. So for this item the peso field of struct item would never hold a valid value, yet it would always occupy memory.

The same goes for cheese: it is sold in grams, not in liters.
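The diagram above assumes byte alignment. On a real compiler the offsets may differ because of padding, so here is a minimal sketch for inspecting them with the standard offsetof macro. The helper names print_layout and padding_before_preco are mine, for illustration only.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>

struct item {
    char nome[50];
    float preco;
    float volume;   /* in liters */
    unsigned peso;  /* in grams */
};

/* Hypothetical helper: bytes of padding the compiler inserted
   between the end of nome and the start of preco. */
size_t padding_before_preco(void)
{
    return offsetof(struct item, preco) - sizeof(((struct item *)0)->nome);
}

/* Hypothetical helper: prints where each member actually starts. */
void print_layout(void)
{
    printf("nome   at %zu\n", offsetof(struct item, nome));
    printf("preco  at %zu\n", offsetof(struct item, preco));
    printf("volume at %zu\n", offsetof(struct item, volume));
    printf("peso   at %zu\n", offsetof(struct item, peso));
    printf("total size: %zu\n", sizeof(struct item));
}
```

On platforms where float is aligned to 4 bytes, preco typically lands at offset 52 rather than 50. The members still follow one another in declaration order, and peso still takes up its own bytes.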

How can we reduce the memory used? We can declare the volume and peso fields inside a union:

struct item {
    char nome[50];
    float preco;
    union { // anonymous union (C11).
        float volume;
        unsigned peso;
    };
};

Now our memory layout will be:

0-------------49-50-------53-54-------------57
      nome          preco       volume/peso

This way, when we access the volume field of struct item, the compiler knows that we are treating that memory region as a float and handles it accordingly. The same goes for the peso field: the compiler knows it is an unsigned and applies the unsigned rules.
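To check the saving on your own compiler, a small sketch that compares the two layouts side by side. The names item_flat, item_union and bytes_saved are hypothetical; they only exist so both versions can live in the same file.

```c
#include <stddef.h>  /* size_t, offsetof */

/* Hypothetical name: the version that keeps both fields. */
struct item_flat {
    char nome[50];
    float preco;
    float volume;
    unsigned peso;
};

/* Hypothetical name: the version that overlaps them. */
struct item_union {
    char nome[50];
    float preco;
    union {            /* anonymous union: requires C11 */
        float volume;
        unsigned peso;
    };
};

/* How many bytes per item the union version saves. */
size_t bytes_saved(void)
{
    return sizeof(struct item_flat) - sizeof(struct item_union);
}
```

Here offsetof(struct item_union, volume) and offsetof(struct item_union, peso) are equal, which is exactly the overlap shown in the diagram. The exact number of bytes saved depends on the platform's type sizes and padding.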

But, what if we do:

struct item it;
it.peso = 2;
it.volume = 0.0f;
printf("%u", it.peso);

The output will not be 2, the value we stored in peso, but the bit pattern of 0.0 in IEEE 754 interpreted as an unsigned. Coincidentally, that bit pattern is all zeros, so the output will be 0.

Why?

Remember that the volume and peso fields occupy the same memory region, so both assignments wrote to the same address.
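The 0.0 case hides the effect because its bit pattern happens to be all zeros. A minimal sketch with other values makes it visible. It assumes float is a 32-bit IEEE 754 value (true on most current platforms, but not guaranteed by the C standard), and the names float_bits and bits_of are hypothetical.

```c
#include <stdint.h>

/* Assumption: float is a 32-bit IEEE 754 value. */
union float_bits {   /* hypothetical name */
    float f;
    uint32_t u;
};

/* Returns the raw bit pattern of a float. */
uint32_t bits_of(float value)
{
    union float_bits fb;
    fb.f = value;   /* write through one member... */
    return fb.u;    /* ...read the same bytes through the other */
}
```

Under that assumption, bits_of(1.0f) gives 0x3F800000 (1065353216 in decimal), which has nothing to do with the number 1. In C this kind of read reinterprets the stored bytes; in C++ reading a member other than the last one written is undefined behavior.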

So if we read the value through the "wrong" field, we can get results that are absurd for our problem domain. So how do we know which field to use?

We can add a flag indicating this:

struct item {
    char nome[50];
    float preco;
    bool porVolume; // requires <stdbool.h>.
    union {
        float volume;
        unsigned peso;
    };
};

And so if we wanted to print the contents of an item, we could use:

if ( it.porVolume ) {
    printf("%s\t%.2f\t%.3f", it.nome, it.preco, it.volume);
} else {
    printf("%s\t%.2f\t%u", it.nome, it.preco, it.peso);
}

This pattern repeats every time we access the fields of the union.
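Since the check repeats everywhere, one option is to keep the flag and the union in sync through a few small functions. This is only a sketch: item_por_volume, item_por_peso and item_format are names I made up, not part of any library.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct item {
    char nome[50];
    float preco;
    bool porVolume;    /* true: volume is valid; false: peso is valid */
    union {            /* anonymous union: requires C11 */
        float volume;  /* in liters */
        unsigned peso; /* in grams */
    };
};

/* Hypothetical constructor: sets the flag together with the matching field. */
struct item item_por_volume(const char *nome, float preco, float litros)
{
    struct item it;
    memset(&it, 0, sizeof it);
    strncpy(it.nome, nome, sizeof it.nome - 1); /* stays NUL-terminated */
    it.preco = preco;
    it.porVolume = true;
    it.volume = litros;
    return it;
}

/* Hypothetical constructor for items sold by weight. */
struct item item_por_peso(const char *nome, float preco, unsigned gramas)
{
    struct item it;
    memset(&it, 0, sizeof it);
    strncpy(it.nome, nome, sizeof it.nome - 1);
    it.preco = preco;
    it.porVolume = false;
    it.peso = gramas;
    return it;
}

/* Formats the item into buf, reading only the member the flag marks as valid. */
int item_format(const struct item *it, char *buf, size_t len)
{
    if (it->porVolume)
        return snprintf(buf, len, "%s\t%.2f\t%.3f",
                        it->nome, (double)it->preco, (double)it->volume);
    return snprintf(buf, len, "%s\t%.2f\t%u",
                    it->nome, (double)it->preco, it->peso);
}
```

With this, the rest of the code never touches volume or peso directly, so the flag cannot drift out of step with the field that was actually written.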

Besides using it inside a struct, we can use a union as a type in its own right:

union pesoVolume {
    float volume;
    unsigned peso;
};

union pesoVolume pv;
pv.volume = 0.0f;

It works identically, except that the union is no longer inside a struct.

In C, the fields that make up a union can even have different sizes; the compiler reserves at least enough memory for the largest field, possibly rounded up to satisfy alignment. That is:

union u1 {
    float f1;    // 4.
    unsigned f2; // 4.
};
printf("%zu", sizeof(union u1)); // 4.

union u2 {
    float f1;    // 4.
    long int f2; // 8 (4 on some platforms).
    char f3[20]; // 20.
};
printf("%zu", sizeof(union u2)); // 20, or 24 where long int is aligned to 8 bytes.
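The size of a union is not always exactly the size of its largest member, because the compiler may add padding at the end so that arrays of the union stay aligned. A small sketch to measure that on your platform; largest_member and tail_padding are hypothetical helper names.

```c
#include <stddef.h>  /* size_t */

union u2 {
    float f1;
    long int f2;
    char f3[20];
};

/* Size of the largest member, computed by hand. */
size_t largest_member(void)
{
    size_t m = sizeof(float);
    if (sizeof(long int) > m) m = sizeof(long int);
    if (sizeof(char[20]) > m) m = sizeof(char[20]);
    return m;
}

/* Bytes the compiler appended after the largest member, if any. */
size_t tail_padding(void)
{
    return sizeof(union u2) - largest_member();
}
```

On a typical 64-bit Linux system, where long int is 8 bytes and 8-byte aligned, sizeof(union u2) comes out as 24 and tail_padding() as 4. Where long int is 4 bytes, the union is usually exactly 20 bytes.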

When to use?

Today it no longer makes much sense, I think. In the past, memory was a scarce resource, which justified these savings. Today a typical PC has 4 GB, and machines with 8 GB or more are not uncommon.

In some cases, however, a union simplifies parameter passing in an API, and can be used when the programmer sees an advantage in it. This happens in some Win32 calls. Frankly, I do not recommend it, because some languages may not support this layout, which causes interoperability problems.

Why didn't I use your example?

For a person, we would probably want to keep both the idade and the peso at the same time.

    
13.01.2015 / 13:36
7

The difference between union and struct is that a struct is an "AND" and stores all of its fields, while a union is an "OR" and all of its fields share the same memory location. If you update one field of a union, all the other fields change at the same time, to some junk value.

The only case in which you should use a union is when you want to save memory and you are sure that only one field is needed at a time. As for why unions appear inside structs: a problem with unions in C is that there is no way to know which field is in use and which fields hold "garbage" values, so it is common to add an enum to record this. For example, this struct represents tokens in a programming language:

struct exp {
    enum {LIT,VAR} type;
    union {
        int lit;
        char *var;
    } value;
};

A token can be either a number or a variable name. The type field says what kind of token it is, and the value field stores the token's value (an integer for a numeric token, or a pointer to the string for an identifier). With the union, struct exp has a more compact representation in memory. You just have to be careful to access the lit field only after checking that the type field contains LIT, and so on.
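In practice that check usually turns into a switch on the type field. A minimal sketch of the idea follows; exp_lit, exp_var and exp_format are names I invented for the example.

```c
#include <stdio.h>

struct exp {
    enum { LIT, VAR } type;
    union {
        int lit;
        char *var;
    } value;
};

/* Hypothetical constructor: sets the tag together with the matching member. */
struct exp exp_lit(int n)
{
    struct exp e;
    e.type = LIT;
    e.value.lit = n;
    return e;
}

/* Hypothetical constructor; the struct only borrows the pointer. */
struct exp exp_var(char *name)
{
    struct exp e;
    e.type = VAR;
    e.value.var = name;
    return e;
}

/* Writes a textual form of the token into buf, touching only
   the union member that the tag says is valid. */
int exp_format(const struct exp *e, char *buf, size_t len)
{
    switch (e->type) {
    case LIT:
        return snprintf(buf, len, "LIT(%d)", e->value.lit);
    case VAR:
        return snprintf(buf, len, "VAR(%s)", e->value.var);
    }
    return -1; /* unknown tag */
}
```

Because the switch covers every enumerator, many compilers can warn when a new token kind is added to the enum but not handled here.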

    
13.01.2015 / 13:36
4

Vinícius Gobbo and hugomg have already explained what a union is and how it differs from a struct. However, I'd like to add a usage example.

union Valor
{
    uint32_t dword;

    struct
    {
        uint16_t word0;
        uint16_t word1;
    };

    struct
    {
        uint8_t byte0;
        uint8_t byte1;
        uint8_t byte2;
        uint8_t byte3;
    };
};

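Which can be represented as follows:

+-------------------------------+
|             dword             |
+---------------+---------------+
|     word0     |     word1     |
+-------+-------+-------+-------+
| byte0 | byte1 | byte2 | byte3 |
+-------+-------+-------+-------+

In the diagram, you can see that word0 and word1 sit in parallel with dword, as do the bytes.

In this example, you can do operations like the following: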

union Valor foo;
foo.dword = 305;

printf("%d", foo.word0); // Shows the first word of dword.
printf("%d", foo.word1); // Shows the second word of dword.

printf("%d", foo.byte0); // Shows the first byte of dword.
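One caveat: which half of dword shows up in word0 depends on the byte order of the machine. The sketch below uses the same union to detect that, and adds a shift/mask alternative that does not depend on it. is_little_endian and low_word are hypothetical helper names.

```c
#include <stdint.h>

union Valor {
    uint32_t dword;
    struct { uint16_t word0; uint16_t word1; };       /* anonymous struct: C11 */
    struct { uint8_t byte0, byte1, byte2, byte3; };
};

/* Returns 1 when the least significant byte is stored first (little-endian). */
int is_little_endian(void)
{
    union Valor v;
    v.dword = 1;
    return v.byte0 == 1;
}

/* Portable alternative: extracts the low 16 bits arithmetically,
   regardless of how the bytes are laid out in memory. */
uint16_t low_word(uint32_t x)
{
    return (uint16_t)(x & 0xFFFFu);
}
```

On a little-endian machine (x86, most ARM setups), foo.dword = 305 gives word0 equal to 305, word1 equal to 0 and byte0 equal to 49, since 305 is 0x0131. On a big-endian machine the value appears in word1 instead.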
    
13.01.2015 / 14:07