Why do strings get special treatment?

5

I know that arrays are static structures, used when the size is known in advance. But when it comes to initialization, where the size is already determined by the initializer, I would like to know why other kinds of arrays cannot be written in pointer notation the way C strings can. For example, I can write:

#include <stdio.h>

int main(void)
{
   char *sstring = "Olá, Mundo!";

   char schars[] = {'O', 'l', 'a', '\0'};

   int mnumbers[] = {1, 2, 3, 4, 5};

   printf("Sstring : %s\n", sstring);
   printf("Schars : %s\n", schars);
   printf("Mnumber : %d\n", *mnumbers);

   return 0;
}

On the other hand, I cannot write:

char *sstring = "Olá, Mundo!";

char *schars = {'O', 'l', 'a', '\0'};

int *mnumbers = {1, 2, 3, 4, 5};

This fails even though the size is known, since I am initializing the arrays. Why does this happen? Why can an array of chars initialized with braces not be treated as a pointer?

The same happens with arrays and pointers of higher dimension:

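#include <stdio.h>
#include <stdlib.h>

int main(void)
{
   int arrayInt[][3] = {{1, 2, 3}, {4, 5, 6}};

   char *arrayChar[] = {"PALAVRA", "teste", "HEY"};

   char **names = malloc(3 * sizeof(char *));

   *names = "Teste";
   *(names + 1) = "de";
   *(names + 2) = "Arrays";

   printf("%d\n", arrayInt[1][2]);
   printf("%c\n", arrayChar[0][4]);

   printf("Nome: %s %s %s\n", *names, *(names + 1), *(names + 2));

   return 0;
}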

It is not possible to write int **arrayInt = {{1, 2, 3}, {4, 5, 6}}; and one dimension of the array still has to be given, even when the array is initialized explicitly. A mixed form such as char *arrayChar[] = {"PALAVRA", "teste", "HEY"}; is still possible, but char *arrayChar[] = {{'O', 'l', 'a', '\0'}, {'M', 'u', 'n', 'd', 'o', '\0'}}; is not. It seems that the use of [] brackets is tied to initialization with {} braces. I would like to know the reason.

    
asked by anonymous 21.05.2015 / 15:12

1 answer

5

The short answer is yes: C has special handling for strings, "just because."

The long answer is that you are assuming arrays and pointers are fully interchangeable in C, which is not true. What actually happens is that in many situations C automatically converts an array into a pointer to its first element. In your question, this difference shows up in two places:

1) Something has to allocate the memory for the array.

Strings in C are a special case. When you use a string literal, the C compiler allocates space for it, typically in the read-only data area of the executable. It can also apply optimizations, such as storing two string literals with the same content in a single place.

For non-string types, and even for mutable arrays of characters, you need to declare an array somewhere to hold your data. The C compiler will not put the data somewhere special for you.

2) A pointer to a pointer and a multidimensional array are not the same thing.

Take a 3x3 matrix as an example:

int mat[3][3] = {
  00, 10, 20,
  30, 40, 50,
  60, 70, 80,
};

The representation in memory is a single array of 9 elements, with one row after the other.

mat --> [ 00 10 20 30 40 50 60 70 80 ]

When you access mat[i][j], the compiler fetches element 3*i + j for you. Notice that it needs to know the number of columns in each row to be able to do this.

A matrix built with int ** has to be stored differently.

// I think this needs C99 to compile.
// But even if it does not run, it gives the idea...

int linhaA[] = {00, 10, 20};
int linhaB[] = {30, 40, 50};
int linhaC[] = {60, 70, 80};

int *linhas[3] = {linhaA, linhaB, linhaC};

int **mat = linhas; // An automatic type conversion happens here

In memory this looks like:

mat --> [linhaA, linhaB, linhaC]
           |       |       |
           |       |       +-----> [60, 70, 80]
           |       +-------------> [30, 40, 50]
           +---------------------> [00, 10, 20]

Note that we have an array of pointers, and that the data need not live in a single array, nor do the rows need to be stored in order. To access mat[i][j] we simply do two dereferences, one after the other.

In the end, what all this means is that an int [3][3] (a two-dimensional 3x3 array of integers) can be automatically converted to an int (*)[3] (a pointer to an array of 3 integers), but cannot be converted to an int ** (a pointer to a pointer to integer). The root of it all is that when we use a two-dimensional array, the compiler needs to know the size of every dimension except the leftmost one (the number of rows) in order to access an element.

21.05.2015 / 15:56