Use VARCHAR(255) for all fields, or choose the best size for each field? [duplicate]

12

Assume the following fields, for example: Nome, Rua, Cidade, UF.

Nome would be the client's name. Some people have 2 short names and others have 5 or more, so you can have a "João Silva" or a "Maria Clara Ramos Santos da Silva"; that is, names with 10 characters and names with 27 characters or more.

Rua would be the client's street name: same situation as the previous example.

Cidade would be the client's city: same situation as the previous example.

UF would be the client's state: since I only want the abbreviation, it will never have 5 or more letters, only 2.

Is it worth using VARCHAR(255) for all of them, or is it better to choose the best size for each field? By "best" I mean not wasting space, such as having a 255-character field that only ever stores 2-letter values (UF). I read in other answers that VARCHAR() is a dynamic format and that the value in parentheses is only the maximum the field will accept. Is this true? If it is, then VARCHAR(255) would be fine for every field: the maximum is 255, but a 2-letter value would only take the space required for 2 letters, not 255.

"Choosing the best size" = Nome VARCHAR(100), Rua VARCHAR(120), Cidade VARCHAR(100), UF VARCHAR(2).

I ask because I'm about to create a database for a system that uses MySQL, but my question is about databases in general, not just MySQL.

asked by anonymous 07.05.2018 / 02:11

3 answers

10

Contrary to what you may be imagining, VARCHAR(255) does not mean the value will always take up 255 characters. It occupies essentially only what the value requires. The number of bytes will not equal the number of characters, because the encoding may use more than one byte per character and a few control bytes are stored alongside the data.

CHAR occupies the size you declare, with no additional control overhead. The state would be CHAR(2). I prefer it whenever the data has a known, fixed size.
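To make the difference concrete, here is a rough Python sketch of the storage arithmetic. It assumes the rule MySQL documents for VARCHAR (the encoded bytes plus a 1-byte length prefix when the column can never need more than 255 bytes, otherwise a 2-byte prefix) and assumes utf8mb4 at up to 4 bytes per character. The function names are made up for illustration, CHAR is modeled simply as space-padded text, and real storage engines add row-format details that this ignores.

```python
# Sketch only: models "payload bytes + length prefix" for VARCHAR.
# MAX_BYTES_PER_CHAR = 4 is an assumption (utf8mb4); other charsets differ.
MAX_BYTES_PER_CHAR = 4


def varchar_bytes(value: str, declared_chars: int) -> int:
    """Approximate bytes used by one VARCHAR(declared_chars) value."""
    if len(value) > declared_chars:
        raise ValueError("value is longer than the declared limit")
    payload = len(value.encode("utf-8"))
    # 1 length byte if the column can never need more than 255 bytes, else 2.
    prefix = 1 if declared_chars * MAX_BYTES_PER_CHAR <= 255 else 2
    return payload + prefix


def char_bytes(value: str, declared_chars: int) -> int:
    """Approximate bytes used by one CHAR(declared_chars) value (space padded)."""
    if len(value) > declared_chars:
        raise ValueError("value is longer than the declared limit")
    return len(value.ljust(declared_chars).encode("utf-8"))


for name in ("João Silva", "Maria Clara Ramos Santos da Silva"):
    print(name, varchar_bytes(name, 60), varchar_bytes(name, 255))
print("SP", char_bytes("SP", 2), varchar_bytes("SP", 2), varchar_bytes("SP", 255))
```

Under these assumptions "SP" takes 2 bytes as CHAR(2), 3 as VARCHAR(2) and 4 as VARCHAR(255), so the declared maximum changes the cost by at most one byte per value; it never reserves 255 characters.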

I use VARCHAR with the maximum number of characters that the column can actually hold. Even though I guarantee in the application that the maximum size is always respected, it is good to have a safeguard in the database in case a programming error slips through. For a name I would use 60 or so, not because 255 would take more space, but because I see no advantage in it and it has at least one disadvantage. So is VARCHAR(255) everywhere worth it? Not really.

Some databases may behave slightly differently, but more or less the same applies.
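A minimal Python sketch of the application-side half of that idea, with the declared limits mirrored in a dictionary. The `LIMITS` values and the `check_lengths` helper are hypothetical and only illustrate the approach; the column limit in the database stays in place as the last line of defense.

```python
# Hypothetical limits, mirroring the sizes declared for each column.
LIMITS = {"Nome": 60, "Rua": 120, "Cidade": 100, "UF": 2}


def check_lengths(row: dict) -> list:
    """Return the names of the fields whose value exceeds its declared limit."""
    return [
        field
        for field, value in row.items()
        if field in LIMITS and len(value) > LIMITS[field]
    ]


ok_row = {"Nome": "João Silva", "UF": "SP"}
bad_row = {"Nome": "x" * 61, "UF": "SPX"}
print(check_lengths(ok_row))
print(check_lengths(bad_row))
```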

If you are too lazy to think of the most appropriate number, then my advice is to look for another profession :)

If you read it in other answers that have several votes and come from an experienced user, it is almost 100% certain to be true.

    
07.05.2018 / 03:07
4

This answer is specific to SQL Server.

The SQL Server documentation for the varchar(n) data type states that "The storage size is the actual size of the inserted data + 2 bytes". That is, if the string to be stored in the column for a given row is 12 bytes long, it will occupy 14 bytes in the database regardless of whether the column is declared as varchar(20) or varchar(8000). So for a column whose content has at most 60 characters, as far as storage is concerned it makes no difference whether you declare varchar(60) or varchar(255).
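The storage rule quoted above can be written out as a small Python sketch. The function name is invented for illustration and it models only the "actual size + 2 bytes" statement, nothing else about how SQL Server lays out a row.

```python
def sqlserver_varchar_storage(data_bytes: int, declared_n: int) -> int:
    """Bytes stored for one varchar(declared_n) value, per the quoted rule."""
    if data_bytes > declared_n:
        raise ValueError("data does not fit in the declared size")
    # The declared size is only a ceiling; it does not enter the result.
    return data_bytes + 2


print(sqlserver_varchar_storage(12, 20))
print(sqlserver_varchar_storage(12, 8000))
```

Both calls give 14, matching the 12-byte example in the text.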

However, when the SQL Server query optimizer generates the execution plan for a query that involves certain operations, it has to estimate how much memory must be allocated in advance to run the query (the memory grant), and one of the inputs to that calculation is the row size. For columns of variable size, the engine does not keep information about the average size of the data actually stored in the column, so it uses an arbitrary value derived from the maximum size the column allows. For example, for a column declared as varchar(255), the query optimizer takes half of the declared size plus 2 bytes, 128 + 2, as the value used to estimate the row size in memory. That is, it will allow for 130 bytes even if the column content is never longer than 60 characters. More memory is reserved than necessary.

So, in the case of SQL Server, always declare the correct size; do not overestimate.
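The estimate described above is simple enough to check by hand; the Python sketch below does the arithmetic. The function name is hypothetical, rounding the half upward is an assumption chosen so that 255 gives the 128 quoted in the text, and the row count is an arbitrary figure used only to show how the difference scales.

```python
def estimated_varchar_width(declared_n: int) -> int:
    """Per-row width assumed for a varchar(declared_n) column in the estimate."""
    # Half of the declared size, rounded up (assumption), plus 2 bytes.
    return (declared_n + 1) // 2 + 2


wide = estimated_varchar_width(255)
tight = estimated_varchar_width(60)

rows = 1_000_000  # hypothetical number of rows processed by the query
extra_bytes = rows * (wide - tight)

print(wide, tight, extra_bytes)
```

With these inputs the column is estimated at 130 bytes per row when declared as varchar(255) and 32 bytes when declared as varchar(60), even though the stored data is identical in both cases.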

Besides the memory grant in the execution plans of certain queries, there are other situations where overestimating the size of variable-size columns is a disadvantage. There is an article with a detailed explanation of the negative impact of overestimating the size of variable-size columns, including images of the execution plans.

07.05.2018 / 13:40
0

For fields where you know exactly how many characters you will receive, it is good to use CHAR(number of characters), but first check what kind of data will be received: integer, text, date, and so on. Paying attention to these details saves bytes in your DBMS and in your databases (schemas).

Comments:

  • CHAR always occupies the full number of characters declared;
  • VARCHAR occupies only the characters actually entered, with no unused padding, but it spends a few extra bytes to record how much was stored.
07.05.2018 / 05:15