Store values representing a user's 'sex' in databases

9

I have some doubts about the representation of some fields in the database. How do I store fields as "sex" in tables in the database? In the literal form "masculine" and "feminine" or through numbers that represent each sex as "1" and "2", respectively? This doubt extends to other fields, for example, there is a need to store the weight classification of an object in "very light", "light", "medium", "heavy" and "very heavy". Should I store a representative number for each category?

    
asked by anonymous 08.04.2015 / 14:04

3 answers

8

It depends on each case.

What I never recommend but there may be exceptions is to use the full description. It will not be wrong to use the description if you have a good reason for it. If you do not use the description, and there is one that should be used in some situations you will have to treat it in some way.

In many cases a code is enough. Even by external force, legislation for example, it may be that the correct one is to put 1 or 2 and that's all that matters. In another one you must put M or F and need no description beyond this. So you should just use this. That is, the column is already self-descriptive this way.

But you may need an extra description. There are three solutions:

  • You allow for a stored procedure to resolve this or for the application to handle this as this description does not usually change. Many people will criticize this because the database is not being used properly. I agree with this, but I make the caveat that there may be reason, especially to let the application handle this when the description may change depending on the execution context. Probably will not occur with sex but may occur with other types.
  • Create an auxiliary table with the descriptions and make a relationship. A very common solution. This solution is highly recommended. Contrary to what many forget is still another point of the database or application to know how to manipulate this information and adequately if the interest is to receive the description of the data. Of course in this case a 1 or 2 will be ID for the referenced table. This ID is usually a number but does not have to be, it can be a character and use M or F .
  • Create an enumeration if the system you use allows. Some criticize its use because the resource may have limitations. But it can solve well in most cases. In the background is a standardized way of the database to use an auxiliary table. You will still have to do something in the data recovery to get the description.
  • As I said, I find it more organized and flexible in this way, especially the number 2, so is the normalized form. But if you want to make it easier to query, give a little more performance, then playing the direct description is the solution, so you get what you want without having to use artifacts in query .

    But remember this may bring some headaches if you have any reason to change these descriptions. Nothing that can not be solved in cases of data as simple as described. In more complex data it may be that you can never disambiguate if there is a change. Do not do this with city names, for example.

    Just will not use Boolean. I've seen this when there were only two values and this is wrong.

    You have a ISO on the use of sex .

        
    08.04.2015 / 14:24
    1

    In the case of fixed information, such is the case. There are only 3 possibilities:

    • Male
    • Female
    • Undefined (in case your system allows the person to not define sex).

    So, this data is fixed and does not need a table in your database, that would be a complete waste of resources.

    Create a class with constants and that's it.

    <?php
    abstract class Sexo
    {
        const Indefinido = 0;
        const Masculino = 1;
        const Feminino = 2;
    
       public getStringValue($value)
       {
           switch($value)
           {
               case self::Indefindo:
                   return "Indefinido";
               case self::Masculino:
                   return "Masculino";
               case self::Feminino:
                   return "Feminino";
               default:
                   return "Valor não é válido";
           }
       }
    }
    ?>
    

    When you need to use just do the following:

    <?php
        $meuSexo = Sexo::Masculino;
        //var_dump($meuSexo) => 1
    
        $string = Sexo.getStringValue($meuSexo);
        //var_dump($string) => Masculino
    ?>
    
        
    28.05.2015 / 21:00
    -1

    In order to curiosity:

    In case of weight of an object in "very light", "light", "medium", "heavy" and "very heavy". We can use some statistical methods to define a range.

    So if we have 50 products, with the following weight list:

    1, 1, 1, 3, 4, 4, 5, 5, 6, 6,
    7, 7, 7, 7, 7, 8, 8, 8, 9, 9,
    10, 10, 10, 10, 11, 15, 16, 16, 16, 17,
    18, 18, 18, 18, 18, 40, 40, 40, 40, 40,
    47, 47, 49, 49, 50, 55, 58, 59, 90.

    We have the highest value as 90 and the smallest as 1, so our sampling amplitude is AA = 89 (90-1).

    Then we calculate i, which theoretically would be the number of clusters we need to have: i = 1 + 3.3 log 50 = 6.6 ~ 7 (normal rounding).

    So the ideal would be to have 7 weight classifications, but for this example we will use the 5 already classified by you, so consider i = 5.

    Now we need to calculate what the intervals will be. AA / i = 89/5 = 17.8 ~ 18 (always rounded up).

    Very Light: 1 | - 19
    Light: 19 | - 37
    Average: 37 | - 55
    Heavy: 55 | - 73
    Very Heavy: 73 | - 91

    <?php
    abstract class Peso
    {
       public getStringValue($value)
       {
           if ($value < 19){
              return "Muito Leve";
           }elseif($value < 37){
              return "Leve";
           }elseif($value < 55){
              return "Médio";
           }elseif($value < 73){
              return "Pesado";
           }else{
              return "Muito Pesado";
           }
       }
    }
    ?>
    
        
    07.04.2016 / 14:11