Table Normalization for 2nd Normal Form

0
  

A relation is in 2NF if and only if it is in 1NF and contains no partial dependencies.

Based on this definition, I am normalizing a database to 2NF. The database has two tables, shown in the figure with their respective attributes:

The candidate key cpf in the Cliente table is causing a partial dependency in this design, right? Because id alone already identifies the row.

Rearranging the Cliente table according to 2NF, I ended up defining it as follows. Is the central idea behind my normalization correct?

Is there a better way for me to normalize it?

    
asked by anonymous 28.10.2017 / 06:44

3 answers

3

Your tables Cliente and Agência are in the first normal form because no field is multivalued, so let's focus on the second normal form.

There are two candidate keys in the Cliente table: id and cpf. Clearly, from id we can obtain every other field, and the same is true of cpf. And we cannot use only a part of id or of cpf for this purpose.

In the Agencia table, the only candidate key is id, and the remaining fields are determined by id as a whole, not by just a part of id.
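To make the "part of a key" argument concrete, here is a minimal Python sketch of a partial-dependency check. It is an illustration only, not part of the original model: the functional dependencies listed for Cliente are assumptions based on the fields mentioned in this answer, and the ClienteAgencia table with its columns is entirely hypothetical, included just to show what a real partial dependency looks like.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(candidate_keys, fds):
    """List (key, proper_subset, attribute) triples that break 2NF."""
    prime = set().union(*candidate_keys)  # attributes that belong to some key
    violations = []
    for key in candidate_keys:
        # Only proper, non-empty subsets of a key can cause a partial dependency.
        for size in range(1, len(key)):
            for subset in combinations(sorted(key), size):
                for attr in sorted(closure(subset, fds) - prime):
                    violations.append((tuple(sorted(key)), subset, attr))
    return violations


# Cliente: two single-attribute candidate keys (assumed dependencies).
cliente_keys = [frozenset({"id"}), frozenset({"cpf"})]
cliente_fds = [
    (frozenset({"id"}), frozenset({"cpf", "rua", "cidade", "estado"})),
    (frozenset({"cpf"}), frozenset({"id"})),
]
print(partial_dependencies(cliente_keys, cliente_fds))  # prints []

# Hypothetical table with a composite key, for contrast.
ca_keys = [frozenset({"cliente_id", "agencia_id"})]
ca_fds = [
    (frozenset({"cliente_id", "agencia_id"}), frozenset({"data_abertura"})),
    (frozenset({"cliente_id"}), frozenset({"nome_cliente"})),
]
print(partial_dependencies(ca_keys, ca_fds))
# prints [(('agencia_id', 'cliente_id'), ('cliente_id',), 'nome_cliente')]
```

A single-attribute key has no proper non-empty subset, so the inner loops never run for Cliente. That is why neither id nor cpf can be the source of a partial dependency.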

However, the estado fields are determined by the cidade fields, which in turn are determined by the rua field in the Cliente table and the cep field in the Agencia table. That is, the cidade and estado fields are violations of the second normal form because they depend not on the primary key but on some other field. (Strictly speaking, textbooks classify a dependency on a non-key field as a transitive dependency, which is what the third normal form rules out; either way, the fix is the same.) Your attempt to normalize does not completely fix this problem, but it is already a step in that direction.

The solution would be:

  • Create an Estado table with the codigo and nome fields.

    codigo is the primary key, while nome is another candidate key because we cannot have two states with the same name.

  • Create a Cidade table with the codigo_estado, codigo_cidade, and nome fields.

    codigo_cidade is the primary key, and codigo_estado is the foreign key to the Estado table.

    The codigo_estado and nome fields together are also a candidate key because there cannot be two cities with the same name in the same state.

    The nome field alone is not a candidate key because there may be cities with the same name in different states (such as Cascavel, the name of one city in Paraná and another in Ceará).

  • Create a Logradouro table with the fields cep, codigo_cidade, bairro and nome.

    cep is the primary key.

    codigo_cidade is the foreign key to the Cidade table.

    One would imagine that codigo_cidade and nome together form a candidate key, and in almost all cities this would be true, but not always. In São Paulo, for example, there are three different streets called "Piracicaba Street". Therefore, the candidate key is codigo_cidade, nome and bairro.

  • Create an Endereco table with the fields id, cep, numero_logradouro and complemento.

    id is the primary key.

    cep is the foreign key to the Logradouro table.

    Interestingly, the street number is not always numeric. For example: "Rua João da Silva, 148-B, back house".

  • In the Agencia and Cliente tables, add a foreign key to Endereco and remove all other address fields.

This model is still not perfect, since the same street can have different ZIP codes and maybe it would be the case to have a neighborhood table and also a table of types of streets (avenue, street, square, mall, court, etc.). However, for your purpose, this should be enough.
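The tables described above could be sketched roughly as follows, using Python's built-in sqlite3 module. This is an illustrative sketch under assumptions, not a definitive schema: the column types, the NOT NULL choices, the endereco_id column name and the create_database helper are all made up for the example, and Cliente and Agencia are reduced to the fields discussed in this answer.

```python
import sqlite3

# Assumed column types; the keys follow the bullets above.
SCHEMA = """
CREATE TABLE Estado (
    codigo TEXT PRIMARY KEY,
    nome   TEXT NOT NULL UNIQUE              -- second candidate key
);
CREATE TABLE Cidade (
    codigo_cidade INTEGER PRIMARY KEY,
    codigo_estado TEXT NOT NULL REFERENCES Estado(codigo),
    nome          TEXT NOT NULL,
    UNIQUE (codigo_estado, nome)             -- same name allowed across states
);
CREATE TABLE Logradouro (
    cep           TEXT PRIMARY KEY,
    codigo_cidade INTEGER NOT NULL REFERENCES Cidade(codigo_cidade),
    bairro        TEXT NOT NULL,
    nome          TEXT NOT NULL,
    UNIQUE (codigo_cidade, nome, bairro)
);
CREATE TABLE Endereco (
    id                INTEGER PRIMARY KEY,
    cep               TEXT NOT NULL REFERENCES Logradouro(cep),
    numero_logradouro TEXT NOT NULL,         -- text, since "148-B" is valid
    complemento       TEXT
);
CREATE TABLE Cliente (
    id          INTEGER PRIMARY KEY,
    cpf         TEXT NOT NULL UNIQUE,
    endereco_id INTEGER NOT NULL REFERENCES Endereco(id)
);
CREATE TABLE Agencia (
    id          INTEGER PRIMARY KEY,
    endereco_id INTEGER NOT NULL REFERENCES Endereco(id)
);
"""


def create_database(path=":memory:"):
    """Open a SQLite database and create the sketched tables (hypothetical helper)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


if __name__ == "__main__":
    conn = create_database()
    conn.execute("INSERT INTO Estado (codigo, nome) VALUES ('PR', 'Paraná')")
    conn.execute("INSERT INTO Estado (codigo, nome) VALUES ('CE', 'Ceará')")
    # Two cities named Cascavel are accepted because they are in different states.
    conn.execute("INSERT INTO Cidade VALUES (1, 'PR', 'Cascavel')")
    conn.execute("INSERT INTO Cidade VALUES (2, 'CE', 'Cascavel')")
```

With this layout, estado is reached only through Cidade and cidade only through Logradouro, so neither is stored redundantly in Cliente or Agencia.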

    
28.10.2017 / 14:13
4

Normalization exists essentially to eliminate redundancy. Do you see any redundancy in your model?

Address

No redundancy is shown. Can a customer have more than one address? Can more than one customer share the same address? If so, it may make sense to apply normalization here. With just one address, this separation does not make sense. Neither the second normal form nor any other applies.

Note that in the second example you created a table named Cliente Endereço. Why are you putting an agency address in it? It does not make sense.

Let's assume that it is actually a general address table. Then the agency's address could live there as well. But what for?

Nothing prevents you from having a table just for addresses, but if no data is repeated, you are not doing this for the sake of normalization. It makes sense if an entity can have more than one address, or if more than one entity can share the same address.

I did not go into normalizing the address table itself because it does not seem to be the focus of the question, and in the form presented it may even be normalized already. Nothing indicates what the columns hold, not even their types, so they could very well be ids of normalized data. You also might not want to normalize this.

Agency

A table was created to separate out the agency the client belongs to. Again, can a client have more than one agency? It is the same question as with the address. What do you think you gained by doing this? What problem do you think it solved? Normalization needs to solve problems, not cause new ones. I saw no advantage in this.

Best model

Even these cases can be questioned in modern databases. It does not cost much to keep room for more than one address in the entity itself. It is not always a problem to have a repeated address record when two entities share the same address. It is a question of pragmatism.

If these cases can occur, then strictly speaking you should normalize, but an experienced developer will weigh how much effort it is worth, because normalization complicates the model and makes all the code harder to write and potentially slower. So you have to think about whether it pays off and whether it is really necessary.

On the other hand, the whole model may be wrong. The idea of separating entities into customer, supplier, bank, etc. is wrong by nature, at least in the form presented.

The entities are natural persons and legal persons (kept separate). Customer, supplier, bank, carrier, seller, employee, etc. are roles that these persons play in the organization. The role tables should hold only data about the roles. The data about the person itself should be in the natural-person and legal-person tables. This is the most correct approach, but again it is possible to pragmatically do otherwise if it makes sense. You should choose another way because it is the best one, though, not because it is the only way you know. Not every case justifies this separation, but in general it is the most correct.

I could point out other possible errors in this model, but that would be speculation because I do not know all the requirements. The mistakes I see are based on my experience, not on the actual case, so they might or might not apply.
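As a rough illustration of the person/role split, here is a minimal Python sketch using the built-in sqlite3 module. Every table and column name in it (PessoaFisica, PessoaJuridica, cliente_desde, prazo_pagamento_dias, and so on) is hypothetical, as is the create_role_model helper; it only shows one possible way to keep personal data in the person tables and role data in the role tables.

```python
import sqlite3

# Hypothetical schema: person tables hold identity data, role tables hold role data.
ROLE_SCHEMA = """
CREATE TABLE PessoaFisica (
    id   INTEGER PRIMARY KEY,
    cpf  TEXT NOT NULL UNIQUE,
    nome TEXT NOT NULL
);
CREATE TABLE PessoaJuridica (
    id           INTEGER PRIMARY KEY,
    cnpj         TEXT NOT NULL UNIQUE,
    razao_social TEXT NOT NULL
);
CREATE TABLE Cliente (
    id                 INTEGER PRIMARY KEY,
    pessoa_fisica_id   INTEGER REFERENCES PessoaFisica(id),
    pessoa_juridica_id INTEGER REFERENCES PessoaJuridica(id),
    cliente_desde      TEXT NOT NULL,
    -- exactly one of the two person references must be filled in
    CHECK ((pessoa_fisica_id IS NULL) <> (pessoa_juridica_id IS NULL))
);
CREATE TABLE Fornecedor (
    id                   INTEGER PRIMARY KEY,
    pessoa_fisica_id     INTEGER REFERENCES PessoaFisica(id),
    pessoa_juridica_id   INTEGER REFERENCES PessoaJuridica(id),
    prazo_pagamento_dias INTEGER NOT NULL,
    CHECK ((pessoa_fisica_id IS NULL) <> (pessoa_juridica_id IS NULL))
);
"""


def create_role_model(path=":memory:"):
    """Open a SQLite database and create the sketched person and role tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(ROLE_SCHEMA)
    return conn


if __name__ == "__main__":
    conn = create_role_model()
    conn.execute(
        "INSERT INTO PessoaFisica (id, cpf, nome) VALUES (1, '000.000.000-00', 'Maria')"
    )
    # The same person plays two roles; her name and CPF are stored only once.
    conn.execute(
        "INSERT INTO Cliente (pessoa_fisica_id, cliente_desde) VALUES (1, '2017-10-28')"
    )
    conn.execute(
        "INSERT INTO Fornecedor (pessoa_fisica_id, prazo_pagamento_dias) VALUES (1, 30)"
    )
```

The two nullable references plus a CHECK constraint are just one option; whether that is preferable to a shared person table is exactly the kind of pragmatic trade-off discussed above.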

Normalization

To know how to normalize, you need to know your objectives: being the most formally correct, the fastest to develop, the easiest to maintain, the best performing, what the teacher taught or the boss decided even if it is not the best, or whatever else. You cannot normalize blindly. Where to normalize and where to stop is something you will learn over time.

28.10.2017 / 13:01
1

Based on the conversion examples in the book, below is an example of going from 1NF to 2NF and from 2NF to 3NF.

Based on the drawing in the book, I made a drawing to try to help with your question.

As your question limited the conversion to 2NF, I did not convert from 2NF to 3NF as shown in the book example.

  

Treat fully underlined words as primary keys, e.g. (CPF), and words preceded by an underscore as foreign keys, e.g. (_CEP).

I added some non-key attributes to the relationship table CLIENT_AGENT only to make the illustration more coherent, since you did not mention any relevant attributes for this table.

Source: Elmasri, Ramez; Navathe, Shamkant B. Database Systems, 6th edition, 2011.

    
02.11.2017 / 07:58