Is it good practice to keep foreign key columns redundant?

7

Consider that we have the models A , B , C and D , each referencing the previous one, i.e. A has many B s, each B has many C s, and each C has many D s:

No redundancy

A
|  id | ... |
------------
| ... | ... |

B
|  id | a_id | ... |
--------------------
| ... |  ... | ... |

C
|  id | b_id | ... |
--------------------
| ... |  ... | ... |

D
|  id | c_id | ... |
--------------------
| ... |  ... | ... |
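
As a minimal sketch, the normalized model could be declared like this in SQL (the column types here are assumptions for illustration only):

    CREATE TABLE a (
        id INTEGER PRIMARY KEY
        -- ... other columns
    );

    CREATE TABLE b (
        id   INTEGER PRIMARY KEY,
        a_id INTEGER NOT NULL REFERENCES a (id)
        -- ... other columns
    );

    CREATE TABLE c (
        id   INTEGER PRIMARY KEY,
        b_id INTEGER NOT NULL REFERENCES b (id)
        -- ... other columns
    );

    CREATE TABLE d (
        id   INTEGER PRIMARY KEY,
        c_id INTEGER NOT NULL REFERENCES c (id)
        -- ... other columns
    );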

Would it be advisable to include extra columns in C and D indicating which records in A and B they reference?

With redundancy

C
|  id | a_id | b_id | ... |
---------------------------
| ... |  ... |  ... | ... |

D
|  id | a_id | b_id | c_id | ... |
----------------------------------
| ... |  ... |  ... |  ... | ... |
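
In SQL, the redundant variant would add the extra foreign key columns (again, types assumed for illustration):

    CREATE TABLE c (
        id   INTEGER PRIMARY KEY,
        a_id INTEGER NOT NULL REFERENCES a (id), -- redundant: derivable through b
        b_id INTEGER NOT NULL REFERENCES b (id)
        -- ... other columns
    );

    CREATE TABLE d (
        id   INTEGER PRIMARY KEY,
        a_id INTEGER NOT NULL REFERENCES a (id), -- redundant
        b_id INTEGER NOT NULL REFERENCES b (id), -- redundant
        c_id INTEGER NOT NULL REFERENCES c (id)
        -- ... other columns
    );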

It's a redundancy, but I usually do this to make querying easier, since it avoids JOIN s. I also believe that saving those JOIN s improves performance.
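
To illustrate the trade-off, here is roughly what fetching all the D s that belong to a given A looks like in each model (the literal 42 is just an example key):

    -- Normalized model: walk the chain of foreign keys with JOINs.
    SELECT d.*
      FROM d
      JOIN c ON c.id = d.c_id
      JOIN b ON b.id = c.b_id
     WHERE b.a_id = 42;

    -- Redundant model: filter directly on the copied key.
    SELECT d.*
      FROM d
     WHERE d.a_id = 42;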

Is this considered good or bad practice? (If not in all cases, at least when the data is immutable) Is there any better solution for this?

    
asked by anonymous 23.06.2014 / 19:56

2 answers

5
  • You did not ask for a personal opinion, you asked whether it is good practice. Besides, Stack Overflow is not meant for answers based on personal opinion, and it recommends that you neither ask questions of this type nor write answers based on personal opinions.

  • No, it is not good practice. It is actually quite bad. This is not a personal opinion; see the points below. Also, if you do want personal opinions, ask this kind of question in specialized forums like SQL Server Central (and be prepared for possibly "harsh" answers).

  • It is neither a matter of DRY nor even of normalization. You should denormalize only when strictly necessary (for performance), and those cases are rare.

  • You will only write fewer joins; the DBMS will not necessarily give you more performance. On the contrary, it can get worse: as you include redundant data and index it, you also take up more disk space (more I/O) and demand more CPU and memory.

  • Usually the number of joins does not matter. If you have indexed well and normalized correctly, you should already achieve good performance.

  • If you have a very large mass of data and complex relationships, there are far better ways to gain performance, such as views, table partitioning, fine index tuning, etc. (see the sketch below).
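
  As a hedged sketch of those alternatives (all names here are illustrative): index the foreign keys and expose the pre-joined shape as a view, so queries stay as simple as in the redundant model while the data stays normalized:

      -- Index the foreign keys so the chained JOINs resolve efficiently.
      CREATE INDEX idx_b_a_id ON b (a_id);
      CREATE INDEX idx_c_b_id ON c (b_id);
      CREATE INDEX idx_d_c_id ON d (c_id);

      -- A view that exposes the ancestor keys logically, not physically.
      CREATE VIEW d_with_ancestors AS
      SELECT d.id, d.c_id, c.b_id, b.a_id
        FROM d
        JOIN c ON c.id = d.c_id
        JOIN b ON b.id = c.b_id;

      -- Queries now look like the redundant model, without duplicated storage:
      -- SELECT id FROM d_with_ancestors WHERE a_id = 42;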

  26.08.2014 / 21:32

6

    My personal opinion is that the benefit does not compensate for the cascading effect of updates and integrity validations on a large mass of data.

    • Several DBMSs build execution plans that take the dependencies between structures into account. By increasing the complexity of the relationships described by the foreign keys, you may be hurting the evaluation of that plan;
    • If you delete or modify an item in table A , all the dependent B , C and D rows will still need to be validated for consistency. The proposed model offers no benefit to data change operations ( UPDATE / DELETE ); see the sketch after this list.
    • If you change the type of A 's id column, you will need to make changes in every table that copies it, not just in B .
    • Several ORM tools depend on a correct representation of the interdependence between tables in the database.
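
    As a small sketch of that validation burden (constraint names are illustrative): with the redundant columns, D alone carries three foreign keys, so the DBMS must check b.a_id , c.a_id and d.a_id whenever a row of A changes, instead of only b.a_id in the normalized model:

        ALTER TABLE d ADD CONSTRAINT fk_d_a FOREIGN KEY (a_id) REFERENCES a (id);
        ALTER TABLE d ADD CONSTRAINT fk_d_b FOREIGN KEY (b_id) REFERENCES b (id);
        ALTER TABLE d ADD CONSTRAINT fk_d_c FOREIGN KEY (c_id) REFERENCES c (id);

        -- This single statement now triggers referential checks in B, C and D:
        DELETE FROM a WHERE id = 42;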

    If the only advantage of this implementation is the agility in writing queries , I would make the extra effort to describe the interrelationships correctly.

    (In addition, there is a system development principle called DRY ( Don't Repeat Yourself ), which some DBAs also follow because they consider it a good choice for data hygiene.)

        
    24.06.2014 / 01:46