Difference between a relationship table and an inline record

2

I have a structure of companies and categories organized into 3 tables:

  • empresa -> Record of all registered companies.
  • categoria -> Record of all registered categories.
  • rel_categoria -> Relationship between empresa and categoria.

I store the data in the rel_categoria table as follows:

id (primary_key) | id_empresa | id_categoria

When I need to get the categories of a company, I select them through this relationship. So far, so good.
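For reference, the three-table layout and the related selection can be sketched as below. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `nome` column, the category names, the UNIQUE constraint and the helper `categorias_da_empresa` are assumptions made for illustration, not part of the original schema.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
CREATE TABLE empresa (
    id INTEGER PRIMARY KEY,
    nome_empresa TEXT NOT NULL
);
CREATE TABLE categoria (
    id INTEGER PRIMARY KEY,
    nome TEXT NOT NULL               -- hypothetical column name
);
CREATE TABLE rel_categoria (
    id INTEGER PRIMARY KEY,
    id_empresa INTEGER NOT NULL REFERENCES empresa(id),
    id_categoria INTEGER NOT NULL REFERENCES categoria(id),
    UNIQUE (id_empresa, id_categoria)  -- assumed: no duplicate links
);
""")

conn.execute("INSERT INTO empresa (id, nome_empresa) VALUES (1, 'Lojinha')")
conn.executemany(
    "INSERT INTO categoria (id, nome) VALUES (?, ?)",
    [(n, f"Categoria {n}") for n in (1, 7, 14, 16)],  # made-up names
)
conn.executemany(
    "INSERT INTO rel_categoria (id_empresa, id_categoria) VALUES (?, ?)",
    [(1, 1), (1, 7), (1, 14), (1, 16)],
)


def categorias_da_empresa(conn, id_empresa):
    """Return (id, nome) for every category linked to one company."""
    return conn.execute(
        """
        SELECT c.id, c.nome
        FROM rel_categoria AS r
        JOIN categoria AS c ON c.id = r.id_categoria
        WHERE r.id_empresa = ?
        ORDER BY c.id
        """,
        (id_empresa,),
    ).fetchall()


print(categorias_da_empresa(conn, 1))
```

With foreign keys enforced, a link to a category that does not exist is rejected by the database itself, which a free-text column cannot guarantee.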

But I have the following question: why use an extra table to save these records when I could store them in a column of the empresa table?

Example in the empresa table:

id | nome_empresa |  categorias  | [..etc..]
 1 | 'Lojinha'    | 1, 7, 14, 16 | ...

The only reason that comes to mind is counting, e.g.: how many companies use category X?
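The counting case can be sketched against the relationship table as follows. It is an illustrative sketch only: SQLite (via Python's sqlite3) stands in for MySQL, the rows for companies 2 and 3 are invented, and `empresas_por_categoria` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL
conn.execute(
    "CREATE TABLE rel_categoria ("
    " id INTEGER PRIMARY KEY, id_empresa INTEGER, id_categoria INTEGER)"
)
# Invented links: which company uses which category.
conn.executemany(
    "INSERT INTO rel_categoria (id_empresa, id_categoria) VALUES (?, ?)",
    [(1, 1), (1, 7), (1, 14), (1, 16), (2, 7), (3, 7), (3, 14)],
)


def empresas_por_categoria(conn):
    """Map each category id to the number of distinct companies using it."""
    cursor = conn.execute(
        "SELECT id_categoria, COUNT(DISTINCT id_empresa) "
        "FROM rel_categoria GROUP BY id_categoria"
    )
    return dict(cursor)


print(empresas_por_categoria(conn))
```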

Is there any other reason? Any structural issues? Performance? Etc. I manage the database with MySQL and PHP.

    
asked by anonymous 14.06.2016 / 21:09

2 answers

3

The structure you have set up, with company and category tables plus a third relationship table, follows the concepts of a relational database.

From here we get into topics such as modeling, normalization, normal forms, etc.

Assuming that a COMPANY may have several CATEGORIES, and a CATEGORY may be present in several COMPANIES, we have an N:N (many-to-many) relationship, which is represented in the database by this relationship table.

This also affects performance: if you build the normalized structure, with primary and foreign keys, indexes and the like, queries can be optimized.

Imagine storing all the categories of a company in a varchar column: query performance drops because you need string handling to find the desired information, among other things.
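To illustrate the string handling involved, here is a small sketch of searching a comma-separated column. It assumes SQLite (via Python's sqlite3) as a stand-in for MySQL, and the second company is invented. The naive pattern shows a typical mistake: looking for category 1 also matches 14 and 16.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL
conn.execute(
    "CREATE TABLE empresa ("
    " id INTEGER PRIMARY KEY, nome_empresa TEXT, categorias TEXT)"
)
conn.executemany(
    "INSERT INTO empresa VALUES (?, ?, ?)",
    [
        (1, "Lojinha", "1, 7, 14, 16"),
        (2, "Outra", "14, 16"),  # invented company without category 1
    ],
)

# Naive substring match: '%1%' also matches the "1" inside 14 and 16.
naive = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM empresa WHERE categorias LIKE '%1%' ORDER BY id"
    )
]

# Stricter match: turn '1, 7, 14, 16' into ',1,7,14,16,' and look for ',1,'.
strict = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM empresa "
        "WHERE ',' || REPLACE(categorias, ' ', '') || ',' "
        "      LIKE '%,' || ? || ',%' "
        "ORDER BY id",
        ("1",),
    )
]

print(naive)   # [1, 2]  (company 2 is a false positive)
print(strict)  # [1]
```

Either way the database has to rebuild and inspect the string in every row, so an ordinary index on the column does not help. MySQL offers FIND_IN_SET for comma-separated lists, but it expects the list without spaces after the commas and has the same indexing limitation.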

    
14.06.2016 / 21:25

1

You've answered your own question.

That said, you can also do counting and other kinds of searches if you keep the categories in a column of the empresa table. Whether it pays off depends a lot on what you want to do. You can almost always get by with the multi-valued column. Accessing category data in this format is more complicated, though. Nothing excessive, but someone may look at your query and not understand what it does, and you are more likely to make a mistake.
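As a rough sketch of what counting looks like with the multi-valued column, the list can be split in application code. The example is in Python rather than PHP purely for illustration; the rows and the helper names `parse_categorias` and `contar_empresas_por_categoria` are hypothetical.

```python
from collections import Counter

# Invented rows, shaped like (id, nome_empresa, categorias).
rows = [
    (1, "Lojinha", "1, 7, 14, 16"),
    (2, "Outra", "7"),
    (3, "Terceira", "7, 14"),
]


def parse_categorias(texto):
    """Turn '1, 7, 14' into {1, 7, 14}, ignoring empty pieces."""
    return {int(parte) for parte in texto.split(",") if parte.strip()}


def contar_empresas_por_categoria(rows):
    """Count how many companies list each category id."""
    contagem = Counter()
    for _id, _nome, categorias in rows:
        # A set is used so a repeated id in one row counts only once.
        contagem.update(parse_categorias(categorias))
    return contagem


print(contar_empresas_por_categoria(rows)[7])  # 3
```

It works, but the parsing rules (separators, spaces, duplicates, invalid values) now live in the application instead of being enforced by the database.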

You only really need the extra table when you access categories without going through the company. It's rare to need this, although in other situations that may not hold. You can do even that without the extra table, but having to look through all companies to find what you want can be very slow; I doubt this will happen in most situations, though, especially in the specific case described.
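The "category first" access pattern can be compared in a small sketch. It assumes SQLite (via Python's sqlite3) standing in for MySQL, invented data, a hypothetical index name `idx_rel_categoria`, and two hypothetical helpers: one reads the relationship table through an index, the other has to read every company and split its list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL
conn.executescript("""
CREATE TABLE empresa (
    id INTEGER PRIMARY KEY, nome_empresa TEXT, categorias TEXT
);
CREATE TABLE rel_categoria (
    id INTEGER PRIMARY KEY, id_empresa INTEGER, id_categoria INTEGER
);
-- Hypothetical index so lookups by category do not scan the table.
CREATE INDEX idx_rel_categoria ON rel_categoria (id_categoria, id_empresa);
""")

# Invented data, stored in both shapes for comparison.
dados = {1: [1, 7, 14, 16], 2: [7], 3: [14, 16]}
for id_empresa, cats in dados.items():
    conn.execute(
        "INSERT INTO empresa VALUES (?, ?, ?)",
        (id_empresa, f"Empresa {id_empresa}", ", ".join(map(str, cats))),
    )
    conn.executemany(
        "INSERT INTO rel_categoria (id_empresa, id_categoria) VALUES (?, ?)",
        [(id_empresa, c) for c in cats],
    )


def empresas_via_tabela(conn, id_categoria):
    """Look the category up directly in the relationship table."""
    sql = (
        "SELECT id_empresa FROM rel_categoria "
        "WHERE id_categoria = ? ORDER BY id_empresa"
    )
    return [row[0] for row in conn.execute(sql, (id_categoria,))]


def empresas_via_coluna(conn, id_categoria):
    """Read every company and split its list to find the category."""
    alvo = str(id_categoria)
    sql = "SELECT id, categorias FROM empresa ORDER BY id"
    return [
        id_empresa
        for id_empresa, cats in conn.execute(sql)
        if alvo in [parte.strip() for parte in cats.split(",")]
    ]


print(empresas_via_tabela(conn, 7), empresas_via_coluna(conn, 7))

# The query plan can be inspected to see whether the index is used.
for linha in conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id_empresa FROM rel_categoria WHERE id_categoria = ?",
    (7,),
):
    print(linha)
```

Both functions return the same companies; the difference is how much data has to be read as the empresa table grows, which may or may not matter at the scale of a given application.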

On the other hand, an extra table will almost always require a JOIN, which is usually a costly operation, and it is good to avoid it when you can do decently without it.

If you are using some software that requires one way or the other, then you have to follow its rules. Or try to drop that requirement.

Normally I would go with the column, without the extra table, at least until I had an indication that it would be bad, but I do not know your specific case.

People preach normalization a lot without analyzing the actual case. It is one thing to normalize what needs to be normalized, what makes sense for that model; it is another to force normalization where it is not needed just because someone learned that normalizing is always good.

    
14.06.2016 / 21:27