Is it really necessary to create a 3rd auxiliary table in N-N relationships?

Question

Is it really necessary to create a 3rd auxiliary table in N-N relationships?

Navigation

#1 by (12 votes)
#2 by (7 votes)

9

In several places I see that when you have a N - N relationship (if memory does not crash) it is recommended to create a 3rd helper table and then later it will be transformed into a 1 - 1 relationship ( also if memory does not fail me).

Why do you say you need to do this? Does it have to do with normalization?
Is it really necessary? What do you miss doing it? And what do you get?
Is this recommendation only for MySQL?

mysql modelagem

asked by anonymous 17.07.2017 / 16:44

2 answers

7

First of all, I will cite, from books, concepts about cardinalities N x N to later answer your questions:

According to the authors, when relations between two tables are N x N :

Leite, Mario (2008) :

In the N - N cardinality it is very difficult for the programmer to convert this semantics into code, since this is done by creating a new entity (Associative Entity) that will relate to the original entities with the cardinality N - 1 .

Costa, Rogério Luiz de C. (2006) :

When cardinality is N - N , a new entity must be created to represent the relationship [...] between the entities involved in the relationship.

Morelli, Eduardo Terra (2009 ) :

Initially all N - N relations must be broken in relation N - 1 , plus a table. This is because multiple relationships ( N - N ) in real database are impossible.

Considerations

At no time in the cited books was it explained, in fact, why we should use the creation of one more entity in N - N relationships.

However, I could see that everyone agreed that adopting a more entity approach reduces the complexity of implementing logic, both at the time of coding, and saving, holding, and extracting information from Relational DBMS .

Now about your questions:

Why do you say you need to do this? Has it to do with standardization? You do not need to add another table in N - N relationships. In my opinion it is more a matter of a better deal with the intention of reducing the complexity when it comes to coding and working the data in relational DBMSs.

Is it really necessary? What do you miss doing it? And what do you get? When creating the relationship model of a project I have as a focus / treatment or best practices, call as you want, create a system that meets the need in the simplest possible way so that maintenance is also as simple as possible.

About what you lose or win. I believe that, based on the citations, be the productivity at the time of coding. I may be wrong, but 1 - N ou N - 1 relationships are less complex to deal with. If I use the "convention" to create an entity for N - N relations that decreases my complexity, I will be gaining in productivity.

Is this recommendation only for MySQL? No.

According to Morelli, relational modeling (MER) was created to describe the data stored in tables.

That is, this convention is for any relational DBMS.

17.07.2017 / 18:08

Disconnecting case sensitive in Postgres How to crop an image to the center when uploading the same

score 12 · Accepted Answer

Why do you say you need to do this?

For earnings, see below.

Does it have to do with normalization?

Yes.

Is it really necessary?

No, but almost. It is very difficult to do right without this association table and in some cases infeasible, even if possible.

What do you miss doing this?

Need for a JOIN is the primary

And what do you get?

Query flexibility, performance, ease of maintenance, consistency, just to name the main.

Is this recommendation only for MySQL?

No.

Details

A good example is suppliers X . A supplier most likely provides multiple products and it is very common for a product to be supplied by multiple suppliers, especially at wholesale and retail level, but also in the industry when the product is commodity or has perfect replacement.

>

How can you link both? One way is to put in the product itself all the suppliers that can provide it. It may sound weird, but it often works because it's usually a few vendors. At least on tables with variable size this is not a big problem. Of course it has disadvantages. The access is not so simple, you may have to make certain contortions to get what you want, know who provides what, may have performance problems, also because it gets too much together that it is not necessary to complicate the cache and the readings that will be more frequently.

It gets much worse on the other side. If you put all the products that a supplier provides can be a monster, it is common for suppliers to have thousands of products.

Another solution is to repeat the entries, that is, to have a line for each product and supplier that supplies it. This hurts normalization , creates duplication of data, may be even worse in terms of performance, memory consumption , and juggling to get where you want.

The solution that works best is to have a mooring table where basically the relationship of supplier and product. It is small, with the right indexes it is fast access, it does not damage the cache, it does not load too much, it does not duplicate, it is not so difficult to query the data thus, it allows access from both sides in a simple way in most cases.

Of course, it loses some of the locale, requires some JOIN in most queries, you need to ensure that the data update is done correctly in this table as well, but not very different from other solutions, at least it is an operation inside of the normal relational patterns we need.

This applies to any relational database.

Think about how this works on objects in memory. You will have in the supplier a list with the products that they provide. And the product will have a list with its suppliers. Are they part of the same object? Contrary to what many people think, it does not, the list is another object. Note that you will probably have two lists. You can do the same in the database, but I see no advantage, it's hard to solve this. A table with only two indexes is best.

I'm a bit critical of using non-relational databases because the vast majority of problems are relational. When they start using non-relational technologies for this they bring problems not found when technology is relational, there to solve begin to create other technologies and methodologies to repair the inadequate choice of technology. It does not fit here, but nonrelational models are often useful in one part of the problem, rarely in the whole problem, so with relational evolution to work better with nonrelational data becomes a perfect solution. Even problems that have a feature are often tailored to fit the model, often harming the user experience, although it can simplify development a little.

So if the problem is this, you will almost always have to do this unless the problem can be solved otherwise with ease. Even if you do not need this mooring table, maybe one day you will need it and then you will have to make deep changes to the database. Few applications are prepared to work with different physical database structures, which is even a conceptual error, but there I do not know how much pragmatism has to prevail at all.