Why do you say you need to do this?
For earnings, see below.
Does it have to do with normalization?
Yes.
Is it really necessary?
No, but almost. It is very difficult to do right without this association table and in some cases infeasible, even if possible.
What do you miss doing this?
Need for a JOIN
is the primary
And what do you get?
Query flexibility, performance, ease of maintenance, consistency, just to name the main.
Is this recommendation only for MySQL?
No.
Details
A good example is suppliers X . A supplier most likely provides multiple products and it is very common for a product to be supplied by multiple suppliers, especially at wholesale and retail level, but also in the industry when the product is commodity or has perfect replacement.
>
How can you link both? One way is to put in the product itself all the suppliers that can provide it. It may sound weird, but it often works because it's usually a few vendors. At least on tables with variable size this is not a big problem. Of course it has disadvantages. The access is not so simple, you may have to make certain contortions to get what you want, know who provides what, may have performance problems, also because it gets too much together that it is not necessary to complicate the cache and the readings that will be more frequently.
It gets much worse on the other side. If you put all the products that a supplier provides can be a monster, it is common for suppliers to have thousands of products.
Another solution is to repeat the entries, that is, to have a line for each product and supplier that supplies it. This hurts normalization , creates duplication of data, may be even worse in terms of performance, memory consumption , and juggling to get where you want.
The solution that works best is to have a mooring table where basically the relationship of supplier and product. It is small, with the right indexes it is fast access, it does not damage the cache, it does not load too much, it does not duplicate, it is not so difficult to query the data thus, it allows access from both sides in a simple way in most cases.
Of course, it loses some of the locale, requires some JOIN
in most queries, you need to ensure that the data update is done correctly in this table as well, but not very different from other solutions, at least it is an operation inside of the normal relational patterns we need.
This applies to any relational database.
Think about how this works on objects in memory. You will have in the supplier a list with the products that they provide. And the product will have a list with its suppliers. Are they part of the same object? Contrary to what many people think, it does not, the list is another object. Note that you will probably have two lists. You can do the same in the database, but I see no advantage, it's hard to solve this. A table with only two indexes is best.
I'm a bit critical of using non-relational databases because the vast majority of problems are relational. When they start using non-relational technologies for this they bring problems not found when technology is relational, there to solve begin to create other technologies and methodologies to repair the inadequate choice of technology. It does not fit here, but nonrelational models are often useful in one part of the problem, rarely in the whole problem, so with relational evolution to work better with nonrelational data becomes a perfect solution. Even problems that have a feature are often tailored to fit the model, often harming the user experience, although it can simplify development a little.
So if the problem is this, you will almost always have to do this unless the problem can be solved otherwise with ease. Even if you do not need this mooring table, maybe one day you will need it and then you will have to make deep changes to the database. Few applications are prepared to work with different physical database structures, which is even a conceptual error, but there I do not know how much pragmatism has to prevail at all.