Subqueries can decrease performance? Myth or truth?

14

Well, I usually only work with frameworks . I work with frameworks MVC and I usually use the ORMs to query the database.

In my case I use Laravel, but I've used other frameworks , and had the recent opportunity to get to know the Entity Framework.

Some users who are experienced in using the Entity Framework have shown me that the ORM itself generates a query SQL that is different from the one we are accustomed to.

In the same way I've already seen this in the framework Laravel.

What I mean is that in some cases these frameworks usually generate some queries with subqueries , instead of using JOIN .

For example, I've already seen a query that, to identify if a user has a publication, was being generated like this:

select * from 
    usuarios 
where 
    (select count(*) from 'posts' where 'usuarios'.'id' = 'posts'.'usuario_id') >= 1

But I believe the same query could be done with JOIN.

Well, now I explain why I'm talking about frameworks before getting into the subject. It is because I have heard several comments from the mouth of "programmers", who say subqueries causes performance problem. But what I can not understand is why the frameworks would then insist on doing something that is harmful (slowing performance is harmful, I meant in that context).

However, if I see it being used in the frameworks , it soon comes to mind that it "has no problem" to use subqueries . And without a problem, you would not understand why the frameworks developers, who would theoretically have enough experience to know what they are doing.

I'm an ISSUE in Github, where someone was reporting the performance problem because of the generated subqueries, I noticed that the library developer asked, "But does your tables contain the indexes set correctly?" , which led me to understand that the problem may sometimes be more in the wrong way than using subquery .

So, based on all my arguments above, I ask:

    Is it true that setting indexes for a table can improve the performance of a subquery or is it rubbish?

asked by anonymous 01.07.2016 / 21:37

3 answers

12
  

Subqueries always cause performance problems, or can this problem occur depending on a specific context?

"Always" is a very heavy expression, it depends a lot on the case. And not only the generated code, but also depends on the database optimizer, how the execution takes place, and of course, the dataset and model that is working. And one more thing I say below.

You have to analyze each case, you always have to measure it.

  

Is it true that setting indexes for a table can improve the performance of a SubQuery or is it nonsense?

True, indexes can be the salvation of the crop if they are set correctly. This can be the factor that really will make a difference. They can not save everything and may bring some minor harms as well (updates get slower, there is more disk space and memory consumption, etc.). Again it depends on what you are doing and the vendor used.

  

Does the query show above, in the example, if it were done otherwise (such as using JOIN), would it improve performance?

This example specifically I do not know to say, but in more common examples I can say that it depends on the database and how would be JOIN , in some it may be that there is some optimization (common good). In others JOIN is only syntactic sugar for the common expression.

    
01.07.2016 / 22:01
6
  

Can SubQueries slow performance?

In summary, they can. But it's not always like this.

  

Subqueries always cause performance problem, or can this problem occur depending on a specific context?

No. It all depends on the query engine of the database and the statistics compiled upon the effort of the query. Indexes can indeed undermine performance if poorly formulated, but the highest performance penalty usually comes from a poor design database schema.

If the database engine supports running subqueries in parallel, the result may even be faster than running joins . There is a lot of questioning of this even in SO . What adequately responds to this is the study of the consultation plan. Each database has its way of getting this study.

  

Is it true that setting indexes for a table can improve the performance of a SubQuery or is it nonsense?

True, depending on the bank we're talking about. In Oracle, for example, a TABLE SCAN may be faster than an index search .

Again, it's worth applying the index and studying the sentence.

  

The query shown above, in the example, if it were done otherwise (such as using JOIN), would it improve performance?

And again, it depends. In SQL Server, for example, usually JOINS are faster than subqueries , but this may vary.

Here is an article on how to get your query execution plan in SQL Server .

    
01.07.2016 / 21:58
5

Wallace, this depends a lot on the database engine, taking the following two queries as an example:

--Query A
SELECT 
    A.TabelaAID,
    A.Descricao as DescricaoA
    (SELECT Descricao FROM TabelaB B WHERE A.TabelaBID = B.TabelaBID) as DescricaoB,
    (SELECT Descricao FROM TabelaC C WHERE A.TabelaBID = C.TabelaBID) as DescricaoC,
    (SELECT Descricao FROM TabelaD D WHERE A.TabelaBID = D.TabelaBID) as DescricaoD
FROM TabelaA A

--Query B
SELECT 
    A.TabelaAID,
    A.Descricao as DescricaoA
    B.Descricao as DescricaoB,
    C.Descricao as DescricaoC,
    D.Descricao as DescricaoD
FROM TabelaA A
LEFT JOIN TabelaB B ON A.TabelaBID = B.TabelaBID
LEFT JOIN TabelaC C ON A.TabelaBID = C.TabelaBID
LEFT JOIN TabelaD D ON A.TabelaBID = D.TabelaBID

No SqlServer 2005 to Query A was exponentially slower than Query B , all because the engine could not translate subquery to join . as a result, if TableA had 1000 records, Query A would execute 3001 queries and Query B only 1.

No SqlServer 2014 , there are still some performance differences, but nothing extraordinary, depending on the query subquery will be even faster than join .

Now regarding your example, I think it would perform better if you use EXISTS(SELECT 1 FROM post WHERE post.usuario_id = usuarios.id) instead of COUNT with * .

The following article has a great comparison that involves a particular type of subquery, which is very similar to its example:

Should I use NOT IN, OUTER APPLY, LEFT OUTER JOIN, EXCEPT, or NOT EXISTS

    
01.07.2016 / 22:02