Not IN or Not EXISTS to use?

42

I have seen some answers here with Not In and some using Not Exists to answer the same question.

I do not know the difference between both and would like to know about:

  • Which of the two is best when considering performance?
  • On what occasions should I use both?

If possible of examples with Query!

    
asked by anonymous 08.05.2015 / 16:08

3 answers

38

This will depend on the situation. The not in is recommended when you already have the values and passes as a parameter, for example:

select * from tbl where status_id not in (1,3,7)

Now if you are going to make a subselect I recommend using not exists, because in the first result that it finds it will already validate and move to the next record.

select * from user u where not exists (select 1 from player p where p.user_id = u.id limit 1)

Since I do not need the fields in the player table, return 1 is faster than *. The limit 1 (changes the way of writing according to the database) will cause that when finding 1 record already is enough to say that it exists. Update: In some situations limit 1 may leave faster according to this response .

I hope it has become clear.

Performance: As Paul quoted in the answer below, not exists allows you to use index, which in tables with many records can greatly improve performance against not in.

    
08.05.2015 / 16:20
10

Complementing the above answer. The goal is now to find all employees who are not managers. Let's see how we can achieve this using NOT IN and NOT EXISTS .

NOT IN

SQL> select count(*) from emp_master where emp_nbr not in ( select mgr_nbr from emp_master );
COUNT(*)
———-
0

NOT EXISTS

SQL> select count(*) from emp_master T1 where not exists ( select 1 from emp_master T2 where t2.mgr_nbr = t1.emp_nbr );

COUNT(*)
———-
9

Now there are 9 people who are not managers. So you can clearly see the difference that NULL values make since NULL! = NULL in SQL, the NOT IN clause does not return any registro back.

Performance Implications:

When using NOT IN , the query performs nested table scans, whereas for NOT EXISTS , the query can use an index within the sub-query.

    
11.05.2015 / 13:55
5

EXISTS is more performative than IN , but it is not a good practice, it depends on your need ....

The in "creates" a structure of or's in its sql condition, exp:

select * from tabela1
where idtabela in (select idtabela from tabela2 where condicao = 2)

Let's say there are 5 conditions = 2 of idtabela in this table2, bd would interpret in as:

select * from tabela1 t
inner join tabela2 t2 on t2.idtabela = t.idtabela
where idtabela = 1 or idtabela = 2 or idtabela = 3 or idtabela = 4 or idtabela = 5

Now when using exists bd "forces" a relationship between the two tables Getting more or less like this:

select * from tabela1 t
inner join tabela2 t2 on t2.idtabela = t.idtabela
where condicao = 2
    
11.05.2015 / 15:02