Return all equal items from different groups

20
-------------------
- TABLE           -
-------------------
ID | GRUPO | OBJETO
---|-------|-------
1  | 1     | 1
2  | 1     | 2
   |       |
3  | 2     | 1
4  | 2     | 2
   |       |
5  | 3     | 1
6  | 3     | 2
7  | 3     | 3
   |       |
5  | 4     | 1
6  | 4     | 3
   |       |
7  | 5     | 1

I need to make a query to return all groups that contain exactly the same objects . I've already used subquery with IN but the objects and 2 / em>.

For example, if I query using where GRUPO = 1 , I need to return only a total of two records (group 1 and group 2).

The objects in group 1 are also in group 2. But group 3 contains an object in addition, group 4 contains two more objects one of them is different from that of group 1, and group 5 contains only one object; therefore, groups 3, 4 and 5 should not be considered.

There are some relationships but they do not make much difference in the case.

  • Group 1 bought: orange and apple
  • Group 2 bought: orange and apple
  • Group 3 bought: orange, apple and banana

If I query for GRUPO = 1 , I need to return ONLY and ONLY who else bought JUST orange and apple >. I do not want to know what orange and apple groups are, so IN does not work.

    
asked by anonymous 03.09.2015 / 09:53

4 answers

23

I think the following solution returns the result you want.

This query returns all groups that have exactly the same elements as group 1.

SELECT T.GRUPO
FROM   TESTE T
LEFT JOIN 
(
  SELECT DISTINCT T1.OBJECTO,
         (SELECT COUNT(DISTINCT T2.OBJECTO) 
            FROM TESTE T2
           WHERE T2.GRUPO = T1.GRUPO) TOTAL
  FROM TESTE T1
  WHERE T1.GRUPO = 1   --filtras aqui o grupo
) SS
  ON SS.OBJECTO = T.OBJECTO
GROUP BY T.GRUPO
HAVING COUNT(DISTINCT T.OBJECTO) = COUNT(DISTINCT CASE WHEN SS.OBJECTO IS NOT NULL THEN SS.OBJECTO END)
   AND COUNT(DISTINCT T.OBJECTO) = MAX(TOTAL)

The logic behind this query is as follows:

A subquery

SELECT DISTINCT T1.OBJECTO,
       (SELECT COUNT(DISTINCT T2.OBJECTO) 
          FROM TESTE T2
         WHERE T2.GRUPO = T1.GRUPO) TOTAL
FROM TESTE T1
WHERE T1.GRUPO = 1

aims to return all objects in group 1. This is the basis for the process. Initially, this statement only returned the objects, it was necessary to change to return the total number of objects as well.

With this result we have all that is necessary to look for the remaining groups that have exactly the same elements. This is done through the following instructions:

HAVING COUNT(DISTINCT T.OBJECTO) = COUNT(DISTINCT CASE WHEN SS.OBJECTO IS NOT NULL THEN SS.OBJECTO END)
AND COUNT(DISTINCT T.OBJECTO) = MAX(TOTAL)

The first one means, "all elements of the group (COUNT (DISTINCT T.OBJECTO)) must be in group 1 (COUNT (DISTINCT CASE WHEN SS.JECT IS NOT NULL THEN SS.OBJECTO END)) "

Finally, the statement COUNT (DISTINCT T.OBJECTO) = MAX (TOTAL) ensures that the group must have the same number of elements as group 1. This statement is key to excluding the groups that, as in your example, have only orange. Orange is in group 1 but is fantasizing about the apple.

MAX is required because in the HAVING statement we can only use aggregate functions or constants.

If you want to get the objects for each of the groups (the objects will always be the same), just use the previous query to filter the results, for example

SELECT T.GRUPO,
       T.OBJECTO
FROM   TESTE T
INNER JOIN
(
    SELECT T.GRUPO
    FROM   TESTE T
    LEFT JOIN 
    (
      SELECT DISTINCT T1.OBJECTO,
            (SELECT COUNT(DISTINCT T2.OBJECTO) 
               FROM TESTE T2
              WHERE T2.GRUPO = T1.GRUPO) TOTAL
      FROM TESTE T1
     WHERE T1.GRUPO = 1    --filtras aqui o grupo
    ) SS
      ON SS.OBJECTO = T.OBJECTO
    GROUP BY T.GRUPO
    HAVING COUNT(DISTINCT T.OBJECTO) = COUNT(DISTINCT CASE WHEN SS.OBJECTO IS NOT NULL THEN SS.OBJECTO END)
       AND COUNT(DISTINCT T.OBJECTO) = MAX(TOTAL)
) X
  ON T.GRUPO = X.GRUPO
ORDER BY 1, 2

Here's also SQLFiddle

    
04.09.2015 / 00:09
5

Solution:

select A.GRUPO
from TABELA A join TABELA B on B.OBJETO = A.OBJETO
where B.GRUPO = 1
group by A.GRUPO
having count(A.OBJETO) = (select count(OBJETO) from TABELA where GRUPO = 1)
and count(A.OBJETO) = (select count(OBJETO) from TABELA where GRUPO = A.GRUPO)

This query works and seems pretty simple to me.

Explaining the line of reasoning:

With the first part of the query (from select to where ) I return records that have at least one of the objects in group 1.

For a better view, see this first part of the slightly modified query:

select A.ID, A.GRUPO, A.OBJETO
from TABELA A join TABELA B on B.OBJETO = A.OBJETO
where B.GRUPO = 1

The result of this query would be:

-------------------
ID | GRUPO | OBJETO
---|-------|-------
1  |  1    |    1
2  |  1    |    2
   |       |
3  |  2    |    1
4  |  2    |    2
   |       |
5  |  3    |    1
6  |  3    |    2
-7 |  3    |    3
   |       |
8  |  4    |    1
-9 |  4    |    3
   |       |
10 |  5    |    1

The records marked - ) are left out of the result since their object does not match any of the objects in group 1.

Then I see that until now I only return the groups that have some object of group 1, and also their respective object.

Let's now visualize the grouping of this:

select A.GRUPO, count(A.OBJETO)
from TABELA A join TABELA B on B.OBJETO = A.OBJETO
where B.GRUPO = 1
group by A.GRUPO

Result:

---------------------
GRUPO | COUNT(OBJETO)
------|--------------
  1   |     2
  2   |     2
  3   |     2
  4   |     1
  5   |     1

Now I'm going to filter to keep only the groups whose aggregation of objects is equal to the total of objects in group 1, by adding the having clause:

select A.GRUPO, count(A.OBJETO)
from TABELA A join TABELA B on B.OBJETO = A.OBJETO
where B.GRUPO = 1
group by A.GRUPO
-- novo:
having count(A.OBJETO) = (select count(OBJETO) from TABELA where GRUPO = 1) 

Result

---------------------
GRUPO | COUNT(OBJETO)
------|--------------
  1   |     2
  2   |     2
  3   |     2

Group 3 has to quit because although its count has been set to 2 (because it actually has two objects matching the objects in group 1), actually it has 3 objects in the base (one of them does not match any objects in group 1 and so was left out of count ).

So I add one more condition in having , determining that only groups whose total objects on the base are equal to the total of objects in group 1 are kept:

select A.GRUPO, count(A.OBJETO)
from TABELA A join TABELA B on B.OBJETO = A.OBJETO
where B.GRUPO = 1
group by A.GRUPO
having count(A.OBJETO) = (select count(OBJETO) from TABELA where GRUPO = 1)
-- novo:
and count(A.OBJETO) = (select count(OBJETO) from TABELA where GRUPO = A.GRUPO) 

Finally, I remove count from select because I do not want it in the result view, and I get to the query that was presented at the beginning of the answer. >

See the SQL Fiddle .

    
09.09.2015 / 18:45
3

The only way I know how to do is with GROUP_CONCAT , and since it is not possible to join with calculated fields will require two subqueries (same, you can even create a view ):

SELECT t2.*
FROM (
    SELECT grupo
    , GROUP_CONCAT(objeto ORDER BY objeto) objetos
    FROM tabela GROUP BY grupo
) t1 INNER JOIN (
    SELECT grupo
    , GROUP_CONCAT(objeto ORDER BY objeto) objetos
    FROM tabela GROUP BY grupo
) t2 on t1.objetos = t2.objetos
WHERE t1.grupo = 1

I did a SQL Fiddle demonstrating how Group 3 works with 3 objects.

    
03.09.2015 / 21:36
-6

Follow SQL

SELECT count(distinct grupo,objecto) as tabela FROM tabela WHERE objecto in (1,2) and grupo IN (1,2) group by objecto.

Make sure this is what you want. Next time try to be clearer.

    
03.09.2015 / 12:57