How to export records in SQL without duplicates?

5

I'm trying to remove some duplicate records from a table. I searched the Internet for how to do this and found something about DISTINCT.

My scenario is:

I have a table in which some records are duplicated in every column.

ID | Nome   | Idade
1  | Teste  | 20
1  | Teste  | 20
1  | Teste  | 20
2  | Teste2 | 28

Now I'm trying to export to a temporary table with DISTINCT. When I select only the ID in my query, it works and exports without the duplicates:

SELECT DISTINCT t.ID
INTO Temp
FROM tabela t

However, if I do:

SELECT DISTINCT t.*
INTO Temp
FROM tabela t

It exports everything, duplicates included.

How can I export all the records without duplicates, with or without DISTINCT?

    
asked by anonymous 19.11.2014 / 19:28

3 answers

1

I am "stealing" an answer I found on the English SO ;)

You must group the records. I think we can ignore the repeated IDs (let me know if that doesn't work for you). So we'll group by all fields except the ID field. Something like:

SELECT MIN(ID) as ID, Nome, Idade 
FROM Temp
GROUP BY Nome, Idade

Note that you can use MAX instead of MIN. What matters is getting a single ID for each group of duplicate records.

In practice, the query returns each distinct combination of values once. For each group of repeated records, only the smallest ID (or the largest, if you use MAX instead of MIN) is returned.
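
To see the grouping in action, here is a minimal sketch that runs the same query from Python. It assumes an in-memory SQLite database as a stand-in for the real server, and the sample rows (with different IDs per duplicate) are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Temp (ID INTEGER, Nome TEXT, Idade INTEGER)")
conn.executemany(
    "INSERT INTO Temp VALUES (?, ?, ?)",
    # Hypothetical sample data: three duplicates of 'Teste' with different IDs.
    [(1, "Teste", 20), (3, "Teste", 20), (5, "Teste", 20), (2, "Teste2", 28)],
)

# One row per (Nome, Idade) group, carrying the smallest ID of that group.
deduplicated = conn.execute(
    "SELECT MIN(ID) AS ID, Nome, Idade "
    "FROM Temp "
    "GROUP BY Nome, Idade "
    "ORDER BY 1"
).fetchall()

print(deduplicated)
```

With this sample data, the three 'Teste' rows collapse into a single row with ID 1.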

When you have these results, you have two alternatives to fulfill your goal:

  • You can export the result of this query. You will only have the distinct data in the export result;

  • The least recommended way is to delete all repeated records from the table. This requires courage, since any error can delete data other than the one you want to delete. I recommend having a backup if you want to go this way.

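The command is something like the following:

-- Assumes SQL Server (T-SQL) syntax; "DELETE *" is not valid there.
-- Only removes rows when the duplicates have different IDs.
DELETE Temp
FROM Temp
LEFT OUTER JOIN (
    -- One ID to keep per group of identical (Nome, Idade) values
    SELECT MIN(ID) AS ID, Nome, Idade
    FROM Temp
    GROUP BY Nome, Idade
) AS RegistosAManter ON
    Temp.ID = RegistosAManter.ID
WHERE
    RegistosAManter.ID IS NULL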

The FROM clause retrieves every record in the table. Each record's ID appears twice (because we are using a JOIN), but on the right side of the result the IDs of the repeated records are NULL. The DELETE command removes exactly those records from the table.
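
As a rough way to try the delete safely before touching real data, here is a sketch against an in-memory SQLite database (an assumption, not the original server). SQLite does not accept a JOIN in DELETE, so the sketch swaps in an equivalent NOT IN subquery. The sample rows are hypothetical and use different IDs per duplicate, which is the case this approach needs.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Temp (ID INTEGER, Nome TEXT, Idade INTEGER)")
conn.executemany(
    "INSERT INTO Temp VALUES (?, ?, ?)",
    # Hypothetical sample data: duplicates of 'Teste' with different IDs.
    [(1, "Teste", 20), (3, "Teste", 20), (5, "Teste", 20), (2, "Teste2", 28)],
)

# Delete every row whose ID is not the smallest ID of its (Nome, Idade) group.
# This NOT IN form replaces the LEFT JOIN ... IS NULL form, which SQLite lacks.
cursor = conn.execute(
    "DELETE FROM Temp WHERE ID NOT IN ("
    "  SELECT MIN(ID) FROM Temp GROUP BY Nome, Idade"
    ")"
)
deleted_count = cursor.rowcount

remaining = conn.execute(
    "SELECT ID, Nome, Idade FROM Temp ORDER BY ID"
).fetchall()

print(deleted_count, remaining)
```

If the duplicates share the same ID, as in the question's sample table, no ID is ever missing from the kept set and nothing is deleted, so exporting the grouped result is the safer route there.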

    
19.11.2014 / 20:21
1

You can use the INTERSECT operator, which removes duplicate rows from the final result.

SELECT *
FROM   SuaTabela
WHERE  ColunaDesejada BETWEEN 1 AND 100

INTERSECT

SELECT *
FROM   SuaTabela
WHERE  ColunaDesejada BETWEEN 50 AND 200;
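
A small sketch of the deduplicating effect, assuming an in-memory SQLite database and the question's sample rows. Intersecting the table with itself is my own simplification for illustration, not the exact query above.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tabela (ID INTEGER, Nome TEXT, Idade INTEGER)")
conn.executemany(
    "INSERT INTO tabela VALUES (?, ?, ?)",
    # Sample rows from the question: three fully identical records plus one.
    [(1, "Teste", 20), (1, "Teste", 20), (1, "Teste", 20), (2, "Teste2", 28)],
)

# INTERSECT returns only distinct rows present in both inputs, so
# intersecting the table with itself collapses fully identical records.
rows = conn.execute(
    "SELECT ID, Nome, Idade FROM tabela "
    "INTERSECT "
    "SELECT ID, Nome, Idade FROM tabela "
    "ORDER BY 1"
).fetchall()

print(rows)
```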

Hope this helps.

    
22.11.2014 / 11:07
0

First, I believe your model, even if it is only for testing, should have a primary key, which would not allow duplicate values in the ID column. But OK, let's treat this as a test.

In this case, I believe you can only solve it with GROUP BY.

create table t (ID int, Nome varchar(20), Idade int)
go

insert into t values 
(1, 'Teste', 20),
(1, 'Teste', 20),
(1, 'Teste', 20),
(2, 'Teste', 28)
go

select t.id, t.nome, t.idade , count(1)
from t
group by t.id, t.nome, t.idade 
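
For anyone who wants to check the result without a server, here is a sketch that runs the same statements in an in-memory SQLite database (an assumption). The `total` alias is added only for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, Nome TEXT, Idade INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "Teste", 20), (1, "Teste", 20), (1, "Teste", 20), (2, "Teste", 28)],
)

# Grouping by every column yields one row per distinct record,
# and COUNT shows how many copies of each record existed.
grouped = conn.execute(
    "SELECT t.ID, t.Nome, t.Idade, COUNT(*) AS total "
    "FROM t "
    "GROUP BY t.ID, t.Nome, t.Idade "
    "ORDER BY t.ID"
).fetchall()

print(grouped)
```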
    
20.11.2014 / 13:14