How to remove duplicate lines efficiently?

7

I'm normalizing a table here and found that there are duplicate rows.

The way I chose to clean these lines is through:

CREATE TABLE tabela_nova AS ( 
    SELECT DISTINCT * FROM tabela_antiga
);

The problem this way is that it is very slow and depending on the size of the bank, it takes more than a day.

Would there be a more efficient way to eliminate duplicate rows?

    
asked by anonymous 10.06.2015 / 01:56

1 answer

12

Assuming you use the MySql database and have a table named "names", and that table has the id and name fields. This syntax can be used in other banks, we will use the following command:

DELETE a FROM nomes AS a, nomes AS b WHERE a.nome=b.nome AND a.id < b.id

Notice that in the sql command after FROM I double-click the "names" table, but I differentiate them by the letters a and b. You could give whatever name you wanted. Note also that after the WHERE I make the comparison between the columns, checking the duplicity and then say that the id of "a" should be less than "b". This way MySql will compare all records with the same name and delete those with the lowest id.

names: The table with the duplicate records. name: This is the field for comparing records. id: This is the primary key of the table.

See how it works:

Tables with duplicate records

Applyingthescriptdescribedabove:

If you want to delete all duplicate records, leaving only the unique records, just change the "

10.06.2015 / 02:08