Erase everything but the smallest per day

4

I have a space problem and there are a lot of useless things being stored, but they are useful during the same day (you will understand).

Actual problem:

There are more than 2 million "lines" and this is taking up more than 1.1GB, I am using a high availability server, in short, this does not allow changing capacity easily,

Solution to the problem:

Delete data that is not the smallest based on the day, but the date is DATETIME and must match DATE. This process would occur every day, to delete the data from the previous day (but as there are already data, would have to delete all the old ones as well.)

Explanation of how it works:

The site saves several data in a product several times a day. Imagine the product with the id "999".

This data ("duplicates") is useful for the same day (ie all data today must exist), so I can not simply check if there is already a larger one today and prevent the insertion, because it must have this "duplication".

  

There is:

      id | Prod | Preco | Data
(id auto) | 999 | 99.99 | 2015-01-25 12:31:00
(id auto) | 999 | 89.99 | 2015-01-25 11:00:00
(id auto) | 999 | 78.99 | 2015-01-25 03:37:00
(id auto) | 999 | 98.99 | 2015-01-24 11:30:00
(id auto) | 999 | 74.99 | 2015-01-24 17:37:00
(id auto) | 999 | 79.99 | 2015-01-23 02:39:00
  

I wish there was only:

      id | Prod | Preco | Data
(id auto) | 999 | 78.99 | 2015-01-25 03:37:00
(id auto) | 999 | 74.99 | 2015-01-24 17:37:00
(id auto) | 999 | 79.99 | 2015-01-23 02:39:00

What I tried to think:

I need something that selects everything that is not the smallest that is on the same day.

This is extremely strange, but the real idea says:

$PegaIDMenor = SELECT id FROM tabela WHERE Preco = (SELECT MIN(Preco) FROM tabela WHERE Data LIKE '2015-01-25%' AND Prod = '999') AND Data LIKE '2015-01-25%' AND Prod = '999' LIMIT 1

// Exemplo apenas, logico que terá que existir funções de query!

$EliminaMaiores = DELETE FROM tabela WHERE id != ' $id ' AND Data LIKE '2015-01-25%' AND Prod = '999'

I thought the date would change by looping and checking, but I believe there should be some better function, only using MySQL to perform the same task, as well as better performance.

I think it's clear what I need. Remembering, I do not need SELECT but DELETE!

    
asked by anonymous 26.01.2016 / 15:48

2 answers

2

I do not know how much to perform, but sql will delete all prices that are not the lowest price of each product on smaller days than today. Some where they should improve to performasse.

DELETE 'precos'
FROM 'precos' 
left join ( 
    SELECT max(id) id
    from precos
    inner join (
        SELECT date(Data) d, Prod pid, min(Preco) p FROM 'precos' 
        WHERE 1
        group by date(Data), Prod
    ) mp on date(precos.data) = mp.d and precos.Prod = mp.pid and mp.p = precos.Preco
    group by date(Data), Prod
) save_data on save_data.id = precos.id
where 
save_data.id is null
and date(Data) < CURDATE();

I recommend not running in production before some hahaha testing. Good luck

    
27.01.2016 / 05:49
1

I suggest you do the following.

Create a table to assist in processing with the select below

select produto, date(datetime), min(preco) from tabela group by produto, date(datetime)

In this way, you will have in a table all the prices you want to stay Then, do a left to delete

delete pc from preco pc left join auxiliar aux on pc.produto = aux.produto and date(pc.datetime)=aux.date and pc.preco = aux.preco where aux.produto is null

When you use the left command, anything unrelated to the other table will be null. This way, when you use where the auxiliary table product is null, you remove all rows that do not reference.

Before executing DELETE, do the SELECT and see if the result is the lines that you really want to delete

    
27.01.2016 / 15:47