SQL code optimization

12

How can I optimize the following code so I do not use 3 SELECTs and do not match the query to just 3 status

(
    SELECT * 
    FROM historico 
    WHERE 
        his_status = 'FRACASSO' 
    ORDER BY his_data DESC 
    LIMIT 50
)
UNION ALL
(
    SELECT * 
    FROM historico 
    WHERE 
        his_status = 'REAVALIAR' 
    ORDER BY his_data DESC 
    LIMIT 50
)
UNION ALL
(
    SELECT * 
    FROM historico 
    WHERE 
        his_status = 'SUCESSO' 
    ORDER BY his_data DESC 
    LIMIT 50
)

Currently I have 2562 records in the historical table and I'm using all of them to train an RNA. It costs 3 to 4 minutes. This query will enable the user to enter how many samples of each status you want to use. Amount 50 is an example. The optimization of the query is also to reduce the cost of time.

    
asked by anonymous 11.11.2017 / 15:58

2 answers

10

Here is an alternative that can bring up to 50 records of each status sorted by date. In the event of a tie between two records the result is stripped by id (the lowest id prevails).

SELECT h.*
FROM historico AS h LEFT JOIN historico AS h2
    ON h.his_status = h2.his_status 
       AND (h.his_data < h2.his_data OR (h.his_data = h2.his_data AND h.id > h2.id))
WHERE h.his_status IN ('FRACASSO', 'REAVALIAR', 'SUCESSO')
GROUP BY h.id
HAVING COUNT(*) < 50
ORDER BY h.his_status, h.his_data DESC, h.id;

See working in DB Fiddle

This version is more compact. You can filter as many% as you want%. You can even remove the status clause to return the first 50 results of all where of your table.

I can not compare the performance of this solution with the original version with status . This has to be done with your complete , with the exact same structure of tables and indexes, in the same version of MySQL with the same settings. Having said that the link above shows the execution plan for a small amount of data in the absence of any index beyond that created implicitly by the PK, using MariaDB 10.2. Under these conditions the original version is scanning the entire table for each% desired%. The above version, although it also sweeps the entire table twice and uses a temporary table, does so regardless of the amount of UNION ALL in the filter.

    
11.11.2017 / 23:00
10

You can do it this way. It will return only those records with a "line" less than or equal to 50 (that is, 50 records at most of each type):

SELECT *
FROM 
(SELECT *,
    (@numero:=IF(@status = 'his_status',
     @numero+1,
     IF(@status := 'his_status', 1, 1))) linha 
  FROM historico a
  CROSS JOIN (SELECT @numero:=0) b
  ORDER BY his_status, his_data DESC
) AS c 
WHERE c.linha <= 50;

SQLFiddle by referral from friend @Everson in comments: link

    
11.11.2017 / 18:20