Best DB structure with large number of data and filterable columns

5

I have a Mysql database with a table that is taking a long time to complete a query. I would like to know what would be the best database structure indicated for a table with many fields that are filterable, have more than 5 million records and is fed constantly.

All filterable fields are indexed.

The VM in which the database is has the attributes:

Sistema Operacional CentOS 6.X 64 bits
Mysql 5.7
CPU 2 x 2 Cores
6 GB Memória

Currently the structure that exists is as follows:

 CREATE TABLE IF NOT EXISTS 'vendas' (
  'identificador' int(11) NOT NULL,
  'doc' varchar(20) COLLATE utf8_unicode_ci NOT NULL,
  'emissao' date NOT NULL,
  'recno' varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
  'filial' int(11) DEFAULT NULL,
  'filial_origem' varchar(5) COLLATE utf8_unicode_ci DEFAULT NULL,
  'cliente' int(11) NOT NULL,
  'dap' bigint(15) DEFAULT NULL,
  'referencia' varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
  'grupo' int(11) DEFAULT NULL,
  'prefixo' varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
  'descricao' varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  'data_hora' varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
  'quantidade' int(11) DEFAULT NULL,
  'valor_unitario' double DEFAULT NULL,
  'total' double DEFAULT NULL,
  'desconto' double DEFAULT NULL,
  'custo_unitario' double DEFAULT NULL,
  'custo_medio' double DEFAULT NULL,
  'pedido' varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
  'posicao' varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
  'tabela' int(11) DEFAULT NULL,
  'atividade' int(11) DEFAULT NULL,
  'risco' varchar(5) COLLATE utf8_unicode_ci DEFAULT NULL,
  'uf' varchar(3) COLLATE utf8_unicode_ci DEFAULT NULL,
  'municipio' varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
  'regiao' varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
  'fornecedor' int(11) DEFAULT NULL,
  'vendedor' int(11) DEFAULT NULL,
  'vendedor2' varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
  'aplicador' int(11) DEFAULT NULL,
  'tipo' varchar(1) COLLATE utf8_unicode_ci DEFAULT NULL,
  'motivo' varchar(5) COLLATE utf8_unicode_ci DEFAULT NULL,
  'pm' double(15,2) DEFAULT NULL,
  PRIMARY KEY ('identificador','emissao'),
  KEY 'filial' ('filial'),
  KEY 'dap' ('dap'),
  KEY 'referencia' ('referencia'),
  KEY 'grupo' ('grupo'),
  KEY 'tabela' ('tabela'),
  KEY 'atividade' ('atividade'),
  KEY 'municipio' ('municipio'),
  KEY 'fornecedor' ('fornecedor'),
  KEY 'vendedor' ('vendedor'),
  KEY 'aplicador' ('aplicador'),
  KEY 'flaged' ('flaged'),
  KEY 'tipo' ('tipo'),
  KEY 'cliente' ('cliente'),
  KEY 'emissao' ('emissao')
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

This query example is taking about 11 seconds, which is unsatisfactory:

select CONCAT_WS("-",YEAR(emissao),LPAD(MONTH(emissao),2,'0')),
sum(total) as faturamento
from 'vendas' as 'operacao' 
where ('emissao' >= '2009-01-01')
group by 1
order by 1

What could I improve on this structure so that queries like the one in the example bring results with less than 1 second?

------------------ UPDATE

Query result below (with 'issuance' partitioning per annual period):

explain select CONCAT_WS("-",YEAR(emissao),LPAD(MONTH(emissao),2,'0')),
sum(total) as faturamento
from 'vendas' as 'operacao' 
where ('emissao' >= '2009-01-01')
group by 1
order by 1

Theresultofthequeryshowindexfromvendas:

    
asked by anonymous 14.11.2016 / 12:46

1 answer

1

As far as I can see from the output of the EXPLAIN command, your query is not using any index since you are forcing the use of functions for both sorting and grouping.

An alternative to your query would be:

SELECT 
  CONCAT_WS("-",YEAR(emissao), LPAD(MONTH(emissao),2,'0')),
  SUM(total) AS faturamento
FROM 'vendas' AS 'operacao' 
WHERE ('emissao' >= '2009-01-01')
GROUP BY emissao

This query will group the records by day, generating a result a little different from what you have. If you want to group by month and year as your query already did, use this query:

SELECT 
  CONCAT_WS("-",YEAR(emissao), LPAD(MONTH(emissao),2,'0')),
  SUM(total) AS faturamento
FROM 'vendas' AS 'operacao' 
WHERE ('emissao' >= '2009-01-01')
GROUP BY YEAR(emissao), MONTH(emissao)

Note 1 : Instead of using the number 1, indicating the first field of the query, I am using the field name, ensuring index usage.

Note 2 : Since you are using the GROUP BY command in the emissao column, there is no need to use the ORDER BY command as the collation already guarantees the order of the records. p>     

14.02.2017 / 12:01