Performance in MySQL with InnoDB on a large data table

5

I currently have a table with about 6 million records that undergoes a lot of I/O operations, so when designing the project I chose InnoDB instead of MyISAM in MySQL; after all, locking would be per row, not per table.

But I have a big problem: the majority of the queries made on this table filter by a date period. Naturally, I tried to partition it, but I came across an InnoDB limitation.

What do you suggest to improve the performance of these queries, given that I have severe hardware limitations?

The table structure is shown below.

  CREATE TABLE `sensores` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `equipamento_id` int(11) NOT NULL,
    `data_hora` datetime DEFAULT NULL,
    `valor_primario` float(10,6) DEFAULT NULL,
    `valor_secundario` float(10,6) DEFAULT NULL,
    PRIMARY KEY (`id`),
    KEY `fk_sensor_equipamento_idx` (`equipamento_id`),
    KEY `data_hora` (`data_hora`),
    CONSTRAINT `fk_sensor_equipamento_idx` FOREIGN KEY (`equipamento_id`) REFERENCES `equipamento` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
  ) ENGINE=InnoDB AUTO_INCREMENT=3515782247 DEFAULT CHARSET=utf8;

The operation consists of numerous "sensors" writing equipment readings to this table every 15 seconds.

Most queries performed are similar to this statement:

SELECT * FROM sensores WHERE data_hora BETWEEN ? AND ?
    
asked by anonymous 29.01.2014 / 23:31

4 answers

5

A simple selection like this should be no big deal for MySQL, even on a table of 6 million records.

You should, however, make sure that the fields involved in the condition are indexed, in this case the data_hora field, to allow MySQL to do a binary search, which is much more efficient.

See if creating the following index improves performance:

CREATE INDEX `data_hora` ON `sensores` (`data_hora`);
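To see whether the index is actually being used, you can inspect the query plan; a minimal sketch with an illustrative date range (EXPLAIN output varies by MySQL version):

EXPLAIN SELECT * FROM sensores
WHERE data_hora BETWEEN '2014-01-01 00:00:00' AND '2014-01-31 23:59:59';
-- The "key" column of the output should show the data_hora index;
-- NULL there means MySQL is doing a full table scan.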
30.01.2014 / 00:19
4

Short explanation

  • Tune InnoDB to allow the table to stay in memory
  • Tune InnoDB to sync changes every 1 second instead of all the time
  • Reformulate your table. Remove unnecessary indexes, or add new ones

Settings that I recommend you not forget to tune are innodb_buffer_pool_size (to allow the database to stay in RAM and reduce I/O), innodb_flush_method (to avoid double buffering by the OS; requires testing) and innodb_flush_log_at_trx_commit. Others can be seen in the references at the end of this answer.
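As an illustration, a minimal my.cnf sketch of those three settings, assuming a dedicated server with around 8 GB of RAM (the values are hypothetical and must be tuned to your hardware and tested):

[mysqld]
# Keep as much of the table and its indexes in RAM as possible
# (rule of thumb: 50-80% of RAM on a dedicated database server)
innodb_buffer_pool_size = 6G
# Avoid double buffering by the OS (Linux; test on your system)
innodb_flush_method = O_DIRECT
# Flush the log to disk once per second instead of at every commit
# (risks losing up to ~1 second of transactions on a crash)
innodb_flush_log_at_trx_commit = 2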

Long explanation

Partitioning should not help much in your case. Since your problem is I/O, the tendency is to upgrade to an SSD instead of an HDD, or otherwise minimize disk access.

Since merely swapping to an SSD will make you fast, but still not very fast, it is better to have enough memory and configure your MySQL/MariaDB so the entire table can remain in RAM, and to limit the database to writing changes to disk at intervals no shorter than one second, because even when the database is completely in memory that synchronization still has to happen.

As for the MyISAM engine, it can perform worse than InnoDB when updates and writes are frequent. The MEMORY engine can be useful in some specific cases, but it should be used as a last resort, and a well-configured InnoDB can often be almost as efficient as MEMORY.

I know you may have limited hardware, but it will be difficult to optimize this without at least enough memory. In that situation, the best thing you can do is follow the recommendation in the next paragraph.

As for reshaping your table: if your usage pattern only changes recent records frequently, it helps to create two tables and periodically move rows from the recent table to the historical one. I do this with tables that hold much more data than yours and it works very well. But, of course, this is only useful if you do not UPDATE old data. This data split is more efficient than partitioning if well planned, and it makes caching easier. A sketch of the rotation is shown below.
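As an illustration, a minimal sketch of that rotation, assuming a hypothetical sensores_historico table with the same structure and an illustrative 30-day cutoff:

-- Fix the cutoff once so both statements agree on the same boundary
SET @cutoff = NOW() - INTERVAL 30 DAY;

-- Copy old rows into the historical table...
INSERT INTO sensores_historico
SELECT * FROM sensores WHERE data_hora < @cutoff;

-- ...then remove them from the "hot" table
-- (ideally run both inside one transaction or a maintenance window)
DELETE FROM sensores WHERE data_hora < @cutoff;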

References you should read

  • link
  • link
  • link
  • link
30.01.2014 / 00:08

1

  • One thing you can do is follow @Havernard's suggestion, provided the table does not undergo constant modifications.

  • The other is to have your query fetch only what you really need instead of SELECT * FROM, and to make sure the field you are going to filter on is not null.

  • You can also paginate the search; you probably will not need to view hundreds of thousands of records at one time. See the sketch after this list.

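As an illustration, a minimal sketch combining both ideas, selecting only the needed columns with an illustrative page size of 1000:

SELECT equipamento_id, data_hora, valor_primario, valor_secundario
FROM sensores
WHERE data_hora BETWEEN ? AND ?
ORDER BY data_hora
LIMIT 1000 OFFSET 0;  -- first page; increase OFFSET for later pages

Note that large OFFSET values get slower, since MySQL still scans the skipped rows; filtering on the last data_hora seen (keyset pagination) scales better for deep pages.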
30.01.2014 / 00:14

1

Not that this answer exactly answers your question. But considering that inserted records are never changed or deleted, and that you are doing time-based queries, the focus of partitioning should be on time.

A very simple way to partition by time is to create tables per time period: something like sensores_11_2013, sensores_12_2013, sensores_01_2014, etc.
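As an illustration, a minimal sketch of this manual scheme, assuming the naming pattern above (the date ranges are illustrative):

-- Create the next month's table with the same columns and indexes
-- (note: CREATE TABLE ... LIKE does not copy foreign keys)
CREATE TABLE sensores_01_2014 LIKE sensores;

-- A query spanning two months touches only the relevant tables
SELECT * FROM sensores_12_2013
WHERE data_hora BETWEEN '2013-12-15 00:00:00' AND '2013-12-31 23:59:59'
UNION ALL
SELECT * FROM sensores_01_2014
WHERE data_hora BETWEEN '2014-01-01 00:00:00' AND '2014-01-15 23:59:59';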

        
30.01.2014 / 00:21