select like in more than 3 million rows

6

I have a Maxmind table, with more than 3 million cities. I created the index for a simple select and it is performing satisfactorily.

The problem is that I am using an autocomplete and for this my select uses like , but always returns timeout (Maximum execution time of 30 seconds exceeded).

I chose to use MyISAM because it is a table with low or no changes.

From what I've been reading, 3 million is not an excessive amount, but I do not know how to optimize this search.

CREATE TABLE 'cidades' (
  'id' int(11) NOT NULL AUTO_INCREMENT,
  'pais' char(50) DEFAULT NULL,
  'regiao' char(100) DEFAULT NULL,
  'cidade' char(100) DEFAULT NULL,
  'titulo' char(100) DEFAULT NULL,
  'lat' char(50) DEFAULT NULL,
  'lon' char(50) DEFAULT NULL,
  PRIMARY KEY ('id'),
  KEY 'ID' ('id'),
  KEY 'CITY' ('cidade'),
  KEY 'ID_2' ('id','cidade','titulo')
) ENGINE=MyISAM AUTO_INCREMENT=3166036 DEFAULT CHARSET=utf8;
select cidade.titulo
from city
where cidade.titulo like '%Rio de Janeiro%'
    
asked by anonymous 20.04.2015 / 20:17

1 answer

5

Cause of bad performance

Using % at the start of the like parameter causes a full table scan . It prevents the RDBMS from using the table index, needing to always read the field completely. This then effectively becomes an oversized operation for 3 million lines.

Limited solution

Removing % from the start, the RDBMS will then be able to get the first few characters before the % is to directly parse the index to do an initial filter of the results. Only then will it test the like against the full value of the few selected rows.

MySQL also has an option to set how many characters should be included in the index of text fields.

Withdrawing % from the beginning solves the performance problem, but may not meet the requirement if self-completion is of the type that partially searches the entire text. In that case it would be better to change the approach, that is, not to use like .

Complete solution

An alternative would be Full-Text Search provided by MySQL. I do not have performance data as I have not used this functionality in MySQL yet.

Another more advanced alternative would be to use a tool like Apache Lucene , a search engine developed in Java used by many sites and projects that can be integrated into your back end with PHP.

    
20.04.2015 / 22:42