According to the MySQL documentation ( string comparation ), the comparisons are not case-sensitive, unless one of the
be case-sensitive , or the collation field is.
So, using =
or LIKE
can also be a case-sensitive search depending on collation , for example:
LIKE _latin1 'ABC%' COLLATE latin1_bin
Using
LIKE BINARY
forces a binary collation independent of collation , ensuring case-sensitive comparison, although it may degrade performance depending on the size of the table. >
On the use of REGEX
vs LIKE
, some considerations should be made:
REGEX
does not use an index, even if present, which will impact performance; an index, if any, as long as you do not start the comparison by LIKE
or %
;
Here's an excellent answer in the English SO about it: link
EDIT : By suggestion I'm copying an interesting part of the above link response:
SELECT * FROM t WHERE a = 'abc' <<-- (case insensitive ) pode usar o índice se existir
SELECT * FROM t WHERE a LIKE 'abc' <<-- (case insensitive como "=") pode usar o índice se existir
SELECT * FROM t WHERE a LIKE 'abc%' <<-- pode usar o índice se existir
SELECT * FROM t WHERE a LIKE 'a%' <<-- pode usar o índice se existir, dependendo da cardinalidade (1)
SELECT * FROM t WHERE a LIKE '%a%' <<-- não vai usar um índice
SELECT * FROM t WHERE a LIKE '_agf' <<-- não vai usar um índice
(1) According to the author of the response, depending on the cardinality MySQL may or may not use an index:
If more than +/- 20% of the rows match the criteria, MySQL will not use an index, because, in that case, doing a full search of the table is faster
>
Therefore, from the performance point of view, it is interesting to consider the presence of an index in the column, collation and the search that will be done (started or not by '_'
or %
) to choose the best way to compare.
Depending on the frequent comparison in the field, starting with _
and data volume, another option to consider is the use of a fulltext index, which is suitable for optimizing this type of research. More information here: MySQL Fulltext Index
EDIT : Clearing a question about %
:
In the example above, from the MySQL documentation, it is specifying the collation you want to use in the comparison, regardless of the original COLLATION
.
As an example, a table with a field that was created so collation
, that is the column name is with the German collation, but you want to compare using Spanish, you could do this: texto VARCHAR(50) CHARACTER SET latin1 COLLATE latin1_german1_ci
I've put an example here: sqlfiddle