The main difference is how
utf8_unicode_ci compares characters that some languages treat as equivalent.
For example, in German the character "ß" is treated as equivalent to "ss". Because
utf8_unicode_ci has to handle this kind of comparison, where one character can match a sequence of more than one, it is slower than
That is, if your application does not need cross-language comparisons, go with
But for systems that operate globally and must handle multiple languages, such as WordPress or Wikimedia, using
utf8_unicode_ci is a good choice.
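To make the "ß" versus "ss" point concrete, here is a rough analogy in Python. It is only a sketch of the idea and not how MySQL implements its collations: plain lowercasing stands in for a comparison that never expands a character, and Unicode case folding stands in for one that does. The function names are made up for this illustration.

```python
def simple_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison using plain lowercasing.

    str.lower() leaves "ß" as a single character, so "ß" and "ss" differ.
    """
    return a.lower() == b.lower()


def expanding_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison using full Unicode case folding.

    str.casefold() expands "ß" to "ss" before the strings are compared.
    """
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    print(simple_equal("Straße", "STRASSE"))     # False
    print(expanding_equal("Straße", "STRASSE"))  # True
```

The second function does more work per character, which is the same trade-off described above: the expansion-aware comparison is more accurate across languages but costs more.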
Another interesting collation to mention is
utf8_bin. It compares characters by their binary values, which results in a case-sensitive comparison, unlike the case-insensitive (_ci) collations above.
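As an illustration of what a binary comparison implies, the following Python sketch compares and sorts strings by their UTF-8 bytes. It only mimics the idea and is not MySQL code; the helper names are hypothetical.

```python
def binary_equal(a: str, b: str) -> bool:
    """Compare two strings by their encoded bytes, so case matters."""
    return a.encode("utf-8") == b.encode("utf-8")


def binary_sorted(words: list[str]) -> list[str]:
    """Sort strings by their encoded bytes rather than by language rules."""
    return sorted(words, key=lambda w: w.encode("utf-8"))


if __name__ == "__main__":
    print(binary_equal("abc", "ABC"))           # False
    print(binary_sorted(["b", "A", "a", "B"]))  # ['A', 'B', 'a', 'b']
```

Note how every uppercase letter sorts before every lowercase one, because the ordering follows byte values and not alphabetical conventions.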
The choice of collation depends a lot on the nature of your application. In addition to
utf8, there are other charsets that serve the needs of a specific region (
latin1, for example), and since every use case is different, I do not think it is possible to single out one as the most appropriate for all cases.
In most cases,
utf8_general_ci will do since, as the name suggests, it is meant for general, everyday use. However, it is worth knowing that there are other collations that can meet a more specific need, such as
Source: MySQL Documentation in