To avoid this, use the same character set everywhere, preferably UTF-8.
By "everywhere" I mean:
- The encoding of the .php, .js, .css, and .html files, especially the ones with the most text.
- The charset declared in the HTML META tags
- The database encoding
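As a rough sketch of what the list above looks like in practice: the helper names `utf8Dsn` and `utf8Page`, the MySQL/PDO setup, and the host and database names are all assumptions made for illustration, not part of the original setup.

```php
<?php
// Hypothetical helper: builds a MySQL DSN that requests a UTF-8 connection.
// Assumes MySQL through PDO; utf8mb4 is MySQL's full UTF-8 character set.
function utf8Dsn(string $host, string $dbname): string
{
    return "mysql:host={$host};dbname={$dbname};charset=utf8mb4";
}

// Hypothetical helper: wraps body markup in a page whose META tag declares UTF-8.
function utf8Page(string $title, string $body): string
{
    return "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n<title>"
        . htmlspecialchars($title, ENT_QUOTES, 'UTF-8')
        . "</title>\n</head>\n<body>\n{$body}\n</body>\n</html>\n";
}

// Typical usage (left commented out because it needs a web server and a database):
// header('Content-Type: text/html; charset=utf-8');
// $pdo = new PDO(utf8Dsn('localhost', 'mydb'), $user, $password);
// echo utf8Page('Title', '<p>Content</p>');
```

The source files themselves still have to be saved as UTF-8 in the editor; no code can fix that part.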
Sooner or later you may still have to work with more than one encoding because the data comes from different sources, such as other databases or files like Excel worksheets (which only work well with ISO-8859-1).
For these cases, use conversion functions like these:
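    // Converts a string that is either UTF-8 or ISO-8859-1 to UTF-8.
    function toUTF8($string)
    {
        if (function_exists('mb_detect_encoding')) {
            // Strict mode (third argument) rules out encodings in which the string is not valid.
            $current_encoding = mb_detect_encoding($string, 'UTF-8, ASCII, ISO-8859-1', true);
            if ($current_encoding !== false) {
                $string = mb_convert_encoding($string, 'UTF-8', $current_encoding);
            }
        } else {
            // Fallback without mbstring: encode only if the string is not already valid UTF-8,
            // otherwise it would be double-encoded. utf8_encode() is deprecated as of PHP 8.2.
            $string = preg_match('//u', $string) ? $string : utf8_encode($string);
        }
        return $string;
    }

    // Converts a string that is either UTF-8 or ISO-8859-1 to ISO-8859-1.
    function toLatin1($string)
    {
        if (function_exists('mb_detect_encoding')) {
            $current_encoding = mb_detect_encoding($string, 'UTF-8, ASCII, ISO-8859-1', true);
            if ($current_encoding !== false) {
                $string = mb_convert_encoding($string, 'ISO-8859-1', $current_encoding);
            }
        } else {
            // Fallback without mbstring: decode only if the string survives a round trip,
            // i.e. it is valid UTF-8 containing only characters that exist in ISO-8859-1.
            $string = utf8_encode(utf8_decode($string)) == $string ? utf8_decode($string) : $string;
        }
        return $string;
    }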
In some situations even these functions fail. This happens with strings that were concatenated from parts in different encodings (believe it or not, this is not that unusual). In such cases the conversion has to be done character by character.
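A minimal sketch of such a character-by-character conversion, assuming the string mixes only UTF-8 and ISO-8859-1. The function name `fixMixedEncoding` is hypothetical. It keeps every well-formed UTF-8 sequence as it is and re-encodes any remaining byte as if it were ISO-8859-1:

```php
<?php
// Hypothetical helper, not a library function. Assumes the input mixes
// only UTF-8 and ISO-8859-1 text.
function fixMixedEncoding(string $bytes): string
{
    // Matches one well-formed UTF-8 character or, as the last alternative, any single byte.
    $pattern = '/
          [\x00-\x7F]                         # ASCII
        | [\xC2-\xDF][\x80-\xBF]              # 2-byte sequence
        | \xE0[\xA0-\xBF][\x80-\xBF]          # 3-byte sequences
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
        | \xED[\x80-\x9F][\x80-\xBF]
        | \xF0[\x90-\xBF][\x80-\xBF]{2}       # 4-byte sequences
        | [\xF1-\xF3][\x80-\xBF]{3}
        | \xF4[\x80-\x8F][\x80-\xBF]{2}
        | (.)                                 # stray byte: treat as ISO-8859-1
    /xs';

    $result = preg_replace_callback($pattern, static function (array $m): string {
        if (!isset($m[1]) || $m[1] === '') {
            return $m[0]; // already valid UTF-8, keep unchanged
        }
        $byte = ord($m[1]);
        // ISO-8859-1 bytes 0x80-0xFF are code points U+0080-U+00FF,
        // which UTF-8 encodes as two bytes.
        return chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));
    }, $bytes);

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $bytes;
}
```

Note the limitation of this approach: two consecutive ISO-8859-1 bytes that happen to form a valid UTF-8 sequence (for example `\xC3\xA9`) cannot be told apart from real UTF-8 and are left untouched.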