This is because ord() does not support UTF-8: it only returns the value of the first byte of the string. There are two ways to make the values match.
Here is a closer look at what happens:
$hex = unpack('H*', 'á')[1];
// = "c3a1", i.e. the bytes \xC3\xA1
echo hexdec($hex[0] . $hex[1]);
// = 195
So the first byte is \xC3, which is 195 in decimal. This is because ord() only reads the first byte, and in UTF-8 the character á is encoded as two bytes (\xC3\xA1), the first being \xC3.
The first solution is to convert the text from UTF-8 to ISO-8859-1, for example:
function charToDec($text)
{
    $text = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
    $text = mb_substr($text, 0, 1, '8bit');
    return unpack('C', $text)[1];
}
This way:
charToDec('á'); //= 225
charToDec('a'); //= 97
I believe this is sufficient, but I can't guarantee that every case will match Java, since characters outside ISO-8859-1 cannot be represented in a single byte of that encoding.
The other solution is to use UTF-8 on both sides. This requires changes in both Java and PHP: in PHP you could use unpack as in the example shown above, and in Java some equivalent method.
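As a rough sketch of what the Java side of the UTF-8 approach might look like (the class name CharBytes and the method utf8Bytes are made up for illustration, and this is only one possible equivalent), the string can be encoded with String.getBytes and each byte masked to an unsigned value. The numbers should then line up with what unpack('C*', $text) returns for a UTF-8 string in PHP:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper; the names are illustrative and not part of the original answer.
final class CharBytes {
    // Returns the UTF-8 bytes of the text as unsigned values (0..255).
    static int[] utf8Bytes(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF; // Java bytes are signed, so mask to get 0..255
        }
        return out;
    }

    public static void main(String[] args) {
        // U+00E1 (a with acute) is written as an escape so the result
        // does not depend on the encoding of the source file.
        System.out.println(Arrays.toString(utf8Bytes("\u00E1"))); // [195, 161]
        System.out.println(Arrays.toString(utf8Bytes("a")));      // [97]
    }
}
```

The first element for á is 195, the same value ord() gives in PHP, so comparing the full byte sequences on both sides avoids the mismatch.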