I get different results when I convert from Character to Decimal in PHP and Java

3

When I convert á to decimal I get the result 225 with this code in Java:

public static int charToDec(String text){return (int) text.charAt(0);}

When I convert á to decimal I get the result 195 with this code in PHP:

function charToDec($text){return ord($text);}

How can I get both to return the same value?

Note that this only happens with "special" characters.

    
asked by anonymous 29.07.2017 / 23:16

1 answer

4

This happens because ord() does not support UTF-8: it returns the value of the first byte of the string. There are two ways to make the values match.

The following shows what is happening:

// Assumes this source file is saved as UTF-8
$hex = unpack('H*', 'á')[1];
// = "c3a1", i.e. the bytes \xC3\xA1

echo hexdec($hex[0] . $hex[1]);
// = 195

So the first byte is \xC3, which is 195 in decimal, and that is the result of ord(). This is because PHP works with the UTF-8 encoding of á, which is two bytes ( \xC3\xA1 ), the first being \xC3 .
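The same bytes can be seen from the Java side. This is only a minimal sketch: the class and method names (`Utf8BytesDemo`, `utf8Bytes`) are made up for illustration, and á is written as the escape `\u00E1` so the result does not depend on how the source file is saved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8BytesDemo {
    // Returns the UTF-8 bytes of the text as unsigned decimal values.
    static int[] utf8Bytes(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF; // Java bytes are signed, so mask to 0..255
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "\u00E1"; // á
        // The UTF-8 encoding is two bytes; PHP's ord() only sees the first one.
        System.out.println(Arrays.toString(utf8Bytes(text))); // [195, 161]
        // charAt() returns the UTF-16 code unit, which for á is the code point itself.
        System.out.println((int) text.charAt(0)); // 225
    }
}
```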

The first is to convert from UTF-8 to ISO-8859-1, for example:

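function charToDec($text)
{
    // Convert to a single-byte encoding, so á becomes the single byte \xE1
    $text = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
    // Keep only the first byte
    $text = mb_substr($text, 0, 1, '8bit');

    // Read that byte as an unsigned integer
    return unpack('C', $text)[1];
}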

This way:

charToDec('á'); //= 225

charToDec('a'); //= 97

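I believe this is sufficient, but I'm not confident that every case will match Java. ISO-8859-1 only covers 256 characters, so a character outside that range has no byte of its own in that encoding.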
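One way to see where the two could diverge is to compare, in Java itself, the value from `charAt(0)` with the byte Java produces when encoding to ISO-8859-1. This is a hedged sketch, not a statement about PHP: the names `Latin1Limits`, `javaCharToDec` and `latin1Byte` are hypothetical, and what `mb_convert_encoding` does with an unmappable character depends on PHP's mbstring substitution settings, so it may not match Java's replacement byte.

```java
import java.nio.charset.StandardCharsets;

class Latin1Limits {
    // What the Java code in the question computes: the first UTF-16 code unit.
    static int javaCharToDec(String text) {
        return text.charAt(0);
    }

    // First byte after encoding to ISO-8859-1, as an unsigned value.
    // Java replaces characters that ISO-8859-1 cannot represent with '?'.
    static int latin1Byte(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1)[0] & 0xFF;
    }

    public static void main(String[] args) {
        String aAcute = "\u00E1"; // á, inside ISO-8859-1
        String euro = "\u20AC";   // euro sign, outside ISO-8859-1

        System.out.println(javaCharToDec(aAcute) + " " + latin1Byte(aAcute)); // 225 225
        System.out.println(javaCharToDec(euro) + " " + latin1Byte(euro));     // 8364 63
    }
}
```

For characters up to U+00FF the two values agree, because ISO-8859-1 bytes coincide with the first 256 Unicode code points. Beyond that range they no longer do.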

The other way is to use UTF-8 on both sides. This requires changes in both Java and PHP: in PHP you could use unpack as in the example above, and in Java some equivalent method.
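A possible Java counterpart is sketched below, assuming the goal is to reproduce what ord() returns for a UTF-8 string, that is, the first byte. The class and method names (`Utf8Ord`, `charToDecUtf8`) are hypothetical, and rejecting an empty string is a choice made for this sketch.

```java
import java.nio.charset.StandardCharsets;

class Utf8Ord {
    // Hypothetical helper: returns the first UTF-8 byte of the text
    // as an unsigned value, which is what the PHP unpack example yields.
    static int charToDecUtf8(String text) {
        if (text.isEmpty()) {
            throw new IllegalArgumentException("text must not be empty");
        }
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return bytes[0] & 0xFF; // mask because Java bytes are signed
    }

    public static void main(String[] args) {
        System.out.println(charToDecUtf8("\u00E1")); // á -> 195
        System.out.println(charToDecUtf8("a"));      // 97
    }
}
```

With this approach both languages return 195 for á, but that number identifies only the first byte, so different characters that share a lead byte give the same result.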

    
30.07.2017 / 00:15