How to find the original encoding of a filename (or any string)? [closed]

5

I have a series of files that appear to have been generated on different operating systems, since the character encoding of their names seems to vary between them.

Some names display their accented characters correctly for me on both OS X and Linux (terminal configured for UTF-8 in both cases), while others come out garbled. For example, in one name I see APRESENTAÇO_MAR_2015, where the word should clearly be APRESENTAÇÃO.

Looking more closely at the problematic part, ÇO, I found five values (in hexadecimal):

0xC2
0x80
0x43
0x327
0x4F
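To make a listing like this easier to read, here is a minimal Python sketch that prints the code point and Unicode name of each character in a string. It rests on a few assumptions: the list above seems to mix UTF-8 bytes with code points (0x327 cannot be a single byte), 0xC2 0x80 would be the UTF-8 form of the control character U+0080, and 0x43 0x327 is "C" followed by a combining cedilla, i.e. the decomposed (NFD) spelling of "Ç" that OS X typically uses in file names. The helper name describe is hypothetical.

```python
import unicodedata

def describe(text):
    """Return one (hex code point, Unicode name) pair per character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

# "C" followed by U+0327 is the decomposed (NFD) spelling of "Ç".
decomposed = "C\u0327O"
for code, name in describe(decomposed):
    print(code, name)

# NFC folds the base letter and the combining mark into the single
# precomposed character U+00C7.
composed = unicodedata.normalize("NFC", decomposed)
print([f"U+{ord(ch):04X}" for ch in composed])
```

If this reading is right, the Ç itself is intact and only the character that should have been à was damaged.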

I tried converting the string with iconv, varying the input and output encodings, but I did not get the desired result (ÇÃO). How can I find out the original encoding of these names and correct them? I have many files with this problem, and I would like to fix them programmatically.
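One way to automate the iconv trial and error is to try every pair of candidate encodings and keep the pairs whose round trip yields the expected text. The sketch below assumes the garbling came from a single wrong decode; the CANDIDATES list, the function name repair_candidates and its expected_fragment parameter are all hypothetical and should be adapted. If a character was actually dropped or replaced along the way, no round trip will recover it.

```python
import itertools
import unicodedata

# Hypothetical candidate list; extend it with whatever encodings the
# source systems may have used.
CANDIDATES = ["utf-8", "latin-1", "cp1252", "cp850", "mac_roman"]

def repair_candidates(garbled, expected_fragment):
    """Return (misread_as, really_was, repaired) triples whose repaired
    text contains expected_fragment."""
    hits = []
    for misread_as, really_was in itertools.permutations(CANDIDATES, 2):
        try:
            # Undo the wrong decode, then decode with the other candidate.
            repaired = garbled.encode(misread_as).decode(really_was)
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # this pair cannot explain the garbled text
        # Compare in NFC so decomposed and precomposed accents match.
        repaired = unicodedata.normalize("NFC", repaired)
        if expected_fragment in repaired:
            hits.append((misread_as, really_was, repaired))
    return hits

# Simulated damage: UTF-8 bytes that were wrongly read as Latin-1.
sample = "APRESENTA\u00c7\u00c3O".encode("utf-8").decode("latin-1")
for misread_as, really_was, repaired in repair_candidates(sample, "\u00c7\u00c3O"):
    print(misread_as, "->", really_was, ":", repaired)
```

The sample here is simulated damage, not the actual file name from the question; running the function on a real name is what would show whether any pair fits.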

    
asked by anonymous 21.01.2016 / 14:19

1 answer

-1

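Navigate to the folder and run the following (-I is the OS X spelling of the flag; on Linux, file uses lowercase -i to print the MIME type and charset):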

file -I meuarquivo.extensao

If the text comes from a database, you can use PHP's mb_check_encoding function to test whether it is valid in a given charset, for example UTF-8:

// $row['campo'] holds the field value fetched from the database
if (mb_check_encoding($row['campo'], 'UTF-8')) {
    echo "true";
} else {
    echo "false";
}
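Both suggestions above look at content (a file's data or a database field), whereas the question is about the names themselves. As a rough sketch, the same validity check can be applied to raw file names in Python. The function names classify_name and scan are hypothetical, and the three labels are just one possible classification.

```python
import os

def classify_name(raw):
    """Classify a file name given as bytes: 'ascii', 'utf-8' or 'not utf-8'."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8"

def scan(directory):
    """Map each raw file name in directory to its classification."""
    # Passing bytes to os.listdir returns the names as undecoded bytes.
    return {raw: classify_name(raw) for raw in os.listdir(os.fsencode(directory))}

if __name__ == "__main__":
    for raw, kind in scan(".").items():
        print(kind, raw)
```

A name that passes as valid UTF-8 can still be wrong: the sequence 0xC2 0x80 from the question is well-formed UTF-8, yet it decodes to a control character rather than a letter, so this check only narrows down which names need a closer look.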
    
21.01.2016 / 14:59