How to get the encoding type of a file?


Here is the code I have:

string text = File.ReadAllText($@"{pathname}", Encoding.UTF8);

I have several .txt files with different encodings. Special characters are not displayed correctly, because the encoding differs from file to file.

Before running the File.ReadAllText line, how do I find out the file's encoding?

Examples: ANSI, Unicode, UTF-8, etc.

Something like this:

if (pathname == Encoding.ASCII)
{
    string text = File.ReadAllText($@"{pathname}", Encoding.ASCII);
}
else if (pathname == Encoding.UTF8)
{
    string text = File.ReadAllText($@"{pathname}", Encoding.UTF8);
}
    
asked by anonymous 02.11.2017 / 21:33

1 answer


You have to read the file to find out, so it probably pays to read it a different way. If you open the file with a StreamReader and read at least part of it, you can then query the reader's CurrentEncoding property. It is reported not to be fully reliable, though.
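As a minimal sketch of the StreamReader approach: the helper name ReadWithDetection and its fallback parameter are made up for illustration and are not part of any library. The reader only switches away from the fallback when it finds a byte order mark, and CurrentEncoding is only meaningful after something has been read.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingProbe
{
    // Hypothetical helper: returns the decoded text and reports the encoding
    // that StreamReader ended up using.
    public static string ReadWithDetection(string path, Encoding fallback, out Encoding detected)
    {
        // The third argument asks the reader to look for a byte order mark.
        // Without a BOM, the reader simply keeps using the fallback encoding.
        using (var reader = new StreamReader(path, fallback, true))
        {
            string text = reader.ReadToEnd();  // detection happens on the first read
            detected = reader.CurrentEncoding; // query only after reading
            return text;
        }
    }
}
```

It could be called as `string text = EncodingProbe.ReadWithDetection(pathname, Encoding.UTF8, out Encoding detected);`. For a file without a BOM, `detected` is just the fallback you passed in, which is one reason this approach is considered unreliable.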

If that does not work well for you, you can try a library such as chardetsharp, UDE, NCharDet or Architect Shack. I have not used them and do not know how reliable they are.

There are answers on SO with code that tries to do the job: here, here and here.

It is also worth reading about the BOM (byte order mark) if you want to understand more.

There will always be cases where the detection gets it wrong.
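To make the BOM idea concrete, here is a rough sketch that inspects the first bytes of a file by hand. The class and method names (BomSniffer, DetectFromBom) are hypothetical. It only recognizes the UTF-8, UTF-16 and UTF-32 marks and returns null when no mark is present, so it says nothing about ANSI files or UTF-8 files saved without a BOM.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomSniffer
{
    // Hypothetical helper: returns the encoding implied by the BOM,
    // or null when the file does not start with a known BOM.
    public static Encoding DetectFromBom(string path)
    {
        var bom = new byte[4];
        int read = 0;
        using (var stream = File.OpenRead(path))
        {
            int n;
            // Read up to 4 bytes; shorter files simply yield fewer bytes.
            while (read < bom.Length && (n = stream.Read(bom, read, bom.Length - read)) > 0)
                read += n;
        }

        // Check the 4-byte marks first: the UTF-32 LE mark begins with the UTF-16 LE mark.
        if (read >= 4 && bom[0] == 0xFF && bom[1] == 0xFE && bom[2] == 0x00 && bom[3] == 0x00)
            return Encoding.UTF32;                    // UTF-32 little endian
        if (read >= 4 && bom[0] == 0x00 && bom[1] == 0x00 && bom[2] == 0xFE && bom[3] == 0xFF)
            return new UTF32Encoding(true, true);     // UTF-32 big endian
        if (read >= 3 && bom[0] == 0xEF && bom[1] == 0xBB && bom[2] == 0xBF)
            return Encoding.UTF8;
        if (read >= 2 && bom[0] == 0xFF && bom[1] == 0xFE)
            return Encoding.Unicode;                  // UTF-16 little endian
        if (read >= 2 && bom[0] == 0xFE && bom[1] == 0xFF)
            return Encoding.BigEndianUnicode;         // UTF-16 big endian
        return null;                                  // no BOM: encoding unknown
    }
}
```

One possible use is `Encoding enc = BomSniffer.DetectFromBom(pathname) ?? Encoding.UTF8;` followed by `File.ReadAllText(pathname, enc)`, where the UTF-8 fallback is an assumption you would replace with whatever encoding your BOM-less files most likely use.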

    
02.11.2017 / 21:51