Convert encoding CP850 to UTF8

I have a Paradox database that, after a query for the sector a given worker belongs to, returns the following string: ManutenþÒo ElÚtrica. It should actually be Manutenção Elétrica. I need to return this string to a browser.

From my research, the data is encoded in CP850 and I need to convert it to UTF-8, which is the encoding commonly used on the web. I saw this at the link:

link

I'm trying to do the following in C#:

Encoding utf8 = Encoding.UTF8;
Encoding cp = Encoding.GetEncoding(850);

// At this point the value already arrives as "ManutenþÒo ElÚtrica".
byte[] cpBytes = cp.GetBytes(identifColaborador.setor);

byte[] utf8Bytes = Encoding.Convert(cp, utf8, cpBytes);
string msg = utf8.GetString(utf8Bytes);

Unfortunately, this does not work. The string msg still contains ManutenþÒo ElÚtrica.

What am I doing wrong?

    
asked by anonymous 28.03.2014 / 02:38

1 answer

Your code has no effect on the returned string, because it starts from an abstract representation (a sequence of characters) and ends at another abstract representation. Let me try to explain with a fictitious example:

// Letter (code point)         Encoding A            Encoding B
// a                           0xAA 0xBB             0xCC
// b                           0xDD                  0xEE 0xFF

string original = "aaba";

// a and b stand for the fictitious encodings A and B from the table above.
byte[] aBytes = a.GetBytes(original);
// aBytes = [0xAA 0xBB 0xAA 0xBB 0xDD 0xAA 0xBB]

byte[] bBytes = Encoding.Convert(a, b, aBytes);
// bBytes = [0xCC 0xCC 0xEE 0xFF 0xCC]

string msg = b.GetString(bBytes);
// msg = "aaba"

Any string that goes through this process comes out unchanged (unless one of the encodings cannot represent some of its characters). To fix your problem, the contents of identifColaborador.setor must be interpreted with the correct encoding before they become a string.
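To illustrate what "interpreted with the correct encoding" means, here is a minimal sketch. It assumes you can get at the raw bytes of the field and that they are really Windows-1252; both are assumptions, since the question does not say how the Paradox driver exposes the data. The class name DecodeAtSource, the Decode helper, and the sample byte array are hypothetical.

```csharp
using System;
using System.Text;

public static class DecodeAtSource
{
    // Decodes raw bytes using the given Windows/DOS code page number.
    public static string Decode(byte[] raw, int codePage)
    {
        // Assumption: running on .NET Core / .NET 5+, where code pages such as
        // 850 and 1252 must be registered first. On the classic .NET Framework
        // these code pages are built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(raw);
    }

    public static void Main()
    {
        // Hypothetical raw field contents: "Manutenção Elétrica" stored as Windows-1252.
        byte[] raw = {
            0x4D, 0x61, 0x6E, 0x75, 0x74, 0x65, 0x6E, 0xE7, 0xE3, 0x6F, 0x20,
            0x45, 0x6C, 0xE9, 0x74, 0x72, 0x69, 0x63, 0x61
        };

        // Decoding with the wrong code page produces the garbled text.
        Console.WriteLine(Decode(raw, 850));
        // Decoding with the code page the data was written in keeps the accents.
        Console.WriteLine(Decode(raw, 1252));
    }
}
```

The same bytes yield different strings depending only on the decoder, which is why the fix belongs at the point where bytes become a string (for example, in the connection or driver settings, if they offer such an option).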

If that is not possible and you have to work with the string already in its abstract representation, the right approach is to reinterpret the bytes behind the string without converting them. That is, simply take aBytes and turn them into a string according to encoding B. The code below worked on ideone, but it might not work on your system, so try different values for seuEncoding (UTF-8, UTF-16, Windows-1252, ISO-8859-1, Encoding.Default, etc.).

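// Candidate for the real encoding of the data (Windows-1252, selected by code page number).
Encoding seuEncoding = Encoding.GetEncoding(1252);
Encoding cp850 = Encoding.GetEncoding(850);

// Recover the original bytes, then decode them with the candidate encoding.
byte[] cpBytes = cp850.GetBytes("ManutenþÒo ElÚtrica");
string msg = seuEncoding.GetString(cpBytes);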
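
If you are not sure which encoding the data is really in, the trial-and-error suggested above can be automated. The following is only a sketch: the class name MojibakeRepair, the Reinterpret helper, and the list of candidates are hypothetical, and it assumes the garbled text was produced by decoding with CP850.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Re-encodes the garbled text with the encoding that was wrongly used to
    // decode it (recovering the original bytes), then decodes those bytes
    // with a candidate encoding.
    public static string Reinterpret(string garbled, Encoding wrong, Encoding candidate)
    {
        byte[] originalBytes = wrong.GetBytes(garbled);
        return candidate.GetString(originalBytes);
    }

    public static void Main()
    {
        // Assumption: .NET Core / .NET 5+, where code pages 850 and 1252 must
        // be registered. On the classic .NET Framework this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding cp850 = Encoding.GetEncoding(850);
        // "ManutenþÒo ElÚtrica", written with escapes so the source file's own
        // encoding cannot interfere.
        string garbled = "Manuten\u00FE\u00D2o El\u00DAtrica";

        // Candidate encodings to try; extend the list as needed.
        Encoding[] candidates = {
            Encoding.GetEncoding(1252),
            Encoding.GetEncoding("iso-8859-1"),
            Encoding.UTF8
        };

        foreach (Encoding candidate in candidates)
        {
            Console.WriteLine("{0}: {1}", candidate.WebName,
                Reinterpret(garbled, cp850, candidate));
        }
    }
}
```

Inspect the output and keep the candidate that prints the text with the expected accents. Note that this only works when the wrong decoding did not lose information, that is, when every original byte mapped to a distinct character in CP850.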
    
28.03.2014 / 04:39