Clang reports a "character too large" error but Visual Studio compiles normally


I'm having trouble understanding why Clang reports the error

character too large for enclosing character literal type

when I try to compile this code:

char c = 'ç';

Visual Studio 2015, on the other hand, compiles it without complaint. I know that different compilers can and do behave differently. I also know that ç is outside the ASCII table, so its numeric value must be greater than 127, and Clang is telling me that it cannot be stored in a char. But I would still like to know:

Why does Clang not allow me to use 'ç' as a char while Visual Studio does? Is something predefined in Visual Studio? Is there some option based on my system language?

Why does Visual Studio return the "correct" value from string functions such as strlen, even when they are passed strings with accents?

Example: strlen("opção"); returns 5 in Visual Studio. I expected it to return 7, as it does with Clang.

    
asked by anonymous 24.06.2016 / 23:08

1 answer


First: compiling or working is different from being right. Every programmer should be very clear about this. You did well to ask here to understand why it does or does not work.

I do not have official information (though I found some scattered sources that point this way), but I infer that the Microsoft C compiler used by Visual Studio is working with a one-byte character encoding, possibly Windows-1252. This encoding extends the ASCII table (which has only 128 characters) with some accented characters, including ç. It will not work with characters outside this small 256-character table.

Recent versions of the compiler give better control over how this is handled.

It seems clear to me that Clang uses UTF-8 by default (I have read this in some unofficial places), which is a multi-byte encoding. Any character outside the ASCII table must be represented by 2 or more bytes.

This explains why char does not work: the C specification clearly states that this type always has a size of 1 byte, and in UTF-8 ç needs 2.

The function strlen() returns the correct value in both cases: it is defined to return the number of bytes before the terminating null character, not the number of characters.

I'd still advise looking through the documentation to confirm which encoding each compiler actually uses by default.

Use wchar_t to get a wide character type that can hold these characters, or wstring in C++.

Read about the C multi-byte string functions.

    
24.06.2016 / 23:51