Why does a char array support a character like 'ç' while a single char variable does not?

8

Given that a char variable holds only a single byte (enough for the ASCII characters), why does the following code produce the expected output when the input contains non-ASCII characters, such as accented letters?

#include <iostream>
using namespace std;

int main(void)
{
    char test2[10];

    cin.get(test2, 10);

    cout << test2 << endl;

    return 0;
}

And yet, if a char array accepts input containing that kind of character, why doesn't a single char? How can I represent these characters in C++? Should I use the wchar_t type? I have already read a bit about that type in books, but since all of them were in English or translated from English, it seems the authors did not pay much attention to wchar_t.

asked by anonymous 19.04.2015 / 19:06

1 answer

9

This has little to do with C, C++, or any other programming language.

Internally, the computer only knows numbers. To represent letters, an encoding is used: a mapping between numbers and characters. In principle, anyone can define their own custom encoding.

There are several common encodings.

Many of these encodings use only the numbers from 0 to 255 (or -128 to 127), meaning they are 8-bit encodings.

In the ASCII encoding (which uses only 7 bits) there is no representation for, e.g., ã.

As the use of computers spread, it became necessary to extend the existing encodings to represent more than 128 characters.

One of the new encodings created is called ISO-8859-1. In this encoding, ã has the code 227. In the ISO-8859-8 encoding, however, the same code 227 represents the character ד (Dalet).

So far so good. All coded numbers fit into 8 bits.

Obviously, this creates the problem of always having to know which encoding was originally used in order to convert the numbers back into characters. This problem was common in the early days of the internet, when people from different countries exchanged emails, each using a different encoding.

To solve this problem of incompatible encodings, a single scheme was invented that encodes far more than 256 characters and is suitable for all countries: Unicode.

But Unicode codes are too large to fit into 8 bits. In addition, there is more than one way to store Unicode text as bytes (UTF-8, UTF-16 little-endian, UTF-16 big-endian, UTF-32, ...), so even with Unicode you still need to know which of these encodings is in use.

19.04.2015 / 19:38