Why does only one Encoding work in the algorithm?

It is as follows: I have an encryption module that takes a byte[] and outputs another, encrypted byte[]; a checksum is appended at the end of the output. The checksum is a single byte derived by the application from the asymmetric key, so it can verify that decryption used the right key, for both input and output.

The problem is that if I go byte[] -> byte[] it works perfectly, encrypting and decrypting. But if I convert these byte[] to Strings, they only work with one particular Encoding, and give an "Invalid checksum" error if I use any other encoding.

string TextoParaEncriptar = "Olá, mundo!";
string encriptado = cipher.EncryptString(TextoParaEncriptar); // ok, encrypts normally
string decriptado = cipher.DecryptString(encriptado); // fine as well

With the code above, the decriptado variable ends up with the value "Olá, mundo!", but both methods used the Encoding.Default encoder, which varies according to the machine the code runs on. Now, if I specify the encoder, it fails:

string TextoParaEncriptar = "Olá, mundo!";
string encriptado = cipher.EncryptString(TextoParaEncriptar, Encoding.ASCII); // ok, encrypts normally
string decriptado = cipher.DecryptString(encriptado, Encoding.ASCII); // invalid checksum

This is the code of the methods that encrypt/decrypt strings:

    public string EncryptString(string inputString) => EncryptString(inputString, Encoding.Default);
    public string EncryptString(string inputString, Encoding byteEncoder)
    {
        byte[] strBytes = byteEncoder.GetBytes(inputString);
        EncryptByteArray(ref strBytes);
        return byteEncoder.GetString(strBytes);
    }
    public string DecryptString(string inputString) => DecryptString(inputString, Encoding.Default);
    public string DecryptString(string inputString, Encoding byteEncoder)
    {
        byte[] strBytes = byteEncoder.GetBytes(inputString);
        DecryptByteArray(ref strBytes);
        return byteEncoder.GetString(strBytes);
    }

Encrypt and decrypt codes:

    public void EncryptByteArray(ref byte[] inputData)
    {
        if (k == null || k.Length == 0) throw new NullReferenceException("Key cannot be empty.");
        if (inputData == null || inputData.Length == 0) return;
        CryrazCore processor = new CryrazCore() { Positions = k };
        processor.ComputeByteArray(ref inputData, false);
        // Append the single-byte checksum at the end of the encrypted data.
        Array.Resize(ref inputData, inputData.Length + 1);
        inputData[inputData.Length - 1] = processor.PushChecksun();
    }
    public void DecryptByteArray(ref byte[] inputData)
    {
        if (k == null || k.Length == 0) throw new NullReferenceException("Key cannot be empty.");
        if (inputData == null || inputData.Length == 0) return;
        CryrazCore processor = new CryrazCore() { Positions = k };
        byte dataChecksum = inputData[inputData.Length - 1];
        byte processorChecksum = processor.PushChecksun();
        if (dataChecksum != processorChecksum) throw new NullReferenceException("Invalid key for this data. Checksum check failed.");
        // Strip the checksum byte before decrypting the payload.
        Array.Resize(ref inputData, inputData.Length - 1);
        processor.ComputeByteArray(ref inputData, true);
    }
  
  • processor.ComputeByteArray(ref byte[], bool): the method that processes, byte by byte, the byte[] it receives.
  • EncryptByteArray appends the checksum byte at the end of the array; DecryptByteArray removes it before running the decryption.

Why does it fail, even using the same Encoding to encrypt and decrypt, whenever byteEncoder is not Encoding.Default? How do I fix this?

Update

If I use the Western European (ISO) encoding ISO-8859-1, which is an SBCS (Single Byte Character Set), that is, one byte per character, the algorithm works normally. But I still do not understand why.
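
For example, this round trip is lossless for every one of the 256 possible byte values (a minimal sketch; needs using System; using System.Linq; using System.Text;):

    // ISO-8859-1 maps every byte value 0x00-0xFF to a character,
    // so an arbitrary byte[] survives the text round trip unchanged.
    var latin1 = Encoding.GetEncoding("ISO-8859-1");
    byte[] allValues = Enumerable.Range(0, 256).Select(i => (byte)i).ToArray();
    string asText = latin1.GetString(allValues);
    byte[] roundTrip = latin1.GetBytes(asText);
    Console.WriteLine(allValues.SequenceEqual(roundTrip)); // True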

The algorithm traverses all the bytes returned by GetBytes(), appends a checksum at the end of that byte sequence, and then converts it to a String with GetString(byte[]) to produce the encrypted text. When decrypting that same encrypted string, it reports that the last byte was changed.
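
I can reproduce that corruption in isolation (the value 0xC3 below is just an arbitrary example of a checksum byte above 0x7F, not the real one):

    // A byte above 0x7F has no ASCII representation; GetString emits '?'.
    byte checksum = 0xC3;
    byte[] data = { 0x41, 0x42, checksum };          // payload + checksum
    string asText = Encoding.ASCII.GetString(data);  // "AB?"
    byte[] back = Encoding.ASCII.GetBytes(asText);
    Console.WriteLine(back[back.Length - 1]);        // 63 ('?'), not 195 (0xC3)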

asked by anonymous 27.07.2017 / 05:53

2 answers


The function of a character encoding is to encode a certain "text" into bytes and back into text. For an encoding to be complete, every possible text string must have a byte representation (an incomplete encoding supports only a subset of the possible characters, such as ASCII). However, not every possible sequence of bytes corresponds to valid text in a given encoding. If you take an arbitrary sequence of bytes and try to convert it to text, nothing coherent may come out of it.

So if you try to represent the encrypted byte sequence (which is indistinguishable from a sequence of random bytes) in some arbitrary encoding, there is a great chance those bytes will not represent any valid text. This is especially true of UTF-8, which has very strict rules about what the leading bits of each byte may be.
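
For example, the byte 0xFF can never start a valid UTF-8 sequence, so GetString falls back to the replacement character U+FFFD and the original byte is lost for good (a minimal sketch):

    // 0xFF and 0xFE are never valid in UTF-8; GetString substitutes U+FFFD.
    byte[] invalid = { 0xFF, 0xFE };
    string text = Encoding.UTF8.GetString(invalid);  // "\uFFFD\uFFFD"
    byte[] back = Encoding.UTF8.GetBytes(text);      // EF BF BD EF BF BD
    Console.WriteLine(back.Length);                  // 6, not 2: not lossless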

So, I suggest using another kind of representation for your encrypted string, such as base64 or perhaps hex. Example:

public string EncryptString(string inputString) => EncryptString(inputString, GetCryrazStringEncoder());
internal string EncryptString(string inputString, Encoding byteEncoder)
{
    byte[] strBytes = byteEncoder.GetBytes(inputString); // Text uses the encoding
    EncryptByteArray(ref strBytes);
    return Convert.ToBase64String(strBytes); // Random-looking bytes use base64
}

public string DecryptString(string inputString) => DecryptString(inputString, GetCryrazStringEncoder());
internal string DecryptString(string inputString, Encoding byteEncoder)
{
    byte[] strBytes = Convert.FromBase64String(inputString); // Random-looking bytes use base64
    DecryptByteArray(ref strBytes);
    return byteEncoder.GetString(strBytes); // Text uses the encoding
}
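
If you prefer hex instead of base64, the same idea works; a minimal sketch using BitConverter (the byte values are hypothetical):

    byte[] encrypted = { 0x3A, 0x9F, 0xC2 };                        // hypothetical encrypted bytes
    string hex = BitConverter.ToString(encrypted).Replace("-", ""); // "3A9FC2"
    byte[] restored = new byte[hex.Length / 2];                     // two hex digits per byte
    for (int i = 0; i < restored.Length; i++)
        restored[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);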
answered 28.07.2017 / 21:49

Putting it very simply: ASCII does not support accented characters.

If you expect accented text (or text with characters beyond ASCII), use Encoding.UTF8.

See how they behave in simple conversions.

var original = "Olá, Hello World";
Console.WriteLine("Original: " + original);

var ascBytes = Encoding.ASCII.GetBytes(original);
var backFromASCII = Encoding.ASCII.GetString(ascBytes);
Console.WriteLine("ASCII: " + backFromASCII);

var utfBytes = Encoding.UTF8.GetBytes(original);
var backFromUTF8 = Encoding.UTF8.GetString(utfBytes);
Console.WriteLine("UTF8: " + backFromUTF8);

var iso8859 = Encoding.GetEncoding("ISO-8859-1");
var isoBytes = iso8859.GetBytes(original);
var backFromISO = iso8859.GetString(isoBytes);
Console.WriteLine("ISO-8859: " + backFromISO);

The output will be:

// Original: Olá, Hello World
// ASCII: Ol?, Hello World
// UTF8: Olá, Hello World
// ISO-8859: Olá, Hello World

See it working at .NET Fiddle.

answered 27.07.2017 / 10:10