What is a checksum and what is it for?

14

I was reading a question here on the site and came across the term checksum.

I have seen this term several other times in discussions about transferring files or other important data.

So, what is a checksum and what is it for? I would like a simple example (preferably in C#, but any language will do) of how to compute a checksum.

    
asked by anonymous 01.03.2016 / 12:01

3 answers

16

In a nutshell, a checksum is used to check, for example, whether a file is exactly the same after a transfer, that is, to verify that it has not been altered by a third party and has not been corrupted.

The idea is to take all the bytes of a file, add them up one by one, and obtain a checksum value. After a transfer, this value should be the same for the file sent by the sender and for the one received by the recipient. Even so, a matching checksum does not guarantee that the file is exactly the same, because different contents can produce the same sum. That is why there are several ways of computing that sum.
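The byte-by-byte sum described above can be sketched in a few lines of Python. This is only an illustration: the function name `additive_checksum` and the choice to keep the sum within one byte (modulo 256) are assumptions, not something the answer specifies. The last comparison shows why a matching value is not a guarantee: swapping two bytes leaves a plain sum unchanged.

```python
def additive_checksum(data: bytes, modulus: int = 256) -> int:
    """Add every byte value; the modulus (an assumption) keeps the result in one byte."""
    return sum(data) % modulus


original = b"checksum"
corrupted = b"checksun"  # last byte changed from 'm' to 'n'
reordered = b"checksmu"  # same bytes, last two swapped

# A changed byte alters the sum, so the corruption is detected.
print(additive_checksum(original) == additive_checksum(corrupted))  # False

# Reordering keeps the sum identical, so a plain sum misses it.
print(additive_checksum(original) == additive_checksum(reordered))  # True
```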

In my company, for example, we use md5sum, which uses the MD5 algorithm to calculate the checksum, so that our customers can make sure they are using the correct version of our software: after downloading it, they just verify the file with md5sum.

Example:
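One way such a check could look, as a hedged sketch in Python rather than the md5sum tool itself. It relies on the standard `hashlib` module; the helper names `md5_of_file` and `matches_published` and the chunk size are assumptions made for illustration.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the MD5 digest of a file as a lowercase hexadecimal string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large downloads do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare the local digest with the value published by the sender."""
    return md5_of_file(path) == published_hex.strip().lower()
```

The string returned by `md5_of_file` should be the same hexadecimal value that `md5sum` prints in front of the file name, so a customer could compare it with the value published alongside the download.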


    
01.03.2016 / 12:06
12

As colleagues have already explained in the other answers, a checksum essentially serves to check the integrity of a data stream (whether it travels by radio broadcast, the Internet, smoke signals, etc.) or of a file (on disk, sent to someone by email, or available for download).

I decided to answer only because I would like to give some intuition behind the subject. The English word checksum literally means a "sum" used to "check", because the principle behind the algorithms is the following:

  • The goal is to produce a numerical value that is easy to calculate on both sides of a transmission (i.e., by both the sender and the receiver) and that represents not only the content of the file but also the order in which that content appears.
  • Once the sender has calculated this value, the file (or data packet) is transmitted together with the checksum value. The receiver recalculates the value for the received packet or file and compares it with the original value sent by the sender. If they differ, some transmission problem has occurred (for example, a byte has been changed, perhaps by noise in the transmission medium or even by the bad faith of a third party).
  • And how does a checksum value represent both the content and the order of a data set? There are several ways. A very naive one, which in practice serves only as a didactic illustration, is the following:

      

    It traverses the characters of the file or packet from start to finish, multiplying the value of each character (its ASCII value, for example) by its index (the position of the character in the traversed sequence). That result is then accumulated in a total value (the
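The didactic scheme described in this answer, together with the send-and-verify flow from the list above, can be sketched roughly as follows. The names `positional_checksum`, `send`, and `receive` are hypothetical, and starting the index at 1 (so that the first character also counts) is an assumption, since the answer does not say where the index starts.

```python
from typing import Tuple


def positional_checksum(data: bytes) -> int:
    """Accumulate each byte value multiplied by its position in the sequence."""
    total = 0
    # Assumption: positions start at 1 so the first byte contributes too.
    for index, value in enumerate(data, start=1):
        total += index * value
    return total


def send(data: bytes) -> Tuple[bytes, int]:
    """Sender side: transmit the data together with its checksum."""
    return data, positional_checksum(data)


def receive(data: bytes, received_checksum: int) -> bool:
    """Receiver side: recalculate and compare with the value that was sent."""
    return positional_checksum(data) == received_checksum


payload, checksum = send(b"ab")
print(receive(payload, checksum))  # True: the data arrived unchanged
print(receive(b"ba", checksum))    # False: same bytes, different order
```

Because each value is weighted by its position, swapping two different characters changes the total, which is what lets this value reflect order as well as content.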

    01.03.2016 / 14:47
    8

    A checksum is intended to help ensure the integrity of a packet in a communication, or to ensure that a file has not been corrupted.

    In the header, the result of a pre-agreed calculation over all the significant bits of the packet is also sent, so that the other side can compare it.

    For example, suppose that in a serial communication protocol we send a command byte, two bytes of payload (the command data), and a checksum byte obtained by adding all those values. Every command would have to be sent following this rule.

    To send the command 0x01 with a payload of 0x00, 0x00, the checksum is 0x01, and it is sent at the end. On receiving the bytes, the other side can be more confident that all the bits are correct, because their sum also gives 0x01. If any single bit changes, the checksum no longer matches.
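The frame from this example can be sketched in Python as below. The function names are hypothetical, and keeping only the low 8 bits of the sum (so that it fits in the single checksum byte) is an assumption about how such a protocol would handle larger sums.

```python
def frame_checksum(body: bytes) -> int:
    """Add all body bytes; keep the low 8 bits so the result fits in one byte (assumption)."""
    return sum(body) & 0xFF


def build_frame(command: int, payload: bytes) -> bytes:
    """Sender side: command byte, payload bytes, then the checksum byte at the end."""
    body = bytes([command]) + payload
    return body + bytes([frame_checksum(body)])


def is_valid(frame: bytes) -> bool:
    """Receiver side: recompute the sum of the body and compare with the last byte."""
    if len(frame) < 2:
        return False
    body, received = frame[:-1], frame[-1]
    return frame_checksum(body) == received


frame = build_frame(0x01, bytes([0x00, 0x00]))
print(frame.hex())        # 01000001

# Flip one bit in the first payload byte to simulate line noise.
damaged = bytes([frame[0], frame[1] ^ 0x04, frame[2], frame[3]])
print(is_valid(frame))    # True
print(is_valid(damaged))  # False
```

A single flipped bit always changes the sum, so it is caught; two changes that cancel each other out (one byte increased and another decreased by the same amount) would still pass, which is the usual limitation of a plain sum.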

        
    01.03.2016 / 12:46