What is Base64 encoding for?

12

Virtually every programming language worth its salt has its own implementation for encoding a string to Base64 and decoding it back.

But what is Base64 itself for?

Thank you!

    
asked by anonymous 05.03.2015 / 13:14

3 answers

16

Base64 is a method for encoding data for transfer over the Internet (it is a MIME content transfer encoding). It is often used to send binary data through channels that only handle text, for example when sending attachments by e-mail.

It uses 64 characters ([A-Za-z0-9], "/" and "+"), which is where its name comes from. The "=" character is used as a special padding suffix, and the original specification (RFC 989) defined that the "*" symbol can be used to delimit encoded but unencrypted data within a stream.

Encoding example:

Original text: hello world
Text converted to Base64: aGVsbG8gd29ybGQ=
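
The conversion can be reproduced with Python's standard `base64` module. This is only a small sketch; the variable names are illustrative. Note that a trailing newline (as added by, for instance, a shell `echo`) is part of the input and changes the last characters of the result.

```python
import base64

text = "hello world"

# b64encode works on bytes, so the text is first turned into bytes.
encoded = base64.b64encode(text.encode("ascii")).decode("ascii")
print(encoded)  # aGVsbG8gd29ybGQ=

# Decoding gives back the original bytes.
decoded = base64.b64decode(encoded).decode("ascii")
print(decoded)  # hello world

# The same text with a trailing newline encodes differently at the end.
with_newline = base64.b64encode(b"hello world\n").decode("ascii")
print(with_newline)  # aGVsbG8gd29ybGQK
```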

Base64 encoding is often used when binary data has to be transferred through, or stored in, a medium designed to work with text. It is widely used by applications together with XML, making it possible to store binary data in text form.

Source

    
05.03.2015 / 13:20
12

Sometimes you want to transfer binary data but cannot, because the medium you are using was designed for text streams.

As an example, suppose you have the following data:

nome = "Joao"
idade = 20

You can transfer this data using a text representation of it, such as JSON:

{"nome":"Joao", "idade":20}

With binary data you cannot simply take the raw values and put them into such a text representation. That is where Base64 comes in.

To get around this, people encode their binary data in Base64 so that it can be represented as text for any kind of transfer or use.
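
A minimal sketch of that idea, extending the `nome`/`idade` example with a binary field. The field name `foto` and the sample bytes are made up for illustration; they are not part of any real format.

```python
import base64
import json

# Hypothetical binary payload: arbitrary bytes that are not valid text.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])

# JSON cannot hold raw bytes, so the binary field is stored as Base64 text.
record = {
    "nome": "Joao",
    "idade": 20,
    "foto": base64.b64encode(raw).decode("ascii"),  # illustrative field name
}
payload = json.dumps(record)
print(payload)

# On the receiving side, the text is decoded back into the original bytes.
received = json.loads(payload)
restored = base64.b64decode(received["foto"])
print(restored == raw)
```

The JSON document stays plain text from end to end, and the receiver recovers exactly the bytes that were sent.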

There are several other encodings that can be used for this, but the most common is Base64.

    
05.03.2015 / 13:25
10

The US-ASCII character set has 95 "printable" characters, plus 33 control characters (0 through 31 and 127, or 00-1F and 7F in hexadecimal) that were originally used to control devices such as printers. Because it is the most "universal" encoding in existence (practically all others, including Unicode and the Windows "code pages", are supersets of it), ASCII text sent from a source will most likely be accepted by any destination (and any intermediary) without data corruption. When data (text or binary) cannot be expressed in ASCII as it is, it is sometimes desirable to encode it as ASCII text before sending it and decode it again when it reaches its destination.

The highest power of 2 that is less than 95 is 64. In principle one could try to encode data in base 95 itself, but that is complicated and often inefficient. The advantage of base 64 is that every 3 bytes of input (3 * 8 = 24 bits) produce exactly 4 characters of output (4 * 6 = 24 bits), so you can convert to and from binary with constant memory use and fairly simple operations.
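
The 3-bytes-to-4-characters mapping described above can be sketched by hand with shifts and masks. The helper `encode_group` and the `ALPHABET` constant are names chosen for this illustration, and the sketch handles only complete 3-byte groups (no padding).

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 bytes (24 bits) as 4 Base64 characters (4 x 6 bits)."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Join the three bytes into a single 24-bit integer.
    n = int.from_bytes(three_bytes, "big")
    # Cut it into four 6-bit values, most significant first,
    # and use each value as an index into the alphabet.
    return "".join(ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0))


print(encode_group(b"hel"))  # aGVs
```

Each group is processed independently, which is why the conversion needs only a fixed amount of memory regardless of the input size.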

Base64 encoding uses all uppercase and lowercase letters and all digits, totaling 62 characters, plus two more that vary between variants but are traditionally + and /. The = character is also widely used as padding to mark the end of the data when its size is not a multiple of 3, so it is common to see one (=) or two (==) of these symbols at the end of Base64 strings. A less common alternative, for when you want nothing but letters and digits or need case insensitivity, is Base32, which in its standard alphabet (RFC 4648) uses the 26 letters plus the digits 2 to 7.
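
The padding behaviour and the alternative alphabets can be observed with Python's standard `base64` module. This is a small sketch; the sample inputs are arbitrary.

```python
import base64

# Padding depends on the input length modulo 3.
padded = [base64.b64encode(data).decode("ascii") for data in (b"abc", b"abcd", b"abcde")]
print(padded)  # ['YWJj', 'YWJjZA==', 'YWJjZGU=']

# Two bytes chosen so that the output hits the last two alphabet positions.
raw = bytes([0xFB, 0xFF])
standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
print(standard)  # +/8=
print(url_safe)  # -_8=

# Base32 uses only uppercase letters and the digits 2 to 7.
b32 = base64.b32encode(b"abc").decode("ascii")
print(b32)  # MFRGG===
```

The URL-safe variant replaces + and / with - and _, which is one example of the two extra characters being chosen according to where the text will be used.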

    
05.03.2015 / 13:32