How does the 'hash' behind Python's 'dictionary' relate to the 'hash' function used in cryptography?

7

I would like to understand how the 'hash' function used in cryptography (with which you protect passwords, for example) relates to the hash-based key-value structure in programming (known as a 'dictionary' in Python, for example).

cryptography hash

asked by anonymous 21.02.2017 / 02:30

3 answers

5

A hash is a mathematical algorithm that takes a string and transforms it into another one in a way that cannot be reversed. Hashes are commonly used in cryptography to store passwords, as you may have already guessed.

A dictionary is a dynamic data structure that lets you store values under a key (usually a string). To store these values it computes a hash: it takes the string key, applies a hash algorithm to it, and from the result finds the position where the value is kept.

A very simple example of a dictionary lookup:

function buscaNoDicionario(string chave) {
  /* in this example we assume chaveReal is a numeric value indicating a memory position */
  int chaveReal = algoritmoHash(chave);

  return arrayInterno[chaveReal];
}

It may seem strange to have a function that does these calculations, but this is usually managed by the language itself. In day-to-day code you would write something like:

meuDicionario["nome"] = "Jean"
meuDicionario["idade"] = 5000

And when the program runs, the language implementation (compiler or interpreter) translates this into something like:

adicionaNoDicionario(meuDicionario, "nome", "Jean")
adicionaNoDicionario(meuDicionario, "idade", 5000)

Note that the hash used in a dictionary has very different requirements from the one used in cryptography. Some points worth highlighting:

It generally outputs a numeric value that indicates a position in memory;
It should avoid collisions as much as possible, that is, two different entries should rarely generate the same output (collisions cannot be ruled out entirely, so the dictionary still has to handle them);
It should be fast, since the goal is not to keep information secure but to speed up access to the data.
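
To make the contrast concrete, here is a small Python sketch (not part of the original answer) that places the built-in hash(), which dictionaries rely on, next to hashlib.sha256. The two helper names are hypothetical and exist only for this illustration.

```python
import hashlib

def dictionary_style_hash(key: str) -> int:
    # Built-in hash(): returns an int and is what dict uses for its keys.
    return hash(key)

def cryptographic_hash(key: str) -> str:
    # SHA-256: built for collision and preimage resistance rather than raw speed;
    # the digest is always 256 bits, shown here as 64 hexadecimal characters.
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

print(dictionary_style_hash("nome"))   # an integer; the value may change between runs
print(cryptographic_hash("nome"))      # 64 hexadecimal characters
```

The first value is only meant to pick a slot in memory, while the second is meant to be safe to publish or store.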

24.02.2017 / 13:40

4

The dictionary is a hash table. Each hash is obtained through a function that calculates its value to determine which bucket the entry should go into. The hash is just a way to make it easier to find what you want quickly. The actual value that was used in the computation, the key, needs to be stored too if you need to know it.

Cryptography calculates the hash and stores only that, without the actual content; after all, we want to hide the actual data.

The hash size varies with each need, and for that reason, among others, the calculation formula is a bit different as well.
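
As an illustration of the bucket idea, here is a toy hash table in Python. It is a sketch under assumptions, not how CPython's dict is actually built (CPython uses open addressing rather than the lists of pairs shown here), and the class name ToyHashTable is hypothetical.

```python
class ToyHashTable:
    """Toy hash table with separate chaining, for illustration only."""

    def __init__(self, bucket_count=8):
        # Each bucket is a list of (key, value) pairs.
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket_for(self, key):
        # The hash decides which bucket the key belongs to.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket_for(key)
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))       # the key itself is stored too

    def get(self, key):
        # The stored key is compared because different keys can share a bucket.
        for stored_key, value in self._bucket_for(key):
            if stored_key == key:
                return value
        raise KeyError(key)


table = ToyHashTable()
table.put("nome", "Jean")
table.put("idade", 5000)
print(table.get("nome"))
```

Storing the key next to the value is what lets the table tell apart two keys that land in the same bucket.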
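
A minimal sketch of that idea in Python, assuming PBKDF2 from the standard library as the password hash (the answer does not name an algorithm, so this choice, the iteration count, and the function names are assumptions for illustration):

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=100_000):
    # A random salt makes equal passwords produce different stored hashes.
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest  # only these are stored, never the password itself

def verify_password(password, salt, expected_digest, iterations=100_000):
    # Recompute the hash from the attempt and compare in constant time.
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("s3cret")
print(verify_password("s3cret", salt, stored))
print(verify_password("wrong", salt, stored))
```

Checking a login never requires recovering the original password, only hashing the attempt again and comparing.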

21.02.2017 / 02:43

score 10 · Accepted Answer

The hash function, in general, is a function that receives data of arbitrary size and transforms it into an alphanumeric value.

As you've noticed, the hash function is used in different contexts within computing. Each context requires that the hash function obey (or not) certain types of properties.

These properties include determinism, a defined output interval, uniformity, invertibility, and collision handling.

Determinism

A hash function should always generate the same value for a given input. In that sense, the hash function closely matches the mathematical model of a function.

Python does not obey this property across executions. Since Python 3.3 the interpreter, by default, generates a random seed at startup that is used when hashing strings and bytes, so hash() returns the same value for a given string within one run but a different value in the next run. This should be kept in mind by anyone who wants to work with persistence (writing to disk), because the values saved in one execution for a given piece of data will differ from the values generated in a new execution.
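
The effect of the seed can be observed with a short Python sketch. It assumes a CPython interpreter that honors the PYTHONHASHSEED environment variable; the helper names are hypothetical, and stable_hash is just one possible seed-independent alternative for values that must be persisted.

```python
import hashlib
import os
import subprocess
import sys

def builtin_hash_in_new_process(text, seed):
    """Run hash(text) in a fresh interpreter started with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)

def stable_hash(text):
    """Seed-independent hash: the first 8 bytes of SHA-256 read as an integer."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


if __name__ == "__main__":
    # Different seeds are expected to give different values for the same string.
    print(builtin_hash_in_new_process("nome", 1))
    print(builtin_hash_in_new_process("nome", 2))
    # This value does not depend on the seed, so it can be written to disk.
    print(stable_hash("nome"))
```

Fixing the seed makes the built-in hash() repeatable between runs, but a hash from hashlib is the safer choice for anything stored outside the process.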

Intervals

Some applications require the hash function to generate values within a fixed numeric range. An example of such an application is the SHA-1 hash algorithm, which always generates a value of 160 bits.

Others require that the range be dynamic. The Python dictionary, which uses the value generated by the hash function (reduced to the size of an internal array) as the index into that array, expands as new key-value pairs are inserted.
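
Both kinds of range can be seen in a few lines of Python. This is a sketch: bucket_index is a hypothetical helper that reduces a hash with a modulo, which is a simplification of what a real dictionary does internally.

```python
import hashlib

def sha1_bit_length(data: bytes) -> int:
    # SHA-1 digests have the same size regardless of the input size.
    return len(hashlib.sha1(data).digest()) * 8

def bucket_index(key, table_size: int) -> int:
    # A table maps the hash into its current range of slots;
    # when the table grows, the same key may land in a different slot.
    return hash(key) % table_size


print(sha1_bit_length(b""), sha1_bit_length(b"x" * 10_000))
print(bucket_index("nome", 8), bucket_index("nome", 16))
```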

Uniformity

Hash functions with a defined interval should ensure that every position in the range is equally likely to be generated. The reason is that two different pieces of data can generate the same value (a collision), and the more evenly the values are spread, the less often that happens. Collisions are expensive to handle, although, depending on the application, they may not need to be handled at all.
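
A small experiment shows why uniformity matters. The sketch below compares a deliberately poor hash (the length of the key) with zlib.crc32, used here only as an example of a function that spreads values better; the function names and the key set are made up for the illustration.

```python
import zlib
from collections import Counter

def bucket_sizes(keys, hash_function, bucket_count=16):
    # Count how many keys fall into each bucket.
    counts = Counter(hash_function(key) % bucket_count for key in keys)
    return [counts.get(i, 0) for i in range(bucket_count)]

def length_hash(key):
    # Poor hash: every key with the same length gets the same value.
    return len(key)

def crc32_hash(key):
    # CRC-32 is not cryptographic, but it spreads similar keys across buckets.
    return zlib.crc32(key.encode("utf-8"))


keys = [f"key{i:04d}" for i in range(1000)]
print(bucket_sizes(keys, length_hash))  # all keys pile up in a single bucket
print(bucket_sizes(keys, crc32_hash))   # keys are spread over the buckets
```

With the poor hash every lookup has to search one huge bucket, which turns the table into a plain list.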

Invertibility

Cryptographic applications require that it be hard to recover the original data from the value generated by a hash function.

The implementation of a hash function varies greatly depending on the problem it should solve.

A simple example is the hashCode implementation that Java uses to generate a hash value for strings (useful in maps / dictionaries):

public static int hashCode(String s) {
   int hash = 0;
   // Polynomial hash: fold in each character after multiplying by 31
   for (int i = 0; i < s.length(); i++)
      hash = (hash * 31) + s.charAt(i);
   return hash;
}
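
For readers following along in Python, the same calculation can be sketched as below. The function name java_string_hash is hypothetical; the mask emulates Java's 32-bit int overflow, and the result matches Java only for characters in the Basic Multilingual Plane, since Java iterates over UTF-16 code units.

```python
def java_string_hash(s: str) -> int:
    h = 0
    for ch in s:
        # Keep only the low 32 bits, as Java's int arithmetic would.
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as a signed integer.
    return h - 0x100000000 if h >= 0x80000000 else h


print(java_string_hash("ab"))     # 97 * 31 + 98 = 3105
print(java_string_hash("hello"))  # 99162322
```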

The value 31 was chosen because it is cheap to compute using shifts (31 * h equals (h << 5) - h) and because it is a prime number (prime multipliers tend to produce fewer collisions in practice, although the reason is not clear-cut).

You can also take a look at the Rabin-Karp algorithm to see hashing used in a text search algorithm.
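
A minimal sketch of Rabin-Karp in Python follows. The base and modulus are arbitrary choices made for this example, and the function returns only the first match. The idea is to hash the pattern once and then update the hash of the current text window in constant time as the window slides.

```python
def rabin_karp(text, pattern, base=256, modulus=1_000_003):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Weight of the leftmost character in a window of length m.
    high = pow(base, m - 1, modulus)
    pattern_hash = window_hash = 0
    for i in range(m):
        pattern_hash = (pattern_hash * base + ord(pattern[i])) % modulus
        window_hash = (window_hash * base + ord(text[i])) % modulus
    for start in range(n - m + 1):
        # Equal hashes may be a collision, so the characters are compared too.
        if pattern_hash == window_hash and text[start:start + m] == pattern:
            return start
        if start < n - m:
            # Rolling update: drop text[start], append text[start + m].
            window_hash = ((window_hash - ord(text[start]) * high) * base
                           + ord(text[start + m])) % modulus
    return -1


print(rabin_karp("hash table lookup", "table"))  # 5
```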

In your question you speak of the hash function used in cryptography to "encrypt passwords". However, note that encryption is a different process from hashing. When hashing, the goal is to receive a piece of data and generate an alphanumeric value for it (and the same value can be generated by different data). With encryption, you modify the data to make it unreadable for anyone who does not know the key used at the time of encryption. That is, encryption always guarantees a round trip (for those who know the key).
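
The difference can be sketched in Python. Both functions below are toys invented for this illustration: toy_xor_cipher is NOT a secure cipher and tiny_hash is NOT a usable hash; they only show that encryption can be undone with the key, while a hash maps many inputs to the same value and offers no way back.

```python
from itertools import cycle

def toy_xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR each byte with the repeating (non-empty) key.
    # Applying the function twice with the same key restores the data.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def tiny_hash(data: bytes) -> int:
    # One-byte hash: different inputs easily produce the same value.
    return sum(data) % 256


secret = toy_xor_cipher(b"my password", b"key")
print(toy_xor_cipher(secret, b"key"))      # round trip gives back b'my password'
print(tiny_hash(b"ab"), tiny_hash(b"ba"))  # same value for two different inputs
```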