How can I optimize my access to the Python dictionary?

4

I have a dictionary of the following form:

dicionario = {'1':'Banana','7':'Maçã','3':'Pera','2':'Melancia'}

If I check the '2' key in the dictionary: if '2' in dicionario: . My programmer's intuition tells me that the complexity of this command is O(n) in the worst case, because there are n elements within that dictionary. However this is a dictionary and not a list and I do not know exactly how the "magic gears" of the python dictionary work.

The complexity of performing a search on a particular dictionary item is looking for the key as I did or accessing the direct element: dicionario['2'] has complexity O(n) ? Or is there a Hash within these "gears" or some structure that can optimize this search, such as binary search?

In talking about binary search, is there any way to access the dictionary element like I did and enable a binary search in the dictionary "gears" considering, of course, that it is sorted?

    
asked by anonymous 24.01.2018 / 18:17

2 answers

6

As far as I know the two ways to access a key in the dictionary are basically equivalent (of course one returns a boolean of existence and the other returns the value, but both get time O (1), in most cases extremes that may be O (N), but that do not even come close to occurring).

Dictionaries are spreadsheets , they perform nearly array for random access, it only loses a bit because it needs to compute the hash code before accessing the element. There are cases of key collisions where the search has to be made in another way, and can even arrive at complexity O ( N).

A dictionary is almost the opposite of a structure that allows binary search. A dictionary can not be either sorted (with specific rule), much less sorted (with given criteria) .

In the example I have questions if a array would not be the best choice.

Both the array , and the dictionary will have complexity O (N) to search for values.

I have already demonstrated that the dictionary is different from array .

It can be useful: Objects are similar to arrays? .

    
24.01.2018 / 18:35
2

As it is in the comments and in the other answer, dictionaries in Python are a hash table with search time O (1) - and the information about it is here: link

That said, about the code design you ask - chave in dict vs dict[chave] : access time is the same - except that if the key does not exist, the second form raises an exception.

When you are going to recupear a value that is not sure if it exists in the dictionary the recommended is to use the .get method:

meu_valor = dicionario.get("chave") 

In this case, if "key" does not exist, .get returns None and life goes on. The .get also accepts a second parameter that is returned when the key does not exist, instead of None.

This saves lines of code - of type

if chave in dicionario:
    valor = dicionario[chave]
else: 
    valor = None

or

try:
   valor = dicionario[chave]
except KeyError:
   valor = None

Of these, this second case would be the only one that would have any difference of performance, due to the creation time of the try / except context. See also% method of% of% with% s: they not only retrieve a default value if there is no - but in this case, they assign the new value. For example, a loop to create a dictionary with all the words in a text and the position of each word within it in a list could be:

palavras = texto.split()
posicoes = {}
for indice, palavra in enumerate(palavras):
    posicoes.setdefault(palavra, []).append(indice)

In this case, if the word has not yet been found, an empty list is created that is both assigned there and returned by the method. If .setdefault already exists, the existing list is returned.

Now, since you're interested in this and want to know about binary search, here's the interesting fact: Python has a well-defined interface for mapping objects - and it's easy to implement an object that behaves exactly like a dictionary, but with the internal algorithms you decide on. If you want to create a dictionary that is internally a binary tree, it's pretty easy - the recommended way to do this is to inherit the dict class. There are several classes in posicoes[palavra] that are the basis for implementing very useful and varied data structures. A binary tree would have O (log (n)) search, but with the advantage that you can have the keys sorted and use slices in it, for example. Check out: link

I have several "super-dictionaries" implemented in module collections.abc.MutableMapping - some are more toy, others I use even in production:

link

    
24.01.2018 / 23:49