Using set and for in the same expression

I'm studying data science through the Python language and I came across the following code:

world_trends_set = set([trend['name'] for trend in world_trends[0]['trends']])
us_trends_set = set([trend['name'] for trend in us_trends[0]['trends']])

I took this example from the book "Mining the Social Web, 2nd Edition". I can't understand why trend['name'] comes before the loop. I imagine the code is creating a set named name with the items in the first row of the trends column.

Can anyone explain the advantage of writing the code this way, and correct me if I'm wrong?

    
asked by anonymous 28.07.2017 / 19:24

1 answer

Considering the line:

world_trends_set = set([trend['name'] for trend in world_trends[0]['trends']])

The equivalent code would be:

temp = []

for trend in world_trends[0]['trends']:
    temp.append(trend['name'])

world_trends_set = set(temp)

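Both versions produce exactly the same result, so yes, the code is creating a set from a list. In a list comprehension, the expression that produces each element (here, trend['name']) is written first, followed by the for clause that supplies the values. The list is therefore made up of the value of the name key of each trend. In other words, world_trends is a list whose element at index 0 is a dictionary; its trends key holds a list of dictionaries, each of which has a name key.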
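To make the equivalence concrete, here is a minimal sketch that runs both forms on the same input. The sample data is hypothetical: it only imitates the shape the code in the question implies (a list whose first element is a dictionary with a trends key) and is not taken from the book or from a real API response.

```python
# Hypothetical sample data with the shape implied by the question:
# a list whose first element is a dict holding a list of trend dicts.
world_trends = [
    {
        'trends': [
            {'name': '#Python'},
            {'name': '#DataScience'},
            {'name': '#Python'},  # duplicate on purpose
        ]
    }
]

# Form 1: list comprehension passed to set()
set_from_comprehension = set([trend['name'] for trend in world_trends[0]['trends']])

# Form 2: explicit loop that builds the same list first
temp = []
for trend in world_trends[0]['trends']:
    temp.append(trend['name'])
set_from_loop = set(temp)

print(set_from_comprehension == set_from_loop)  # True
print(len(set_from_comprehension))              # 2, the duplicate is dropped
```

Note that the set keeps each name only once, which is the usual reason for converting the list to a set in the first place.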

Since you first build a list, that is, store all the values in memory, and only then convert it to a set, there is no performance gain or memory saving. The difference is mostly in how the code is written: the first form is shorter and, once you know the syntax, more readable than the second. That is, the first form is easier to understand.

Incidentally, you can even drop the square brackets around the for expression, as follows:

world_trends_set = set(trend['name'] for trend in world_trends[0]['trends'])

Inside the parentheses we now have a generator expression being converted to a set. I believe this would not perform noticeably better either, since converting the generator to a set still stores all the elements in memory, but it is one less operation for Python to process, because the intermediate list is never built.
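As a small sketch of the variants discussed here, the snippet below builds the set from a list comprehension and from a generator expression, and also with a set comprehension (curly braces), which Python supports as a third way of writing the same thing. The sample data and the variable names are hypothetical and exist only for illustration.

```python
# Hypothetical sample data; only the shape matters.
world_trends = [{'trends': [{'name': 'A'}, {'name': 'B'}, {'name': 'A'}]}]

# List comprehension: builds a temporary list, then converts it to a set.
from_list = set([trend['name'] for trend in world_trends[0]['trends']])

# Generator expression: values are fed to set() one at a time, no temporary list.
from_generator = set(trend['name'] for trend in world_trends[0]['trends'])

# Set comprehension: builds the set directly.
from_set_comprehension = {trend['name'] for trend in world_trends[0]['trends']}

print(from_list == from_generator == from_set_comprehension)  # True
```

All three produce the same set; the choice between them is mainly a matter of readability.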

28.07.2017 / 19:48