Pandas: How to merge two data frames?


Good morning everyone! I am counting on your help again.

I have two CSV files, as shown below:

# f1.csv
num   ano
76971  1975
76969  1975
76968  1975
76966  1975
76964  1975
76963  1975
76960  1975

and

# f2.csv
num   ano   dou  url
76971  1975 p1   http://exemplo.com/page1
76968  1975 p2   http://exemplo.com/page10
76966  1975 p2   http://exemplo.com/page100

How can I merge the two to get the following result?

# expected result
num   ano   dou  url
76971  1975 p1   http://exemplo.com/page1
76969  1975
76968  1975 p2   http://exemplo.com/page10
76966  1975 p2   http://exemplo.com/page100
76964  1975
76963  1975
76960  1975
    
asked by anonymous 17.11.2018 / 11:41

1 answer


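The most direct way is a merge, a solution inspired by a set of SQL examples converted to pandas. In this case we want a left join, which keeps every row of f1.csv. An outer join would return the same set of rows for this data, because every key in f2.csv also appears in f1.csv: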

import pandas as pd

df1 = pd.read_csv('f1.csv')
df2 = pd.read_csv('f2.csv')

# 'ano' is included in the keys only so that it is not duplicated in the result;
# alternatively, drop it from df2 first with df2.drop(['ano'], axis=1, inplace=True)
# and merge on 'num' alone
df = pd.merge(df1, df2, on=['num', 'ano'], how="left")
print(df)

Output:

     num   ano  dou                         url
0  76971  1975   p1    http://exemplo.com/page1
1  76969  1975  NaN                         NaN
2  76968  1975   p2   http://exemplo.com/page10
3  76966  1975   p2  http://exemplo.com/page100
4  76964  1975  NaN                         NaN
5  76963  1975  NaN                         NaN
6  76960  1975  NaN                         NaN
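For anyone who wants to try this without creating the files, here is a minimal sketch that reads the question's sample data from in-memory strings. The whitespace separator (`sep=r"\s+"`) is an assumption about how the data is delimited; if the real files are comma-separated, the default `pd.read_csv` call is enough. The names `F1`, `F2`, `merged_alt` and `flagged` are only illustrative. It also shows the drop-then-merge alternative and the `indicator=True` option, which marks the rows that found a match.

```python
import io
import pandas as pd

# Sample data copied from the question. Whitespace-separated here, which is
# an assumption about how the real files are delimited.
F1 = """num ano
76971 1975
76969 1975
76968 1975
76966 1975
76964 1975
76963 1975
76960 1975
"""
F2 = """num ano dou url
76971 1975 p1 http://exemplo.com/page1
76968 1975 p2 http://exemplo.com/page10
76966 1975 p2 http://exemplo.com/page100
"""

df1 = pd.read_csv(io.StringIO(F1), sep=r"\s+")
df2 = pd.read_csv(io.StringIO(F2), sep=r"\s+")

# Left join on both shared columns: every row of df1 is kept.
merged = pd.merge(df1, df2, on=["num", "ano"], how="left")

# Alternative: drop 'ano' from the right frame and join on 'num' only.
merged_alt = pd.merge(df1, df2.drop(columns=["ano"]), on="num", how="left")

# indicator=True adds a '_merge' column with 'both' or 'left_only'.
flagged = pd.merge(df1, df2, on=["num", "ano"], how="left", indicator=True)
print(flagged)
```

The `_merge` column can be handy for checking which rows of f1.csv had no counterpart in f2.csv before deciding how to fill the gaps.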

If you want an empty string instead of NaN, you can then do:

df.fillna('', inplace=True)
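If only the text columns should be filled, one possible variation is to pass a dict to `fillna`, so that other columns are left untouched and a new frame is returned instead of modifying `df` in place. The sketch below uses a two-row stand-in for the merged frame, built by hand from values in the question; the name `filled` is just illustrative.

```python
import numpy as np
import pandas as pd

# Two-row stand-in for the merged frame (values taken from the question).
df = pd.DataFrame({
    "num": [76971, 76969],
    "ano": [1975, 1975],
    "dou": ["p1", np.nan],
    "url": ["http://exemplo.com/page1", np.nan],
})

# A dict limits the fill to the named columns; df itself is not modified.
filled = df.fillna({"dou": "", "url": ""})
print(filled)
```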

DOCS merge
DOCS fillna

    
17.11.2018 / 12:31