Pandas DataFrame.loc[] does not find the record

Good morning!

I'm trying to work with a DataFrame built from a DRE (income statement) report. I want the account code to be the index, which I have already managed to do. However, DataFrame.loc[] does not find the record. Here is my code:

import pandas as pd
import csv
from pandas import DataFrame

dre = pd.read_csv('/home/andre/Documentos/ambev_dre3.csv', names=['Conta',   'Descrição', '2017', '2016', '2015'], dtype={'Conta':str})
dre = dre.set_index('Conta')
dre

However, dre.loc['3.02'] raises an error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_key(self, key, axis)
   1789                 if not ax.contains(key):
-> 1790                     error()
   1791             except TypeError as e:

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in error()
   1784                                .format(key=key,
-> 1785                                        axis=self.obj._get_axis_name(axis)))
   1786 

KeyError: 'the label [3.02] is not in the [index]'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-61-8aa3f3ce8015> in <module>()
----> 1 dre.loc['3.02']

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1476 
   1477             maybe_callable = com._apply_if_callable(key, self.obj)
-> 1478             return self._getitem_axis(maybe_callable, axis=axis)
   1479 
   1480     def _is_scalar_access(self, key):

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1909 
   1910         # fall thru to straight lookup
-> 1911         self._validate_key(key, axis)
   1912         return self._get_label(key, axis=axis)
   1913 

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_key(self, key, axis)
   1796                 raise
   1797             except:
-> 1798                 error()
   1799 
   1800     def _is_scalar_access(self, key):

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in error()
   1783                 raise KeyError(u"the label [{key}] is not in the [{axis}]"
   1784                                .format(key=key,
-> 1785                                        axis=self.obj._get_axis_name(axis)))
   1786 
   1787             try:

KeyError: 'the label [3.02] is not in the [index]'

My mistake is probably a very basic one, since I'm a beginner, but I've been struggling with this data for hours!

Thank you for your attention!

    
asked by anonymous 25.10.2018 / 16:30

1 answer

The problem is that .set_index() does not modify the DataFrame in place; it returns a new DataFrame with the index changed, so the result has to be assigned back.

Replace

dre.set_index('Conta')

with

dre = dre.set_index('Conta')
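
A minimal sketch of the difference, using a small hypothetical in-memory DataFrame (not the original CSV) so it can be run as-is:

```python
import pandas as pd

# Hypothetical sample data standing in for the real report.
dre = pd.DataFrame({
    "Conta": ["3.01", "3.02"],
    "2017": [478, -123],
})

# set_index returns a new DataFrame; the original is left untouched.
dre.set_index("Conta")
index_unchanged = list(dre.index) == [0, 1]  # still the default integer index

# Assigning the result back is what actually replaces the index.
dre = dre.set_index("Conta")
value_2017 = dre.loc["3.02", "2017"]

print(index_unchanged, value_2017)
```

After the reassignment, the account code is a valid label for .loc[].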

For example, I created this file teste.csv to test:

3.01,Receita de Venda,478,355,666
3.02,Custo dos bens,-123,34234,773
3.03,Resultado bruto,456,545,234

I ran this code:

import pandas as pd
dre = pd.read_csv('teste.csv', 
    names=['Conta', 'Descrição', '2017', '2016', '2015'],
    dtype={'Conta':str})
dre = dre.set_index('Conta')
print(dre.loc['3.02'])

The result, as expected:

Descrição    Custo dos bens
2017                   -123
2016                  34234
2015                    773
Name: 3.02, dtype: object

EDIT: The cause may be stray whitespace in the Conta field of your CSV file: a label such as '3.02 ' does not match '3.02'. Try stripping it by adding the line below just before the set_index call:

dre['Conta'] = dre['Conta'].str.strip()
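
To check whether whitespace really is the culprit, something like the sketch below may help. The CSV content here is made up for illustration (note the trailing space after 3.02) and is read from memory with io.StringIO instead of from your file:

```python
import io
import pandas as pd

# Hypothetical CSV content: the second account code has a trailing space.
csv_text = "3.01,Receita de Venda,478\n3.02 ,Custo dos bens,-123\n"

dre = pd.read_csv(io.StringIO(csv_text),
                  names=["Conta", "Descrição", "2017"],
                  dtype={"Conta": str})

# Printing the labels as a list shows them quoted, so stray spaces are visible.
labels_before = dre["Conta"].tolist()

# Strip surrounding whitespace before turning the column into the index.
dre["Conta"] = dre["Conta"].str.strip()
dre = dre.set_index("Conta")

found = dre.loc["3.02", "2017"]
print(labels_before, found)
```

If the printed list shows labels with extra spaces inside the quotes, the strip step is what makes dre.loc['3.02'] work.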
    
25.10.2018 / 17:35