In Python 3 with pandas, I have a DataFrame with several codes in the columns "CPF_CNPJ_doador" and "CPF_CNPJ_doador_originario":
import pandas as pd

cand_doacoes = pd.read_csv("doacoes_csv.csv", sep=';', encoding='latin_1', decimal=',')
cand_doacoes.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 427489 entries, 0 to 427488
Data columns (total 12 columns):
UF                                427489 non-null object
Partido                           427489 non-null object
Cargo                             427489 non-null object
Nome_candidato                    427489 non-null object
CPF_candidato                     427489 non-null int64
CPF_CNPJ_doador                   426681 non-null float64
Nome_doador                       427489 non-null object
Nome_doador_Receita               427489 non-null object
Valor                             427489 non-null float64
CPF_CNPJ_doador_originario        427489 non-null object
Nome_doador_originario            427489 non-null object
Nome_doador_originario_Receita    427489 non-null object
dtypes: float64(2), int64(1), object(9)
memory usage: 39.1+ MB
The codes in the columns "CPF_CNPJ_doador" and "CPF_CNPJ_doador_originario" are always integer values, but their lengths vary: 14, 13, 11 or 10 digits.
I need to create a new DataFrame that contains only the rows with 14- or 13-digit codes. How can I select only the 14- and 13-digit codes in the "CPF_CNPJ_doador" column of the DataFrame "cand_doacoes"? Do I need to convert the column to string first?
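To make the goal concrete, here is a minimal sketch of the kind of selection I am after. It is only one possible approach, not a confirmed solution. The small DataFrame and its values are made up to stand in for "cand_doacoes", and the name doadores_cnpj is hypothetical. It assumes the codes stay numeric (float64 with NaN for missing values, as in the info() output above).

```python
import pandas as pd

# Hypothetical sample standing in for cand_doacoes; the values are made up.
cand_doacoes = pd.DataFrame({
    "CPF_CNPJ_doador": [
        12345678901234.0,  # 14 digits
        1234567890123.0,   # 13 digits
        12345678901.0,     # 11 digits
        1234567890.0,      # 10 digits
        float("nan"),      # missing code
    ],
    "Valor": [100.0, 200.0, 300.0, 400.0, 500.0],
})

col = cand_doacoes["CPF_CNPJ_doador"]

# Numeric route: every 13- or 14-digit integer lies in [10**12, 10**14 - 1].
# Comparisons with NaN are False, so rows with a missing code are excluded.
mask_numeric = col.between(10**12, 10**14 - 1)

# String route: drop NaN, cast to int64 to remove the ".0", then count digits.
digits = col.dropna().astype("int64").astype(str).str.len()
mask_string = digits.isin([13, 14]).reindex(col.index, fill_value=False)

# Hypothetical name for the filtered result.
doadores_cnpj = cand_doacoes[mask_numeric]
print(doadores_cnpj)
```

In this sketch both masks select the same rows, so the string conversion would not be strictly necessary for a numeric column. For "CPF_CNPJ_doador_originario", which is an object column, something like pd.to_numeric(..., errors="coerce") would presumably be needed first, though I have not verified that on the real data.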