Counting NaN and null values in a pandas DataFrame


Imagine that we have a CSV file called data.csv:

col1  col2  col3  col4
1     2     3     4
5     6     7     8
9     10    11    12
13    14    15
33    44

import numpy as np
import pandas as pd

po = pd.read_csv('data.csv')

My goal is to better understand how to identify NaN/null data in a dataset.

Questions:

1. How can I count how many NaN values exist in the dataset above?

2. How can I count how many null values exist in the dataset above?

3. How can I count how many non-NaN values exist in the dataset above?

4. How can I count how many non-null values exist in the dataset above?

And the same questions as above but per column?

I tried, for example:

po[po['col4'].isna()].count()

thinking this would count how many NaN values exist in col4, but the answer was:

col1    2
col2    2
col3    1
col4    0
dtype: int64

What's wrong? How can I answer the questions above?

    
asked by anonymous 24.06.2018 / 21:52

2 answers


What's wrong?

The count() method does not count null data; it counts the non-null entries, per column or per row. That is why your attempt printed non-null counts of the filtered rows instead, as the sketch after the examples below shows. Its correct use is:

  • Non-null data count of all columns

    print(po.count())
    

    the output will be:  

    col1    5
    col2    5
    col3    4
    col4    3
    dtype: int64
  • Non-null data count of a specific column

    print(po.col4.count())
    

    the output will be:  

    3
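For completeness, here is why the attempt in the question printed those numbers (a minimal sketch, assuming the po DataFrame read from the CSV above):

    # select only the rows where col4 is NaN (the last two rows of the CSV)
    subset = po[po['col4'].isna()]

    # count() then reports the non-null values per column of that subset,
    # which is why col4 itself shows 0
    print(subset.count())

This reproduces exactly the output shown in the question.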


To count the missing data, you can combine isna() (or its alias isnull()) with sum():
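For example (a sketch, assuming the same po DataFrame; note that in pandas, NaN is the null marker, so questions 1/2 and 3/4 each have the same answer):

    # NaN/null count per column
    print(po.isna().sum())

    # NaN/null count for the whole dataset
    print(po.isna().sum().sum())

    # non-NaN/non-null count per column (questions 3 and 4)
    print(po.notna().sum())

With the data above, the per-column NaN counts are 0, 0, 1 and 2, for a dataset total of 3.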

28.06.2018 / 02:55

1 and 2 (how many NaN/null values exist in the dataset): combine isna() with sum().
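For instance (a minimal sketch, assuming the po DataFrame from the question): isna() marks each cell as missing or not, and sum() adds up the True values:

    print(po.isna().sum())        # NaN count per column
    print(po.isna().sum(axis=1))  # NaN count per row
    print(po.isna().sum().sum())  # NaN count for the whole dataset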

    
30.06.2018 / 20:36