
Count NaN values in a Pandas DataFrame

A DataFrame is a two-dimensional data structure in which data is arranged in rows and columns. Each row is identified by a label in the DataFrame's index, and each column can hold a different type of data. NaN, or "Not a Number", is a special floating-point value that pandas uses to represent missing data. NaN values typically appear when an entry is missing from the source data or when an operation cannot produce a valid number.
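To see how pandas represents a missing entry in practice, here is a minimal sketch. The column name col_1 and the values are only illustrative; the point is that a None placed in a numeric column is stored as NaN, which makes the column floating-point.

```python
import numpy as np
import pandas as pd

# A None in a numeric column is stored as NaN, so the column becomes float64
df = pd.DataFrame({"col_1": [10, None, 13]})

print(df["col_1"].dtype)             # dtype of the column
print(type(np.nan))                  # NaN itself is a Python float
print(np.nan == np.nan)              # NaN never compares equal, not even to itself
print(pd.isna(df.loc[1, "col_1"]))   # so detect it with isna() rather than ==
```

Because NaN does not compare equal to anything, counting it with an equality test would not work, which is why the rest of this post relies on isna() and isnull().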

In this post, we will explain how to get the count of NaN values in an entire DataFrame or in a column or row.

Count NaN values in entire DataFrame

If you want to find the number of NaN values in a pandas DataFrame, you can use the isna() and sum() methods together. isna() returns a DataFrame of booleans that is True wherever a value is NaN. The first sum() counts the True values in each column, and the second sum() adds those per-column counts into a single total.

Code example 1 - using isna() and sum() functions

import pandas as pd

data = {
  "col_1": [10, None, 13, None, None, 40],
  "col_2": [5, 10, None, 15, 20, None],
  "col_3": [20, 30, 50, None, None, None]
}

df = pd.DataFrame(data)

print(df)

# this code will count the NaN values in the entire DataFrame
result = df.isna().sum().sum()

print("Total NaN values: ", result)

Output

+----+---------+---------+---------+
|    |   col_1 |   col_2 |   col_3 |
|----+---------+---------+---------|
|  0 |      10 |       5 |      20 |
|  1 |     nan |      10 |      30 |
|  2 |      13 |     nan |      50 |
|  3 |     nan |      15 |     nan |
|  4 |     nan |      20 |     nan |
|  5 |      40 |     nan |     nan |
+----+---------+---------+---------+

Total NaN values:  8

Explanation of the above code example

We create a dictionary with three key-value pairs, where the keys are "col_1", "col_2", and "col_3" and the values are lists of numbers, with None marking the missing entries.
We use the dictionary to create a pandas DataFrame called df.
We print the DataFrame df.
We call the DataFrame's isna() method to create a new DataFrame of booleans, use sum() to count the True values in each column, and call sum() again to add those counts into one total.
We print the total number of NaN values.
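The steps above can be sketched with the intermediate results kept in separate variables. The names mask, per_column, and total are introduced here only for illustration.

```python
import pandas as pd

data = {
    "col_1": [10, None, 13, None, None, 40],
    "col_2": [5, 10, None, 15, 20, None],
    "col_3": [20, 30, 50, None, None, None],
}
df = pd.DataFrame(data)

mask = df.isna()           # DataFrame of booleans: True where a value is NaN
per_column = mask.sum()    # Series: number of True values in each column
total = per_column.sum()   # single number: NaN count for the whole DataFrame

print(per_column)
print("Total NaN values: ", total)
```

Splitting the chain this way makes it easier to see why two sum() calls are needed: the first reduces each column to a count, and the second reduces those counts to one number.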

Code example 2 - using isnull() and sum() functions

import pandas as pd

data = {
  "col_1": [10, None, 13, None, None, 40],
  "col_2": [5, 10, None, 15, 20, None],
  "col_3": [20, 30, 50, None, None, None]
}

df = pd.DataFrame(data)

result = df.isnull().sum().sum()

print("Total NaN values: ", result)

Output

Total NaN values:  8
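In pandas, isnull() is an alias of isna(), so the two examples give the same result. A minimal check on a smaller, made-up DataFrame could look like this:

```python
import pandas as pd

# Small illustrative DataFrame with three missing entries
df = pd.DataFrame({"col_1": [10, None, 13], "col_2": [None, None, 5]})

# Both methods should produce the same boolean mask
same_mask = df.isnull().equals(df.isna())
print("Masks are identical:", same_mask)

# ...and therefore the same NaN count
print(df.isnull().sum().sum() == df.isna().sum().sum())
```

Which name to use is a matter of preference; picking one and using it consistently keeps the code easier to read.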

Code example 3 - Using axis and sum() function

import pandas as pd

data = {
  "col_1": [10, None, 13, None, None, 40],
  "col_2": [5, 10, None, 15, 20, None],
  "col_3": [20, 30, 50, None, None, None]
}

df = pd.DataFrame(data)

# Count NaN values in each column (axis=0), then add the per-column counts together
res = df.isnull().sum(axis=0).sum()

print("Total NaN values, summed over columns: ", res)

Output

Total NaN values, summed over columns:  8

Total NaN values, summed over rows

# Count NaN values in each row (axis=1), then add the per-row counts together
res = df.isnull().sum(axis=1).sum()

print("Total NaN values, summed over rows: ", res)

Output

Total NaN values, summed over rows:  8
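Both totals are the same because every NaN is counted exactly once whichever axis is reduced first. If you want the individual counts rather than the grand total, one option is to leave out the final sum(), as in this sketch (the variable names per_column and per_row are illustrative):

```python
import pandas as pd

data = {
    "col_1": [10, None, 13, None, None, 40],
    "col_2": [5, 10, None, 15, 20, None],
    "col_3": [20, 30, 50, None, None, None],
}
df = pd.DataFrame(data)

# axis=0 collapses the rows, leaving one count per column
per_column = df.isna().sum(axis=0)

# axis=1 collapses the columns, leaving one count per row
per_row = df.isna().sum(axis=1)

print(per_column)
print(per_row)
```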

Count NaN values in a specific column of DataFrame

The isna() method can be used to flag the missing values in a specific column of a DataFrame, and the sum() method then counts them. Here, we will show you how to get the count of NaN values in a specific column.

Syntax

df['column_name'].isna().sum()

# or

df['column_name'].isnull().sum()

Code example

import pandas as pd

data = {
  "col_1": [10, None, 13, None, None, 40],
  "col_2": [5, 10, None, 15, 20, None],
  "col_3": [20, 30, 50, None, None, None]
}

df = pd.DataFrame(data)

nan_count = df['col_2'].isna().sum()

print("Total NaN values: ", nan_count)

Output

Total NaN values:  2
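As a cross-check, the same number can be derived from count(), which returns the number of non-NaN values in a column. The sketch below also shows how selecting a list of columns gives the counts for several columns at once; the variable names are illustrative.

```python
import pandas as pd

data = {
    "col_1": [10, None, 13, None, None, 40],
    "col_2": [5, 10, None, 15, 20, None],
    "col_3": [20, 30, 50, None, None, None],
}
df = pd.DataFrame(data)

# count() ignores NaN, so the difference from the row count is the NaN count
nan_count_col_2 = len(df) - df["col_2"].count()
print("Total NaN values in col_2: ", nan_count_col_2)

# Selecting a list of columns returns one NaN count per selected column
subset_counts = df[["col_1", "col_3"]].isna().sum()
print(subset_counts)
```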

Count NaN values in a specific row of DataFrame

In order to count the number of NaN values in a specific row of a DataFrame, we need to first locate the row with the desired index, and then count the number of NaN values in that row.

Syntax

df.loc[row_index, :].isnull().sum()

# or

df.loc[row_index, :].isna().sum()

Code example - using row index

import pandas as pd

data = {
  "col_1": [10, None, 13, None, None, 40],
  "col_2": [5, 10, None, 15, 20, None],
  "col_3": [20, 30, 50, None, None, None]
}

df = pd.DataFrame(data)

print(df)

res = df.loc[1, :].isnull().sum()

print("Total NaN values in 2nd row: ", res)

Output

╒════╤═════════╤═════════╤═════════╕
│    │   col_1 │   col_2 │   col_3 │
╞════╪═════════╪═════════╪═════════╡
│  0 │      10 │       5 │      20 │
├────┼─────────┼─────────┼─────────┤
│  1 │     nan │      10 │      30 │
├────┼─────────┼─────────┼─────────┤
│  2 │      13 │     nan │      50 │
├────┼─────────┼─────────┼─────────┤
│  3 │     nan │      15 │     nan │
├────┼─────────┼─────────┼─────────┤
│  4 │     nan │      20 │     nan │
├────┼─────────┼─────────┼─────────┤
│  5 │      40 │     nan │     nan │
╘════╧═════════╧═════════╧═════════╛

Total NaN values in 2nd row:  1

To get the count of NaN values in the 6th row, use the code below:

res = df.loc[5, :].isnull().sum()

print("Total NaN values in 6th row: ", res)

Output

Total NaN values in 6th row:  2
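Note that loc selects a row by its index label, which matches the row position here only because the DataFrame uses the default 0, 1, 2, ... index. If the index holds other labels, iloc selects by position instead. A minimal sketch, assuming hypothetical string labels "a" to "f":

```python
import pandas as pd

data = {
    "col_1": [10, None, 13, None, None, 40],
    "col_2": [5, 10, None, 15, 20, None],
    "col_3": [20, 30, 50, None, None, None],
}

# Hypothetical string labels replace the default integer index
df = pd.DataFrame(data, index=["a", "b", "c", "d", "e", "f"])

# loc looks the row up by label
by_label = df.loc["b"].isna().sum()
print("Total NaN values in row 'b': ", by_label)

# iloc looks the row up by position (5 is the 6th row)
by_position = df.iloc[5].isna().sum()
print("Total NaN values in 6th row: ", by_position)
```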
