Create DataFrame from Python List in Pandas

If you have a Python list and want to create a Pandas DataFrame from it, you can use the techniques explained in this post.

import pandas as pd

# create a List of Lists
data_list = [
  ["John", 30, "Math"],
  ["Tom", 50, "Physics"],
  ["Pettrick", 40, "Chemistry"],
  ["Travis", 35, "English"]
]

# create the dataframe from List
df = pd.DataFrame(data_list, columns=['name', 'score', 'subject'])
print(df)

Output

+----+----------+---------+-----------+
|    | name     |   score | subject   |
|----+----------+---------+-----------|
|  0 | John     |      30 | Math      |
|  1 | Tom      |      50 | Physics   |
|  2 | Pettrick |      40 | Chemistry |
|  3 | Travis   |      35 | English   |
+----+----------+---------+-----------+

We can easily create a Pandas DataFrame from a Python list. In the above code example:

1. We define a list, data_list, whose items are themselves lists. Each inner list becomes one row of the DataFrame.

2. We create a pandas DataFrame from that list and name the columns name, score, and subject.

df = pd.DataFrame(data_list, columns=['name', 'score', 'subject'])

3. To create the DataFrame without column names, omit the columns argument as in the code below. pandas then labels the columns with integers starting from 0.

df = pd.DataFrame(data_list)

We use the pd.DataFrame() constructor to create a DataFrame from a Python list. This conversion is useful in many scenarios: if you have a large amount of data in list form and want to process it with pandas, convert the list to a DataFrame first, and then apply the functions and methods that the pandas library provides.
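
As a minimal sketch of the points above, the snippet below builds the DataFrame without the columns argument, inspects the default integer column labels, renames them, and then applies a pandas method. The variable names default_labels and average_score, and the choice of mean() as the method to apply, are illustrative assumptions rather than part of the original example.

```python
import pandas as pd

data_list = [
  ["John", 30, "Math"],
  ["Tom", 50, "Physics"],
  ["Pettrick", 40, "Chemistry"],
  ["Travis", 35, "English"]
]

# without the columns argument, pandas labels the columns 0, 1, 2
df = pd.DataFrame(data_list)
default_labels = list(df.columns)
print(default_labels)  # [0, 1, 2]

# assign readable column names after creation
df.columns = ['name', 'score', 'subject']

# once the data is in a DataFrame, pandas methods are available
average_score = df['score'].mean()
print(average_score)  # 38.75
```

Passing columns up front, as in the first example, is usually simpler; renaming afterwards is useful when the column names are only known later.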

Create DataFrame with custom indexes from a List

When we create a DataFrame, pandas automatically assigns an integer index starting from 0. You can instead give each row a custom index label, either while creating the DataFrame or afterwards.

To assign a custom index at creation time, pass a list of labels to the index parameter of pd.DataFrame(). The labels are assigned to the rows in order, so the list must contain one label per row.

import pandas as pd

# create a list
students = [
  ["John", 30, "Math"],
  ["Tom", 50, "Physics"],
  ["Pettrick", 40, "Chemistry"],
  ["Travis", 35, "English"]
]

# create the dataframe from list with custom indexes
df = pd.DataFrame(students, columns=['name', 'score', 'subject'], index=['a', 'b', 'c', 'd'])

print(df)

Output

+----+----------+---------+-----------+
|    | name     |   score | subject   |
|----+----------+---------+-----------|
| a  | John     |      30 | Math      |
| b  | Tom      |      50 | Physics   |
| c  | Pettrick |      40 | Chemistry |
| d  | Travis   |      35 | English   |
+----+----------+---------+-----------+

In the above code example:

  1. We import the pandas library with import pandas as pd.
  2. We create a list named students that contains multiple lists. We use this list to create our DataFrame.
  3. We create the DataFrame with custom indexes using the code below.
pd.DataFrame(students, columns=['name', 'score', 'subject'], index=['a', 'b', 'c', 'd'])

We use the columns parameter of pd.DataFrame() to set the column names and the index parameter to set the row labels. You can then use the resulting DataFrame for further processing.
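
The index can also be changed after the DataFrame has been created. The following sketch assumes the same students list; it first builds the DataFrame with the default index, then assigns the labels through the df.index attribute and looks up a row by label with df.loc. The variable name row_b is only illustrative.

```python
import pandas as pd

students = [
  ["John", 30, "Math"],
  ["Tom", 50, "Physics"],
  ["Pettrick", 40, "Chemistry"],
  ["Travis", 35, "English"]
]

# create the DataFrame with the default 0..3 index
df = pd.DataFrame(students, columns=['name', 'score', 'subject'])

# replace the default index after creation
df.index = ['a', 'b', 'c', 'd']

# look up a row by its custom label
row_b = df.loc['b']
print(row_b['name'])  # Tom
```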

Create a DataFrame from a dictionary created from multiple lists

If you want to create a DataFrame from multiple lists and do not want to use Python's zip() function, you can build a dictionary from the lists and then create the DataFrame from that dictionary. Each dictionary key becomes a column name and each list becomes the values of that column, as the code example below shows.

import pandas as pd

# create multiple lists
names = ['Manoj', 'Rom', 'Anuj', 'Sheetal']
age = [27, 32, 30, 31]
departments = ['IT', 'HR', 'Account', 'IT']

# create a dictionary from the above lists
employees = {
  'name': names,
  'age': age,
  'department': departments
}

# create the DataFrame
df = pd.DataFrame(employees)
print(df)

Output

+----+---------+-------+--------------+
|    | name    |   age | department   |
|----+---------+-------+--------------|
|  0 | Manoj   |    27 | IT           |
|  1 | Rom     |    32 | HR           |
|  2 | Anuj    |    30 | Account      |
|  3 | Sheetal |    31 | IT           |
+----+---------+-------+--------------+
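
One caveat with the dictionary approach is that the lists must all have the same length. The sketch below is an illustration under that assumption: the age list is deliberately one value short, and pandas raises a ValueError. The flag lengths_ok is a hypothetical name, and the exact error message may differ between pandas versions.

```python
import pandas as pd

names = ['Manoj', 'Rom', 'Anuj', 'Sheetal']
age = [27, 32, 30]  # one value short on purpose

try:
    pd.DataFrame({'name': names, 'age': age})
    lengths_ok = True
except ValueError as err:
    # pandas refuses to build columns of unequal length
    lengths_ok = False
    print(err)
```
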

Create a DataFrame from multiple lists using zip()

import pandas as pd

subjects = ["Math", "Physics", "Chemistry", "English", "Hindi"]

scores = [90, 80, 87, 80, 70]

df = pd.DataFrame(list(zip(subjects, scores)), columns=['subject', 'score'])

print(df)

# Output
#      subject  score
# 0       Math     90
# 1    Physics     80
# 2  Chemistry     87
# 3    English     80
# 4      Hindi     70

To create a DataFrame from multiple lists, we can use the above code snippet. We have two lists here, subjects and scores. zip() pairs their items position by position, and each resulting pair becomes one row of the DataFrame.
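
Note that zip() stops at the end of the shortest list, so extra items in a longer list are dropped without any error. The sketch below illustrates this with deliberately mismatched lists; the variable name rows is only illustrative.

```python
import pandas as pd

subjects = ["Math", "Physics", "Chemistry"]
scores = [90, 80]  # one score missing on purpose

# zip() pairs items until the shorter list runs out
rows = list(zip(subjects, scores))

df = pd.DataFrame(rows, columns=['subject', 'score'])
print(len(df))  # 2
```

If the lists are expected to match, comparing their lengths before calling zip() avoids losing rows unnoticed.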