Create DataFrame from Python List in Pandas
If you have a Python List and want to create a Pandas DataFrame from it, then you can use the methods and techniques explained in this post.
import pandas as pd
# create a List of Lists
data_list = [
["John", 30, "Math"],
["Tom", 50, "Physics"],
["Pettrick", 40, "Chemistry"],
["Travis", 35, "English"]
]
# create the dataframe from List
df = pd.DataFrame(data_list, columns=['name', 'score', 'subject'])
print(df)
Output
+----+----------+---------+-----------+
| | name | score | subject |
|----+----------+---------+-----------|
| 0 | John | 30 | Math |
| 1 | Tom | 50 | Physics |
| 2 | Pettrick | 40 | Chemistry |
| 3 | Travis | 35 | English |
+----+----------+---------+-----------+
Creating a Pandas DataFrame from a Python list takes a single call. In the above code example:
1. We start with data_list, a list of lists in which each inner list holds the values for one row.
2. We create a DataFrame from that list and name its columns name, score, and subject.
df = pd.DataFrame(data_list, columns=['name', 'score', 'subject'])
3. To create the DataFrame without column names, omit the columns parameter. Pandas then labels the columns 0, 1, 2.
df = pd.DataFrame(data_list)
We use the pd.DataFrame() constructor to create a DataFrame from a Python list. This conversion is useful whenever you have a large amount of data in list form and want to process it with pandas: convert the list to a DataFrame first, and then you can apply the functions and methods that the Pandas library provides.
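To make the case without column names concrete, here is a minimal sketch. It assumes a shortened, two-row version of data_list, and the variable name default_labels is only for illustration. It shows the integer labels pandas falls back to and how names can still be assigned afterwards.

```python
import pandas as pd

# a shortened version of the list used above
data_list = [
    ["John", 30, "Math"],
    ["Tom", 50, "Physics"],
]

# no columns argument: pandas labels the columns 0, 1, 2
df = pd.DataFrame(data_list)
default_labels = list(df.columns)
print(default_labels)

# column names can still be assigned after creation
df.columns = ["name", "score", "subject"]
print(df)
```

Passing columns up front is usually the tidier choice, but assigning df.columns later can help when the names are only known after the data has been loaded.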
Create DataFrame with custom indexes from a List
By default, pandas gives the rows of a new DataFrame an integer index starting from 0. You can assign a custom index to each row instead, either when you create the DataFrame or afterwards.
To assign a custom index when creating a DataFrame from a Python list, pass a list of labels to the index parameter of pd.DataFrame(). The labels are assigned to the rows in order, so the list must contain exactly one label per row.
import pandas as pd
# create a list
students = [
["John", 30, "Math"],
["Tom", 50, "Physics"],
["Pettrick", 40, "Chemistry"],
["Travis", 35, "English"]
]
# create the dataframe from list with custom indexes
df = pd.DataFrame(students, columns=['name', 'score', 'subject'], index=['a', 'b', 'c', 'd'])
print(df)
Output
+----+----------+---------+-----------+
| | name | score | subject |
|----+----------+---------+-----------|
| a | John | 30 | Math |
| b | Tom | 50 | Physics |
| c | Pettrick | 40 | Chemistry |
| d | Travis | 35 | English |
+----+----------+---------+-----------+
In the above code example:
- We import the pandas library into the project file with import pandas as pd.
- We create a list named students that contains multiple lists. We will use this list to create our DataFrame.
- In the next step, we create the DataFrame with custom indexes using the below code.
pd.DataFrame(students, columns=['name', 'score', 'subject'], index=['a', 'b', 'c', 'd'])
The columns parameter of pd.DataFrame() sets the column names, and the index parameter sets the row labels. You can then use the resulting DataFrame for further processing.
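The index can also be changed after the DataFrame exists. The following is a small sketch of that, assuming a two-row version of the students list. The variable name tom_score is made up for this example.

```python
import pandas as pd

# a shortened version of the students list
students = [
    ["John", 30, "Math"],
    ["Tom", 50, "Physics"],
]

# created with the default integer index 0, 1
df = pd.DataFrame(students, columns=["name", "score", "subject"])

# replace the default index; one label is needed per row
df.index = ["a", "b"]
print(df)

# rows can now be looked up by their custom label
tom_score = df.loc["b", "score"]
print(tom_score)
```

With custom labels in place, df.loc selects rows by label, while df.iloc still selects them by position.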
Create a DataFrame from a dictionary created from multiple lists
If you want to create a DataFrame from multiple lists and do not want to use Python's zip() function, you can build a dictionary from the lists and then create the DataFrame from that dictionary. Each key becomes a column name and each list becomes the values of that column, as the below code example shows.
import pandas as pd
# create multiple lists
names = ['Manoj', 'Rom', 'Anuj', 'Sheetal']
age = [27, 32, 30, 31]
departments = ['IT', 'HR', 'Account', 'IT']
# create a dictionary from the above lists
employees = {
'name': names,
'age': age,
'department': departments
}
# create the DataFrame
df = pd.DataFrame(employees)
print(df)
Output
+----+---------+-------+--------------+
| | name | age | department |
|----+---------+-------+--------------|
| 0 | Manoj | 27 | IT |
| 1 | Rom | 32 | HR |
| 2 | Anuj | 30 | Account |
| 3 | Sheetal | 31 | IT |
+----+---------+-------+--------------+
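One thing to watch for with the dictionary approach is that every list must have the same length. The sketch below uses deliberately mismatched lists (the data and the error_raised flag are only for illustration) to show that pandas raises a ValueError instead of building the DataFrame.

```python
import pandas as pd

names = ["Manoj", "Rom", "Anuj"]
age = [27, 32]  # one value short on purpose

try:
    df = pd.DataFrame({"name": names, "age": age})
    error_raised = False
except ValueError as exc:
    # pandas refuses to build columns of different lengths
    error_raised = True
    print("Could not create the DataFrame:", exc)
```

Checking the list lengths before building the dictionary avoids this error.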
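Create a DataFrame from multiple lists using zip()
The zip() function pairs up the items of several lists position by position, so each resulting tuple becomes one row of the DataFrame.
import pandas as pd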
subjects = ["Math", "Physics", "Chemistry", "English", "Hindi"]
scores = [90, 80, 87, 80, 70]
df = pd.DataFrame(list(zip(subjects, scores)), columns=['subject', 'score'])
print(df)
# Output
# subject score
# 0 Math 90
# 1 Physics 80
# 2 Chemistry 87
# 3 English 80
# 4 Hindi 70
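The zip() example above works cleanly because both lists have five items. As a hedged sketch of what happens otherwise, the code below uses made-up lists of unequal length: zip() silently stops at the shortest list, while itertools.zip_longest() from the standard library (not used in the original example) keeps every row and leaves the gap as a missing value.

```python
import pandas as pd
from itertools import zip_longest

subjects = ["Math", "Physics", "Chemistry"]
scores = [90, 80]  # one score missing on purpose

# zip() stops at the shortest list, so "Chemistry" is dropped
df_short = pd.DataFrame(list(zip(subjects, scores)), columns=["subject", "score"])

# zip_longest() fills the gap with None, which pandas treats as missing
df_padded = pd.DataFrame(list(zip_longest(subjects, scores)), columns=["subject", "score"])

print(len(df_short), len(df_padded))
```

Which behaviour is preferable depends on the data. Dropping incomplete rows may hide a problem, so comparing the list lengths first is often worthwhile.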