import pandas as pd
# create a list of lists; each inner list is one row
data_list = [
    ["John", 30, "Math"],
    ["Tom", 50, "Physics"],
    ["Pettrick", 40, "Chemistry"],
    ["Travis", 35, "English"]
]
# create the DataFrame from the list
df = pd.DataFrame(data_list, columns=['name', 'score', 'subject'])
print(df)
Output
+----+----------+---------+-----------+
|    | name     |   score | subject   |
|----+----------+---------+-----------|
|  0 | John     |      30 | Math      |
|  1 | Tom      |      50 | Physics   |
|  2 | Pettrick |      40 | Chemistry |
|  3 | Travis   |      35 | English   |
+----+----------+---------+-----------+
We can easily create a pandas DataFrame from a Python list. In the above code example:
1. data_list is a list of lists; each inner list holds the values for one row (name, score, subject).
2. The DataFrame is created from that list, with the column names supplied through the columns parameter:
df = pd.DataFrame(data_list, columns=['name', 'score', 'subject'])
3. To create the DataFrame without column names, omit the columns parameter:
df = pd.DataFrame(data_list)
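To illustrate step 3, here is a minimal sketch of what happens when the columns argument is left out: pandas falls back to integer column labels starting at 0. The variable name df_no_cols is chosen only for this illustration, and the data is a shortened copy of data_list.

```python
import pandas as pd

data_list = [
    ["John", 30, "Math"],
    ["Tom", 50, "Physics"],
]

# Without the columns argument, pandas labels the columns 0, 1, 2
df_no_cols = pd.DataFrame(data_list)
print(list(df_no_cols.columns))  # [0, 1, 2]

# Column names can still be assigned after the DataFrame exists
df_no_cols.columns = ["name", "score", "subject"]
print(list(df_no_cols.columns))  # ['name', 'score', 'subject']
```

Naming the columns up front is usually easier to read, but assigning them afterwards gives the same result.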
We use the pd.DataFrame() constructor to create a DataFrame from a Python list. This conversion is helpful in many scenarios: if you have a large amount of data in list form and want to process it with pandas, you first convert the list to a DataFrame, and then you can apply the functions and methods that the pandas library defines.
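As a rough illustration of the kind of processing this enables, the sketch below converts a small list and then runs a few common operations on it. The variable names average_score, top_scorers and ranked are made up for this example.

```python
import pandas as pd

data_list = [
    ["John", 30, "Math"],
    ["Tom", 50, "Physics"],
    ["Pettrick", 40, "Chemistry"],
]
df = pd.DataFrame(data_list, columns=["name", "score", "subject"])

# Aggregate a column
average_score = df["score"].mean()

# Keep only the rows that satisfy a condition
top_scorers = df[df["score"] > 35]

# Sort the rows by a column, highest score first
ranked = df.sort_values("score", ascending=False)

print(average_score)
print(top_scorers)
print(ranked)
```

None of these operations are available on a plain list of lists without writing the loops by hand.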
By default, pandas gives the rows of a new DataFrame an integer index starting from 0. You can assign a custom index to the rows instead, either when creating the DataFrame or afterwards.
To assign a custom index at creation time, pass a list of labels to the index parameter of pd.DataFrame(); each label is assigned to the corresponding row of the DataFrame.
import pandas as pd
# create a list of lists; each inner list is one row
students = [
    ["John", 30, "Math"],
    ["Tom", 50, "Physics"],
    ["Pettrick", 40, "Chemistry"],
    ["Travis", 35, "English"]
]
# create the DataFrame from the list with a custom index
df = pd.DataFrame(students, columns=['name', 'score', 'subject'], index=['a', 'b', 'c', 'd'])
print(df)
Output
+----+----------+---------+-----------+
|    | name     |   score | subject   |
|----+----------+---------+-----------|
| a  | John     |      30 | Math      |
| b  | Tom      |      50 | Physics   |
| c  | Pettrick |      40 | Chemistry |
| d  | Travis   |      35 | English   |
+----+----------+---------+-----------+
In the above code example:
pd.DataFrame(students, columns=['name', 'score', 'subject'], index=['a', 'b', 'c', 'd'])
The columns parameter of pd.DataFrame() sets the column names, and the index parameter sets the row labels, so the rows are labelled a to d instead of 0 to 3. You can then use the resulting DataFrame for further processing.
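As a minimal sketch of such further processing, the custom labels can be used to look up rows with .loc, and the index can be replaced or reset after the DataFrame has been created. The labels s1 to s4 and the names row_b, score_c and df_default are invented for this illustration.

```python
import pandas as pd

students = [
    ["John", 30, "Math"],
    ["Tom", 50, "Physics"],
    ["Pettrick", 40, "Chemistry"],
    ["Travis", 35, "English"],
]
df = pd.DataFrame(
    students,
    columns=["name", "score", "subject"],
    index=["a", "b", "c", "d"],
)

# Look up a whole row, or a single cell, by its custom label
row_b = df.loc["b"]             # the row for Tom, returned as a Series
score_c = df.loc["c", "score"]  # Pettrick's score

# Replace the index after creation; the list must have one label per row
df.index = ["s1", "s2", "s3", "s4"]

# Or return to the default 0, 1, 2, ... index, discarding the labels
df_default = df.reset_index(drop=True)
```

reset_index(drop=True) returns a new DataFrame and leaves df unchanged; without drop=True the old labels would be kept as an extra column.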
If you want to create a DataFrame from multiple lists without using Python's zip() function, you can build a dictionary from the lists and then create the DataFrame from that dictionary. Each dictionary key becomes a column name and each list supplies that column's values, as the code example below shows.
import pandas as pd
# create multiple lists, one per column
names = ['Manoj', 'Rom', 'Anuj', 'Sheetal']
age = [27, 32, 30, 31]
departments = ['IT', 'HR', 'Account', 'IT']
# create a dictionary from the above lists; keys become column names
employees = {
    'name': names,
    'age': age,
    'department': departments
}
# create the DataFrame from the dictionary
df = pd.DataFrame(employees)
print(df)
Output
+----+---------+-------+--------------+
|    | name    |   age | department   |
|----+---------+-------+--------------|
|  0 | Manoj   |    27 | IT           |
|  1 | Rom     |    32 | HR           |
|  2 | Anuj    |    30 | Account      |
|  3 | Sheetal |    31 | IT           |
+----+---------+-------+--------------+
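For comparison, the same kind of DataFrame can be built with zip(), which pairs up the items of the lists row by row:
import pandas as pd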
subjects = ["Math", "Physics", "Chemistry", "English", "Hindi"]
scores = [90, 80, 87, 80, 70]
df = pd.DataFrame(list(zip(subjects, scores)), columns=['subject', 'score'])
print(df)
# Output
#      subject  score
# 0       Math     90
# 1    Physics     80
# 2  Chemistry     87
# 3    English     80
# 4      Hindi     70
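One practical difference between the zip() and dictionary approaches shows up when the lists have different lengths. The sketch below uses deliberately mismatched lists to show it; the names df_zip and dict_failed are made up for this example.

```python
import pandas as pd

subjects = ["Math", "Physics", "Chemistry"]
scores = [90, 80]  # one value short, on purpose

# zip() stops at the shortest list, so "Chemistry" is silently dropped
df_zip = pd.DataFrame(list(zip(subjects, scores)), columns=["subject", "score"])
print(len(df_zip))  # 2

# The dictionary route rejects lists of different lengths with a ValueError
try:
    pd.DataFrame({"subject": subjects, "score": scores})
    dict_failed = False
except ValueError:
    dict_failed = True
print(dict_failed)  # True
```

If losing rows silently would be a problem, checking that the lists have equal lengths before calling zip() is a simple safeguard.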