import pandas as pd
from tabulate import tabulate
# create a dataframe
df = pd.DataFrame({
'subject': ['Math', 'Physics', 'Chemistry', 'Data Structue', 'History'],
'score': [90, 80, 85, 40, 70],
'status': ['unknown', 'unknown', 'unknown', 'unknown', 'unknown']
})
print(df)
# change 'status' column value based on 'score' value
df.loc[(df['score'] > 50), 'status'] = 'Pass'
print(df)
Output
The first method will show the below DataFrame
╒════╤═══════════════╤═════════╤══════════╕
│ │ subject │ score │ status │
╞════╪═══════════════╪═════════╪══════════╡
│ 0 │ Math │ 90 │ unknown │
├────┼───────────────┼─────────┼──────────┤
│ 1 │ Physics │ 80 │ unknown │
├────┼───────────────┼─────────┼──────────┤
│ 2 │ Chemistry │ 85 │ unknown │
├────┼───────────────┼─────────┼──────────┤
│ 3 │ Data Structue │ 40 │ unknown │
├────┼───────────────┼─────────┼──────────┤
│ 4 │ History │ 70 │ unknown │
The second print method will show the below DataFrame
╒════╤═══════════════╤═════════╤══════════╕
│ │ subject │ score │ status │
╞════╪═══════════════╪═════════╪══════════╡
│ 0 │ Math │ 90 │ Pass │
├────┼───────────────┼─────────┼──────────┤
│ 1 │ Physics │ 80 │ Pass │
├────┼───────────────┼─────────┼──────────┤
│ 2 │ Chemistry │ 85 │ Pass │
├────┼───────────────┼─────────┼──────────┤
│ 3 │ Data Structue │ 40 │ unknown │
├────┼───────────────┼─────────┼──────────┤
│ 4 │ History │ 70 │ Pass │
╘════╧═══════════════╧═════════╧══════════╛
The code that we are using to change the values of the 'status' column is as below:
df.loc[(df['score'] > 50), 'status'] = 'Pass'
Based on the above code, we are checking if the value in the 'score' column is greater than 50. If the value is greater than 50 the value in the 'status' column will be replaced by the string 'Pass'. You can apply your conditions on the DataFrame based on the requirements.
The solutions that can be used to change the DataFrame column values based on some condition are as below:
There are times when we need to change the values of specific columns in our DataFrame, based on certain conditions. For instance, we might want to set a value in a column to 1 if the value in another column is greater than 6. We can do this using the DataFrame.loc[] method.
DataFrame.loc[] allows us to access and modify specific values in our DataFrame, based on either the index or column label. In this example, we'll use a column label. We can set a condition, such as a column B > 6, and then specify what we want to do with the values that meet that condition, such as setting the values in column C to 1.
This is a powerful method that can be used to clean and transform data in Pandas DataFrames.
Syntax
df.loc[condition, column_name] = value
Code example
import pandas as pd
# create a dataframe
df = pd.DataFrame({
'A': ['a', 'b', 'c', 'd', 'e'],
'B': [9, 10, 6, 2, 8],
'C': [0, 0, 0, 0, 0]
})
# change 'C' column value based on 'B' value
df.loc[(df['B'] > 6), 'C'] = 1
print(df)
Output
╒════╤═════╤═════╤═════╕
│ │ A │ B │ C │
╞════╪═════╪═════╪═════╡
│ 0 │ a │ 9 │ 1 │
├────┼─────┼─────┼─────┤
│ 1 │ b │ 10 │ 1 │
├────┼─────┼─────┼─────┤
│ 2 │ c │ 6 │ 0 │
├────┼─────┼─────┼─────┤
│ 3 │ d │ 2 │ 0 │
├────┼─────┼─────┼─────┤
│ 4 │ e │ 8 │ 1 │
╘════╧═════╧═════╧═════╛
In Python, we can use the DataFrame.where() function to change column values based on a condition. For example, if we have a DataFrame with two columns, "A" and "B", and we want to set all the values in column "A" to 0 if the value in column "B" is less than 0, we can use the DataFrame.where() function like this:
df['A'].where(~(df['B'] < 0), 0, inplace=True)
This will change all the values in column "A" to 0 if the value in column "B" is less than 0.
Syntax
df['column_name'].where(~(condition), other=value, inplace=True)
Code example
import pandas as pd
# create a dataframe
df = pd.DataFrame({
'A': ['a', 'b', 'c', 'd', 'e'],
'B': [9, 10, 6, 2, 8],
'C': [0, 0, 0, 0, 0]
})
# change 'C' column value based on 'B' value
df['C'].where(~(df['B'] > 6), other=1, inplace=True)
print(df)
Output
╒════╤═════╤═════╤═════╕
│ │ A │ B │ C │
╞════╪═════╪═════╪═════╡
│ 0 │ a │ 9 │ 1 │
├────┼─────┼─────┼─────┤
│ 1 │ b │ 10 │ 1 │
├────┼─────┼─────┼─────┤
│ 2 │ c │ 6 │ 0 │
├────┼─────┼─────┼─────┤
│ 3 │ d │ 2 │ 0 │
├────┼─────┼─────┼─────┤
│ 4 │ e │ 8 │ 1 │
╘════╧═════╧═════╧═════╛
The DataFrame.mask() function can be used to change the values in a DataFrame column based on a condition. This can be useful when you want to replace certain values in a column with a different value.
Syntax
df['column_name'].mask(condition, new_value, inplace=True)
Code example
import pandas as pd
# create a dataframe
df = pd.DataFrame({
'A': ['a', 'b', 'c', 'd', 'e'],
'B': [9, 10, 6, 2, 8],
'C': [0, 0, 0, 0, 0]
})
# below line will change the column values - condition based
df['C'].mask(df['B'] > 6, 1, inplace=True)
print(df)
Output
╒════╤═════╤═════╤═════╕
│ │ A │ B │ C │
╞════╪═════╪═════╪═════╡
│ 0 │ a │ 9 │ 1 │
├────┼─────┼─────┼─────┤
│ 1 │ b │ 10 │ 1 │
├────┼─────┼─────┼─────┤
│ 2 │ c │ 6 │ 0 │
├────┼─────┼─────┼─────┤
│ 3 │ d │ 2 │ 0 │
├────┼─────┼─────┼─────┤
│ 4 │ e │ 8 │ 1 │
The code above creates a DataFrame with three columns ('A', 'B', 'C'). The values in column 'C' are all initialized to 0. The code then uses the mask() method to update the values in column 'C'. The mask() method takes three arguments. The first argument is a condition - in this case, the condition is df['B'] > 6. The second argument is the value to use if the condition is True - in this case, the value is 1. The third argument is the value to use if the condition is False - in this case, the value is 0. So, the code above updates the values in column 'C' to 1 if the corresponding value in column 'B' is greater than 6, and updates the values in column 'C' to 0 if the corresponding value in column 'B' is less than or equal to 6.
0 Comments