How to Change a Column Value Based on Condition in Pandas
Pandas is a powerful data manipulation library in Python that is widely used for data analysis and cleaning. One common task in data analysis is changing the values in a column based on a condition. Two common ways to do this are the `apply()` method combined with a custom function that checks the condition and returns the new value, and the vectorized `numpy.where()` function. In this article, we will explore both approaches.
First, let’s start with a simple example. Suppose we have a DataFrame with a column named ‘Age’ and we want to change the value of ‘Age’ to ‘Adult’ if the value is greater than 18, and ‘Minor’ if the value is less than or equal to 18. Here’s how we can achieve this:
```python
import pandas as pd

# Create a sample DataFrame
data = {'Age': [15, 20, 17, 25, 30]}
df = pd.DataFrame(data)

# Define a function to change the value based on condition
def change_age_value(age):
    if age > 18:
        return 'Adult'
    else:
        return 'Minor'

# Apply the function to the 'Age' column
df['Age'] = df['Age'].apply(change_age_value)
print(df)
```
Output:
```
     Age
0  Minor
1  Adult
2  Minor
3  Adult
4  Adult
```
In the above example, we defined a function `change_age_value()` that takes an age value as input and returns ‘Adult’ if the age is greater than 18, and ‘Minor’ otherwise. We then used the `apply()` function to apply this function to each value in the ‘Age’ column of the DataFrame.
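Note that assigning the result back to `Age` replaces the numeric ages with text labels. If you need to keep the original numbers, one option is to write the labels to a separate column. The following is a minimal sketch of that variation; the column name `AgeGroup` is a hypothetical choice for illustration and is not part of the example above.

```python
import pandas as pd

df = pd.DataFrame({'Age': [15, 20, 17, 25, 30]})

def change_age_value(age):
    # Same rule as above: older than 18 is 'Adult', otherwise 'Minor'
    return 'Adult' if age > 18 else 'Minor'

# 'AgeGroup' is a hypothetical column name chosen for this sketch;
# the numeric 'Age' column is left untouched
df['AgeGroup'] = df['Age'].apply(change_age_value)
print(df)
```

With this layout, later conditions can still be evaluated against the numeric `Age` column.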
Another way to achieve the same result is by using the `numpy.where()` function. It evaluates the condition on the whole column at once instead of calling a Python function for each value, so it is often faster than `apply()`, especially on large DataFrames. Here’s how we can use `numpy.where()` to change the ‘Age’ column values:
```python
import numpy as np
import pandas as pd

# Recreate the sample DataFrame so that 'Age' holds numbers again
df = pd.DataFrame({'Age': [15, 20, 17, 25, 30]})

# Use numpy.where() to pick a label for each value in the 'Age' column
df['Age'] = np.where(df['Age'] > 18, 'Adult', 'Minor')
print(df)
```
Output:
```
     Age
0  Minor
1  Adult
2  Minor
3  Adult
4  Adult
```
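In this example, we used the `numpy.where()` function to check whether each value in the ‘Age’ column is greater than 18. For every row where the condition is true, the result contains ‘Adult’; for every other row, it contains ‘Minor’. No custom function is needed, because the condition and both replacement values are passed directly to `numpy.where()`.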
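To check that the two approaches really produce the same labels, they can be run side by side on the same data. This is a small sketch rather than a benchmark; the variable names `ages`, `via_apply`, and `via_where` are made up for this illustration.

```python
import numpy as np
import pandas as pd

ages = pd.Series([15, 20, 17, 25, 30], name='Age')

# Element-wise: calls the Python function once per value
via_apply = ages.apply(lambda age: 'Adult' if age > 18 else 'Minor')

# Vectorized: evaluates the condition on the whole column at once
via_where = np.where(ages > 18, 'Adult', 'Minor')

# apply() returns a Series and np.where() returns a NumPy array,
# so convert both to plain lists before comparing
same = via_apply.tolist() == via_where.tolist()
print(same)
```

The difference in return type matters mainly if you rely on the index: a Series from `apply()` keeps it, while the array from `numpy.where()` is aligned by position when it is assigned to a column.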
These are just a couple of examples of how to change a column value based on condition in Pandas. There are many other ways to achieve this, depending on the specific requirements of your data analysis task. By using functions like `apply()` and `numpy.where()`, you can easily manipulate and transform your data to meet your needs.
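As one example of those other ways, a boolean mask with `DataFrame.loc` can overwrite only the rows that match a condition. The sketch below assumes the same sample data as above and uses a hypothetical `AgeGroup` column; it is an illustration of the idea, not a recommendation over the two methods covered in this article.

```python
import pandas as pd

df = pd.DataFrame({'Age': [15, 20, 17, 25, 30]})

# Start with a default label for every row
# ('AgeGroup' is a hypothetical column name for this sketch)
df['AgeGroup'] = 'Minor'

# Overwrite the label only where the condition holds
df.loc[df['Age'] > 18, 'AgeGroup'] = 'Adult'
print(df)
```

This pattern is handy when only some rows should change and the rest should keep their existing values.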