Do you want to replace multiple values in a DataFrame? In this Pandas article, I will explain how to replace multiple values in Pandas using different methods, with some demonstrative examples.
To replace multiple values in Python using Pandas, a variety of methods can be employed. These include the df.replace() function for general replacements, map() for single-column modifications, conditional replacements using loc[], the versatile apply() with custom functions and the efficient use of np.where() for conditional logic.
Pandas replace multiple values in Python
There are five different methods to replace multiple values in Python Pandas:
- df.replace() function
- map() for single column
- Conditional Replacement with loc[] function
- Using apply() with a custom function
- Using np.where() function
Let’s see them one by one using some illustrative examples:
1. Replace values in multiple columns Pandas using df.replace() function
The df.replace() function in Pandas is straightforward and flexible. It can replace a single value or several values at once when we pass it a dictionary (old value to new value) or a list of values.
Here is the code to replace multiple values in Pandas using the df.replace() function:
import pandas as pd
df = pd.DataFrame({'State': ['NY', 'CA', 'FL', 'NY', 'TX']})
print("Before Replacement:\n", df)
df.replace({'NY': 'New York', 'CA': 'California', 'TX': 'Texas'}, inplace=True)
print("\nAfter Replacement:\n", df)
Output:
Before Replacement:
State
0 NY
1 CA
2 FL
3 NY
4 TX
After Replacement:
State
0 New York
1 California
2 FL
3 New York
4 Texas
Upon executing the code in Pycharm, the resulting output is illustrated in the screenshot below.
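When the same value appears in several columns but should only change in one of them, df.replace() also accepts a nested dictionary whose outer keys are column names. The following is a minimal sketch with made-up data (the 'Code' column is hypothetical and not part of the example above):

```python
import pandas as pd

# Hypothetical data: the same abbreviations appear in two columns.
df = pd.DataFrame({
    "State": ["NY", "CA", "FL"],
    "Code": ["CA", "NY", "TX"],
})

# Outer keys are column names; inner dicts map old values to new ones.
# Only 'State' is touched, so 'Code' keeps its original values.
replaced = df.replace({"State": {"NY": "New York", "CA": "California"}})

print(replaced)
```

Because inplace=True is not used here, the original df is left unchanged and the result is returned as a new DataFrame.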
2. Pandas replace multiple values using map() for a single column
The map() method of a Pandas Series is tailored for transforming values in a single column. It maps the existing values to new ones based on a dictionary, making it ideal for simple, direct replacements in an individual column.
Here is how to use the map() function to replace multiple values in Pandas:
import pandas as pd
df = pd.DataFrame({'City': ['Los Angeles', 'New York', 'Houston', 'Miami']})
print("Before Replacement:\n", df)
df['State'] = df['City'].map({'Los Angeles': 'CA', 'New York': 'NY', 'Houston': 'TX', 'Miami': 'FL'})
print("\nAfter Replacement:\n", df)
Output:
Before Replacement:
City
0 Los Angeles
1 New York
2 Houston
3 Miami
After Replacement:
City State
0 Los Angeles CA
1 New York NY
2 Houston TX
3 Miami FL
The screenshot below showcases the output obtained after the code was executed in the Pycharm editor.
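One thing to keep in mind with map() is that any value missing from the dictionary comes back as NaN rather than being left alone. The sketch below uses a deliberately incomplete, hypothetical mapping to show this, along with one possible way to fall back to the original value:

```python
import pandas as pd

cities = pd.Series(["Los Angeles", "New York", "Chicago"])

# 'Chicago' is deliberately left out of the mapping.
city_to_state = {"Los Angeles": "CA", "New York": "NY"}

# Values without a matching key become NaN.
mapped = cities.map(city_to_state)

# One option: fill the gaps with the original values.
with_fallback = mapped.fillna(cities)

print(mapped)
print(with_fallback)
```

If unmatched values should always stay as they are, df.replace() with the same dictionary is often the simpler choice.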
3. Pandas replace multiple values using the loc[] indexer
The loc[] indexer is used for condition-based replacements in Pandas, allowing us to specify the conditions under which a replacement should occur. It is powerful for more complex, conditional data manipulation within DataFrame columns.
Here is how to use the loc[] indexer to replace multiple values in Pandas:
import pandas as pd
df = pd.DataFrame({'City': ['New York', 'San Francisco', 'Austin'], 'Population': [8419000, 881549, 978908]})
print("Before Replacement:\n", df)
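# Cast to object first so the column can hold both numbers and text;
# recent Pandas versions warn or fail when a string is written into an integer column.
df['Population'] = df['Population'].astype(object)
df.loc[df['Population'] > 1000000, 'Population'] = 'Large'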
print("\nAfter Replacement:\n", df)
Output:
Before Replacement:
City Population
0 New York 8419000
1 San Francisco 881549
2 Austin 978908
After Replacement:
City Population
0 New York Large
1 San Francisco 881549
2 Austin 978908
Below is a screenshot taken after the code was implemented in the Pycharm editor.
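Several conditions can be applied one after another with loc[]. As a rough sketch, the version below writes the labels into a separate column (the 'Size' column and the 900,000 threshold are assumptions made for illustration), which keeps 'Population' numeric:

```python
import pandas as pd

df = pd.DataFrame({
    "City": ["New York", "San Francisco", "Austin"],
    "Population": [8419000, 881549, 978908],
})

# Start with a default label, then overwrite the rows that match each condition.
df["Size"] = "Small"
df.loc[df["Population"] > 900_000, "Size"] = "Medium"
df.loc[df["Population"] > 1_000_000, "Size"] = "Large"

print(df)
```

The order of the assignments matters: later conditions overwrite earlier ones, so the most specific condition goes last.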
4. Replace multiple values Pandas using apply() with a custom function
The apply() method applies a custom function along a DataFrame axis (row-wise or column-wise) or to each value of a single column. This method is versatile and can handle complex replacement logic that cannot be easily expressed as a direct mapping or a simple condition.
This is the code to replace multiple values in Pandas using the apply() method:
import pandas as pd
df = pd.DataFrame({'State': ['California', 'New York', 'Texas', 'Florida']})
print("Before Replacement:\n", df)
def classify_state(state):
    if state in ['California', 'Texas']:
        return 'West'
    else:
        return 'East'
df['Region'] = df['State'].apply(classify_state)
print("\nAfter Replacement:\n", df)
Output:
Before Replacement:
State
0 California
1 New York
2 Texas
3 Florida
After Replacement:
State Region
0 California West
1 New York East
2 Texas West
3 Florida East
Following the implementation of the code in the Pycharm editor, a screenshot is provided below.
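The example above calls apply() on a single column. When the replacement depends on more than one column, apply() can also be run row by row with axis=1. Here is a small sketch; the label_row function and its labelling rule are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "State": ["California", "Montana", "Texas", "Alaska"],
    "Population": [39538223, 1084225, 29145505, 731545],
})

def label_row(row):
    # Hypothetical rule that combines two columns of the same row.
    size = "high" if row["Population"] > 10_000_000 else "low"
    return f"{row['State']} ({size})"

# axis=1 passes each row to the function as a Series.
df["Label"] = df.apply(label_row, axis=1)

print(df)
```

Row-wise apply() runs a Python function once per row, so on large DataFrames a vectorized approach is usually faster.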
5. Replace multiple values in column Pandas using np.where() function
This method uses NumPy's np.where() function for conditional replacements in Pandas DataFrames. It is particularly useful for vectorized conditional operations: it returns one value where the condition is true and another where it is false, in an efficient, concise manner.
Here is an example that replaces multiple values in Pandas using the np.where() function:
import pandas as pd
import numpy as np
df = pd.DataFrame({'State': ['California', 'Montana', 'Texas', 'Alaska'], 'Population': [39538223, 1084225, 29145505, 731545]})
print("Before Replacement:\n", df)
df['Population Size'] = np.where(df['Population'] > 10000000, 'High Population', 'Low Population')
print("\nAfter Replacement:\n", df)
Output:
Before Replacement:
State Population
0 California 39538223
1 Montana 1084225
2 Texas 29145505
3 Alaska 731545
After Replacement:
State Population Population Size
0 California 39538223 High Population
1 Montana 1084225 Low Population
2 Texas 29145505 High Population
3 Alaska 731545 Low Population
Upon executing the code in Pycharm, the resulting output is illustrated in the screenshot below.
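np.where() chooses between two outcomes. For three or more outcomes, one option is NumPy's np.select(), which takes a list of conditions and a matching list of choices. The thresholds and labels in this sketch are assumptions chosen for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "State": ["California", "Montana", "Texas", "Alaska"],
    "Population": [39538223, 1084225, 29145505, 731545],
})

# Conditions are checked in order; the first one that matches wins.
conditions = [
    df["Population"] > 30_000_000,
    df["Population"] > 1_000_000,
]
choices = ["High", "Medium"]

# Rows that match no condition get the default label.
df["Population Size"] = np.select(conditions, choices, default="Low")

print(df)
```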
6. Pandas replace multiple values with one
To replace multiple values with a single value in a Pandas DataFrame, we can again use the replace() method. When it is given a list of values and one replacement value, every value in the list is replaced by that single value. Here is the code:
import pandas as pd
df = pd.DataFrame({
'State': ['California', 'Texas', 'Florida', 'New York', 'Illinois'],
'Avg_Temperature': [72, 70, 88, 71, 65]
})
print("Before Replacement:\n", df)
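# Assign the result back; inplace=True on a selected column may not update df in recent Pandas versions.
df['Avg_Temperature'] = df['Avg_Temperature'].replace([70, 71, 72], 'Warm')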
print("\nAfter Replacement:\n", df)
Output:
Before Replacement:
State Avg_Temperature
0 California 72
1 Texas 70
2 Florida 88
3 New York 71
4 Illinois 65
After Replacement:
State Avg_Temperature
0 California Warm
1 Texas Warm
2 Florida 88
3 New York Warm
4 Illinois 65
The screenshot below showcases the output after the code has been implemented in the Pycharm editor.
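The same idea works for text data, for example to collapse several spellings into one canonical value. The sketch below uses hypothetical data and shows two ways to do it that should give the same result: replace() with a list, and isin() combined with loc[]:

```python
import pandas as pd

# Hypothetical data with inconsistent spellings of the same state.
df = pd.DataFrame({"State": ["CA", "Calif.", "California", "TX"]})
variants = ["CA", "Calif."]

# Option 1: replace every value in the list with one value.
via_replace = df["State"].replace(variants, "California")

# Option 2: select the matching rows with isin() and assign through loc[].
via_loc = df["State"].copy()
via_loc.loc[via_loc.isin(variants)] = "California"

print(via_replace)
print(via_loc)
```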
Conclusion
Pandas offers several ways to replace multiple values. df.replace() handles general replacements, map() handles single columns, loc[] handles conditional replacement, apply() handles custom functions, and np.where() handles vectorized conditions. Each one offers a different approach to the same common task, so the right choice depends on whether the replacement is a direct mapping, a condition, or more complex logic.
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working on Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, and Scikit-Learn for clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and other countries.