Pandas replace multiple values in Python [5 ways]

Do you want to replace multiple values in a DataFrame? In this Pandas article, I will explain how to replace multiple values in Python using different methods, with some demonstrative examples.

Pandas offers several ways to replace multiple values in Python: the df.replace() function for general replacements, map() for single-column lookups, loc[] for conditional replacements, apply() with a custom function, and np.where() for vectorized conditional logic.

Pandas replace multiple values in Python

There are five different methods to replace multiple values in Python Pandas:

  1. df.replace() function
  2. map() for single column
  3. Conditional replacement with the loc[] indexer
  4. Using apply() with a custom function
  5. Using np.where() function

Let’s see them one by one using some illustrative examples:

1. Replace values in multiple columns Pandas using df.replace() function

The df.replace() function in Pandas is straightforward and flexible: it can replace a single value or several values at once when we pass it a dictionary or a list. By default it matches whole cell values rather than substrings, and any value that is not listed (such as 'FL' in the example below) is left unchanged.

Here is the code to replace multiple values in Python using the df.replace() function:

import pandas as pd

df = pd.DataFrame({'State': ['NY', 'CA', 'FL', 'NY', 'TX']})
print("Before Replacement:\n", df)
df.replace({'NY': 'New York', 'CA': 'California', 'TX': 'Texas'}, inplace=True)
print("\nAfter Replacement:\n", df)

Output:

Before Replacement:
   State
0    NY
1    CA
2    FL
3    NY
4    TX

After Replacement:
         State
0    New York
1  California
2          FL
3    New York
4       Texas
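
The example above touches only one column. To replace values in several columns at once, df.replace() also accepts a nested dictionary keyed by column name. The following is a minimal sketch; the Status column and its codes are hypothetical sample data, not part of the original example.

```python
import pandas as pd

# Hypothetical sample data: two columns, each with its own abbreviations.
df = pd.DataFrame({
    'State': ['NY', 'CA', 'TX'],
    'Status': ['A', 'I', 'A'],
})

# Outer keys are column names; inner dictionaries map old values to new ones.
replacements = {
    'State': {'NY': 'New York', 'CA': 'California', 'TX': 'Texas'},
    'Status': {'A': 'Active', 'I': 'Inactive'},
}

# Without inplace=True, replace() returns a new DataFrame and leaves df as it was.
result = df.replace(replacements)
print(result)
```

Scoping each mapping to a column avoids accidental replacements when the same code (for example 'A') means different things in different columns.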

2. Pandas replace multiple values using map() for a single column

The map() method of a Pandas Series is tailored for transforming values in a single column. It looks up each existing value in the dictionary provided, making it ideal for simple, direct replacements in individual columns. Values that are missing from the dictionary become NaN.

This is how to use the map() function to replace multiple values in Python. Here the mapped values are stored in a new State column:

import pandas as pd

df = pd.DataFrame({'City': ['Los Angeles', 'New York', 'Houston', 'Miami']})
print("Before Replacement:\n", df)
df['State'] = df['City'].map({'Los Angeles': 'CA', 'New York': 'NY', 'Houston': 'TX', 'Miami': 'FL'})
print("\nAfter Replacement:\n", df)

Output:

Before Replacement:
           City
0  Los Angeles
1     New York
2      Houston
3        Miami

After Replacement:
           City State
0  Los Angeles    CA
1     New York    NY
2      Houston    TX
3        Miami    FL
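
Every city in the example above has an entry in the dictionary. When some values have no entry, map() returns NaN for them. The sketch below shows that behavior and one common way to keep the original value instead; 'Chicago' is a hypothetical unmapped value added for illustration.

```python
import pandas as pd

cities = pd.Series(['Los Angeles', 'New York', 'Chicago'], name='City')
state_by_city = {'Los Angeles': 'CA', 'New York': 'NY'}

# 'Chicago' has no entry in the dictionary, so map() yields NaN for it.
mapped = cities.map(state_by_city)
print(mapped)

# Fall back to the original value wherever the lookup found nothing.
kept = cities.map(state_by_city).fillna(cities)
print(kept)
```

If unmapped values should simply stay as they are, df.replace() with the same dictionary is an alternative, since it leaves unlisted values untouched.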

3. Pandas replace multiple values using the loc[] indexer

The loc[] indexer is used for condition-based replacements in Python, allowing us to specify conditions under which replacements should occur. It’s powerful for more complex, conditional data manipulation within DataFrame columns.

Here is how to use the loc[] indexer to replace multiple values in Python:

import pandas as pd

df = pd.DataFrame({'City': ['New York', 'San Francisco', 'Austin'], 'Population': [8419000, 881549, 978908]})
print("Before Replacement:\n", df)
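# Cast to object first so the column can hold both numbers and strings;
# recent pandas versions warn about (or reject) writing a string into an integer column.
df['Population'] = df['Population'].astype(object)
df.loc[df['Population'] > 1000000, 'Population'] = 'Large'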
print("\nAfter Replacement:\n", df)

Output:

Before Replacement:
             City  Population
0       New York     8419000
1  San Francisco      881549
2         Austin      978908

After Replacement:
             City Population
0       New York      Large
1  San Francisco     881549
2         Austin     978908
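
Mixing numbers and strings in one column makes later arithmetic awkward. A variation worth considering is to keep Population numeric and write the labels into a separate column, applying several loc[] conditions in turn. This is a sketch under assumed thresholds; the Size column and the 900,000 cutoff are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['New York', 'San Francisco', 'Austin'],
    'Population': [8419000, 881549, 978908],
})

# Start with a default label, then overwrite it where each condition holds.
# Later assignments win, so conditions go from least to most restrictive.
df['Size'] = 'Small'
df.loc[df['Population'] > 900_000, 'Size'] = 'Medium'
df.loc[df['Population'] > 1_000_000, 'Size'] = 'Large'
print(df)
```

The Population column keeps its integer dtype, so it can still be summed or sorted numerically.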

4. Replace multiple values Pandas using apply() with a custom function

The apply() method runs a custom function on each value of a Series, or along an axis (row-wise or column-wise) of a Pandas DataFrame. This method is versatile and can handle complex logic for replacements that cannot be easily defined by direct mapping or simple conditions.

This is the code to replace multiple values in Python using the apply() method on a single column:

import pandas as pd

df = pd.DataFrame({'State': ['California', 'New York', 'Texas', 'Florida']})
print("Before Replacement:\n", df)
def classify_state(state):
    if state in ['California', 'Texas']:
        return 'West'
    else:
        return 'East'

df['Region'] = df['State'].apply(classify_state)
print("\nAfter Replacement:\n", df)

Output:

Before Replacement:
         State
0  California
1    New York
2       Texas
3     Florida

After Replacement:
         State Region
0  California   West
1    New York   East
2       Texas   West
3     Florida   East
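
The example above calls apply() on one column, so the function receives one value at a time. When the replacement depends on more than one column, apply() can be called on the DataFrame with axis=1, in which case the function receives a whole row. A minimal sketch, with a hypothetical label_row function and labels:

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['California', 'Montana', 'Texas'],
    'Population': [39538223, 1084225, 29145505],
})

def label_row(row):
    # Hypothetical rule that needs two columns to decide the label.
    if row['State'] == 'Texas':
        return 'Home state'
    if row['Population'] > 10_000_000:
        return 'Large'
    return 'Small'

# axis=1 passes each row to the function as a Series.
df['Label'] = df.apply(label_row, axis=1)
print(df)
```

Row-wise apply() calls the Python function once per row, so on large DataFrames a vectorized approach such as loc[] or np.where() is usually faster.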

5. Replace multiple values in column Pandas using np.where() function

This method uses NumPy's np.where() function for conditional replacements in Pandas DataFrames. It takes a condition, the value to use where the condition is true, and the value to use where it is false, and it evaluates the whole column at once, which makes it efficient and concise.

Here is an example of replacing multiple values in Python using the np.where() function:

import pandas as pd
import numpy as np

df = pd.DataFrame({'State': ['California', 'Montana', 'Texas', 'Alaska'], 'Population': [39538223, 1084225, 29145505, 731545]})
print("Before Replacement:\n", df)
df['Population Size'] = np.where(df['Population'] > 10000000, 'High Population', 'Low Population')
print("\nAfter Replacement:\n", df)

Output:

Before Replacement:
         State  Population
0  California    39538223
1     Montana     1084225
2       Texas    29145505
3      Alaska      731545

After Replacement:
         State  Population  Population Size
0  California    39538223  High Population
1     Montana     1084225   Low Population
2       Texas    29145505  High Population
3      Alaska      731545   Low Population
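
A single np.where() call chooses between two outcomes. For three or more, the calls can be nested so that the "false" branch is itself another np.where(). The sketch below assumes a hypothetical middle tier at one million people.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'State': ['California', 'Montana', 'Texas', 'Alaska'],
    'Population': [39538223, 1084225, 29145505, 731545],
})

# The inner np.where() is evaluated only as the fallback of the outer one.
df['Population Size'] = np.where(
    df['Population'] > 10_000_000, 'High',
    np.where(df['Population'] > 1_000_000, 'Medium', 'Low'),
)
print(df)
```

Nesting stays readable for two or three levels; beyond that, a lookup table or a custom function tends to be easier to maintain.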

6. Pandas replace multiple values with one

To replace multiple values with a single value in a Pandas DataFrame, we can use the df.replace() method in Python. This method is versatile and can be used to replace a list of values with a single replacement value.

import pandas as pd

df = pd.DataFrame({
    'State': ['California', 'Texas', 'Florida', 'New York', 'Illinois'],
    'Avg_Temperature': [72, 70, 88, 71, 65]
})
print("Before Replacement:\n", df)
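# Assign the result back instead of using inplace=True on a selected column,
# which newer pandas versions may not propagate to the DataFrame.
df['Avg_Temperature'] = df['Avg_Temperature'].replace([70, 71, 72], 'Warm')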
print("\nAfter Replacement:\n", df)

Output:

Before Replacement:
         State  Avg_Temperature
0  California               72
1       Texas               70
2     Florida               88
3    New York               71
4    Illinois               65

After Replacement:
         State Avg_Temperature
0  California            Warm
1       Texas            Warm
2     Florida              88
3    New York            Warm
4    Illinois              65
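
The same list-plus-single-value form also works on a whole DataFrame, which is handy for collapsing several placeholder strings into one. A minimal sketch with hypothetical placeholder values:

```python
import pandas as pd

# Hypothetical data with two different "no data" placeholders.
df = pd.DataFrame({
    'State': ['California', 'N/A', 'Texas'],
    'Capital': ['Sacramento', 'Olympia', 'unknown'],
})

# Every value in the list is replaced by the single value 'Missing',
# in every column where it appears.
cleaned = df.replace(['N/A', 'unknown'], 'Missing')
print(cleaned)
```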

Conclusion

Pandas gives us several ways to replace multiple values in Python, and the right one depends on the task. df.replace() handles direct value-for-value swaps, map() suits lookups on a single column, loc[] and np.where() cover conditional replacements, and apply() with a custom function handles logic that is too complex for the other approaches.
