When working with multiple datasets in Python, I often need to combine information from different dataframes. One common task is adding a column from one dataframe to another.
This might seem simple, but there are several methods to accomplish this, each with its advantages.
In this article, I’ll walk you through six practical methods to add a column from another dataframe in Pandas. So let’s dive in.
Methods to Add a Column From Another DataFrame in Pandas
Now, I will explain each method for adding a column from another DataFrame in Pandas.
1: Use pandas.DataFrame.join() Method
The join() method in Pandas is one of my go-to approaches when I need to add columns from another dataframe. It’s particularly useful when you want to join dataframes based on their indices.
Let’s start with a practical example. Imagine we have two dataframes containing information about sales representatives and their performance:
import pandas as pd

# Sales team dataframe
sales_team = pd.DataFrame({
    'Rep_ID': [101, 102, 103, 104, 105],
    'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis', 'Emma Wilson', 'Robert Brown'],
    'Region': ['East', 'West', 'North', 'South', 'Central']
})

# Sales performance dataframe
sales_performance = pd.DataFrame({
    'Rep_ID': [101, 102, 103, 104, 105],
    'Monthly_Sales': [15000, 22000, 18000, 25000, 19500],
    'Targets_Met': [3, 4, 3, 5, 4]
})

# Set Rep_ID as index for both dataframes
sales_team.set_index('Rep_ID', inplace=True)
sales_performance.set_index('Rep_ID', inplace=True)

# Add Monthly_Sales column from sales_performance to sales_team
result = sales_team.join(sales_performance['Monthly_Sales'])
print(result)

Output:
Name Region Monthly_Sales
Rep_ID
101 John Smith East 15000
102 Sarah Johnson West 22000
103 Mike Davis North 18000
104 Emma Wilson South 25000
105 Robert Brown Central 19500
The join() method works beautifully when both dataframes share the same index. It’s clean and efficient for this specific scenario.
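When the indices only partially overlap, join() defaults to a left join and fills unmatched rows with NaN. Here’s a minimal sketch of that behavior; the Bonus column and the reduced rep list are made up for illustration:

```python
import pandas as pd

# Three reps, but only two have a (hypothetical) bonus record
sales_team = pd.DataFrame(
    {'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis']},
    index=pd.Index([101, 102, 103], name='Rep_ID'),
)
bonuses = pd.DataFrame(
    {'Bonus': [500, 750]},
    index=pd.Index([101, 103], name='Rep_ID'),
)

# join() is a left join on the index by default:
# rep 102 has no match, so its Bonus becomes NaN
result = sales_team.join(bonuses['Bonus'])
print(result)
```

If you instead want to drop unmatched rows, pass how='inner' to join().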
2: Use pandas.concat() Method
Pandas’ concat() function is extremely versatile for combining dataframes. When I need to add columns from one dataframe to another, especially when they share the same structure and index, concat() is often my choice.
Here’s how to use it:
import pandas as pd

# Continuing from the join() example, where Rep_ID is the index of both dataframes.
# Reset the indices so the rows line up positionally for concat
sales_team = sales_team.reset_index()
sales_performance = sales_performance.reset_index()

# Using concat to add the Monthly_Sales column
result = pd.concat([sales_team, sales_performance['Monthly_Sales']], axis=1)
print(result)

Output:
Rep_ID Name Region Monthly_Sales
0 101 John Smith East 15000
1 102 Sarah Johnson West 22000
2 103 Mike Davis North 18000
3 104 Emma Wilson South 25000
4 105 Robert Brown Central 19500
The concat() method combined our dataframes along axis=1 (columns). This works well when dataframes have the same length and order, which is common in many data analysis scenarios.
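One pitfall worth knowing: concat() aligns rows by index label, not by position. This small sketch (with made-up names and values) shows what happens when the indices differ, and how reset_index restores positional stacking:

```python
import pandas as pd

# Same length, but deliberately mismatched indices
names = pd.DataFrame({'Name': ['John', 'Sarah', 'Mike']}, index=[0, 1, 2])
sales = pd.Series([15000, 22000, 18000], name='Monthly_Sales', index=[1, 2, 3])

# concat aligns on index labels: rows 0 and 3 each end up with a NaN
misaligned = pd.concat([names, sales], axis=1)
print(misaligned)

# Resetting both indices first gives positional (row-by-row) alignment
aligned = pd.concat([names.reset_index(drop=True),
                     sales.reset_index(drop=True)], axis=1)
print(aligned)
```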
3: Use pandas.DataFrame.merge() Method
The merge() method is my preferred approach when I need more control over how dataframes are combined. It’s particularly useful for scenarios involving different types of joins (inner, outer, left, right).
Let’s see it in action:
import pandas as pd

# Create our sample dataframes again
sales_team = pd.DataFrame({
    'Rep_ID': [101, 102, 103, 104, 105],
    'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis', 'Emma Wilson', 'Robert Brown'],
    'Region': ['East', 'West', 'North', 'South', 'Central']
})
sales_performance = pd.DataFrame({
    'Rep_ID': [101, 102, 103, 104, 105],
    'Monthly_Sales': [15000, 22000, 18000, 25000, 19500],
    'Targets_Met': [3, 4, 3, 5, 4]
})

# Merge dataframes on Rep_ID to add the Monthly_Sales column
result = sales_team.merge(
    sales_performance[['Rep_ID', 'Monthly_Sales']],
    on='Rep_ID',
    how='left'
)
print(result)

Output:
Rep_ID Name Region Monthly_Sales
0 101 John Smith East 15000
1 102 Sarah Johnson West 22000
2 103 Mike Davis North 18000
3 104 Emma Wilson South 25000
4 105 Robert Brown Central 19500
The merge() method is incredibly flexible. In this example, I used a left join to ensure all records from sales_team are preserved. This is especially useful when dealing with datasets that might not have perfect one-to-one relationships.
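To see why the how parameter matters, here’s a sketch with a hypothetical new hire (Rep_ID 106) who has no performance record yet:

```python
import pandas as pd

# Hypothetical: rep 106 has joined the team but has no performance data
sales_team = pd.DataFrame({
    'Rep_ID': [101, 102, 106],
    'Name': ['John Smith', 'Sarah Johnson', 'New Hire'],
})
sales_performance = pd.DataFrame({
    'Rep_ID': [101, 102],
    'Monthly_Sales': [15000, 22000],
})

# how='left' keeps every rep; the unmatched row gets NaN
result = sales_team.merge(sales_performance, on='Rep_ID', how='left')
print(result)

# how='inner' would silently drop rep 106 instead
inner = sales_team.merge(sales_performance, on='Rep_ID', how='inner')
print(len(inner))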
4: Use pandas.Series.map() Method
The map() method is elegant when you need to map values from one dataframe to another based on a key. I frequently use this approach for simple column additions.
Here’s how it works:
import pandas as pd

# Create our sample dataframes
sales_team = pd.DataFrame({
    'Rep_ID': [101, 102, 103, 104, 105],
    'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis', 'Emma Wilson', 'Robert Brown'],
    'Region': ['East', 'West', 'North', 'South', 'Central']
})
sales_performance = pd.DataFrame({
    'Rep_ID': [101, 102, 103, 104, 105],
    'Monthly_Sales': [15000, 22000, 18000, 25000, 19500],
    'Targets_Met': [3, 4, 3, 5, 4]
})

# Create a mapping dictionary from sales_performance
sales_map = dict(zip(sales_performance.Rep_ID, sales_performance.Monthly_Sales))

# Add Monthly_Sales to sales_team using map
sales_team['Monthly_Sales'] = sales_team['Rep_ID'].map(sales_map)
print(sales_team)

Output:
Rep_ID Name Region Monthly_Sales
0 101 John Smith East 15000
1 102 Sarah Johnson West 22000
2 103 Mike Davis North 18000
3 104 Emma Wilson South 25000
4 105 Robert Brown Central 19500

The map() method is straightforward and efficient, especially for simple one-to-one mappings. It’s also very readable, making code maintenance easier.
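A variation I sometimes use: map() also accepts a Series, so you can skip building the dictionary and index sales_performance by Rep_ID directly:

```python
import pandas as pd

sales_team = pd.DataFrame({
    'Rep_ID': [101, 102, 103],
    'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis'],
})
sales_performance = pd.DataFrame({
    'Rep_ID': [101, 102, 103],
    'Monthly_Sales': [15000, 22000, 18000],
})

# map() with a Series: Rep_ID values are looked up in the Series index,
# so no intermediate dict is needed
sales_team['Monthly_Sales'] = sales_team['Rep_ID'].map(
    sales_performance.set_index('Rep_ID')['Monthly_Sales']
)
print(sales_team)
```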
5: Use DataFrame.assign() Method
The assign() method provides a clean, functional approach to adding columns. I often use it when I want to maintain immutability in my data processing pipeline.
Here’s how to use assign():
import pandas as pd

# Create our sample dataframes
sales_team = pd.DataFrame({
    'Rep_ID': [101, 102, 103, 104, 105],
    'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis', 'Emma Wilson', 'Robert Brown'],
    'Region': ['East', 'West', 'North', 'South', 'Central']
})
sales_performance = pd.DataFrame({
    'Rep_ID': [101, 102, 103, 104, 105],
    'Monthly_Sales': [15000, 22000, 18000, 25000, 19500],
    'Targets_Met': [3, 4, 3, 5, 4]
})

# Create a mapping dictionary
sales_map = dict(zip(sales_performance.Rep_ID, sales_performance.Monthly_Sales))

# Use assign to add the column (returns a new dataframe)
result = sales_team.assign(Monthly_Sales=lambda x: x.Rep_ID.map(sales_map))
print(result)

Output:
Rep_ID Name Region Monthly_Sales
0 101 John Smith East 15000
1 102 Sarah Johnson West 22000
2 103 Mike Davis North 18000
3 104 Emma Wilson South 25000
4 105 Robert Brown Central 19500

The assign() method doesn’t modify the original dataframe, which is great for maintaining data integrity throughout your analysis process.
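assign() can also add several columns in one call, and later keyword arguments may reference columns created by earlier ones. A sketch with a hypothetical derived column, Sales_Per_Target:

```python
import pandas as pd

sales_team = pd.DataFrame({
    'Rep_ID': [101, 102, 103],
    'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis'],
})
sales_performance = pd.DataFrame({
    'Rep_ID': [101, 102, 103],
    'Monthly_Sales': [15000, 22000, 18000],
    'Targets_Met': [3, 4, 3],
})

perf = sales_performance.set_index('Rep_ID')

# Multiple columns in one assign(); Sales_Per_Target uses the
# Monthly_Sales and Targets_Met columns created just before it
result = sales_team.assign(
    Monthly_Sales=lambda df: df['Rep_ID'].map(perf['Monthly_Sales']),
    Targets_Met=lambda df: df['Rep_ID'].map(perf['Targets_Met']),
    Sales_Per_Target=lambda df: df['Monthly_Sales'] / df['Targets_Met'],
)
print(result)
```

Because assign() returns a new dataframe, sales_team itself is left untouched.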
6: Use DataFrame.loc[] Method
The loc[] method gives you direct, intuitive access to add columns. I find it particularly useful when I need fine-grained control over how data is added.
Let’s see it in action:
import pandas as pd

# Create our sample dataframes
sales_team = pd.DataFrame({
    'Rep_ID': [101, 102, 103, 104, 105],
    'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis', 'Emma Wilson', 'Robert Brown'],
    'Region': ['East', 'West', 'North', 'South', 'Central']
})
sales_performance = pd.DataFrame({
    'Rep_ID': [101, 102, 103, 104, 105],
    'Monthly_Sales': [15000, 22000, 18000, 25000, 19500],
    'Targets_Met': [3, 4, 3, 5, 4]
})

# Set Rep_ID as index for both dataframes for easier alignment
sales_team.set_index('Rep_ID', inplace=True)
sales_performance.set_index('Rep_ID', inplace=True)

# Add Monthly_Sales column using loc; values align on the shared index
sales_team.loc[:, 'Monthly_Sales'] = sales_performance['Monthly_Sales']
print(sales_team.reset_index())  # Reset index for display

Output:
Rep_ID Name Region Monthly_Sales
0 101 John Smith East 15000
1 102 Sarah Johnson West 22000
2 103 Mike Davis North 18000
3 104 Emma Wilson South 25000
4 105 Robert Brown Central 19500

The loc[] method provides a clear, direct way to add columns, especially when working with indexed dataframes.
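That fine-grained control shows up when you only want to fill the column for certain rows. Here’s a sketch using a boolean mask; the East/West filter is purely illustrative:

```python
import pandas as pd

sales_team = pd.DataFrame({
    'Rep_ID': [101, 102, 103],
    'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis'],
    'Region': ['East', 'West', 'North'],
}).set_index('Rep_ID')
sales_performance = pd.DataFrame({
    'Rep_ID': [101, 102, 103],
    'Monthly_Sales': [15000, 22000, 18000],
}).set_index('Rep_ID')

# Fill the new column only for East and West reps;
# the excluded row (North) is left as NaN
mask = sales_team['Region'].isin(['East', 'West'])
sales_team.loc[mask, 'Monthly_Sales'] = sales_performance['Monthly_Sales']
print(sales_team)
```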
Real-World Example: Combine Customer and Order Data
Let’s look at a more realistic scenario. Imagine we’re analyzing an e-commerce dataset with customer information in one dataframe and their order history in another:
import pandas as pd

# Customer information dataframe
customers = pd.DataFrame({
    'Customer_ID': [1001, 1002, 1003, 1004, 1005],
    'Name': ['Alice Cooper', 'Bob Dylan', 'Charlie Parker', 'Diana Ross', 'Elton John'],
    'State': ['California', 'New York', 'Texas', 'Florida', 'Illinois'],
    'Signup_Date': ['2022-01-15', '2022-02-20', '2022-01-30', '2022-03-05', '2022-02-10']
})

# Order history dataframe
orders = pd.DataFrame({
    'Customer_ID': [1001, 1002, 1003, 1004, 1005],
    'Total_Orders': [12, 8, 15, 6, 10],
    'Avg_Order_Value': [85.50, 120.75, 65.30, 210.15, 95.80],
    'Last_Order_Date': ['2023-04-10', '2023-04-05', '2023-04-12', '2023-03-28', '2023-04-08']
})

# Add the order metrics to our customer data using merge
enhanced_customers = customers.merge(
    orders[['Customer_ID', 'Avg_Order_Value', 'Total_Orders']],
    on='Customer_ID',
    how='left'
)
print(enhanced_customers)

Output:
Customer_ID Name State Signup_Date Avg_Order_Value Total_Orders
0 1001 Alice Cooper California 2022-01-15 85.50 12
1 1002 Bob Dylan New York 2022-02-20 120.75 8
2 1003 Charlie Parker Texas 2022-01-30 65.30 15
3 1004 Diana Ross Florida 2022-03-05 210.15 6
4 1005 Elton John Illinois 2022-02-10 95.80 10

Using a left join ensures all customers are included. Here every customer has a matching order record, so nothing is missing; if some customers had no orders, the merged columns would contain NaN, which you could then handle with fillna().
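For completeness, here’s a sketch of the NaN-handling case, with a hypothetical customer (1006) who signed up but has never ordered:

```python
import pandas as pd

# Hypothetical: customer 1006 has no order history
customers = pd.DataFrame({
    'Customer_ID': [1001, 1006],
    'Name': ['Alice Cooper', 'New Customer'],
})
orders = pd.DataFrame({
    'Customer_ID': [1001],
    'Total_Orders': [12],
    'Avg_Order_Value': [85.50],
})

enhanced = customers.merge(orders, on='Customer_ID', how='left')

# The customer with no orders gets NaN; treat that as zero orders.
# Note the NaN also made Total_Orders float, so cast back to int.
enhanced['Total_Orders'] = enhanced['Total_Orders'].fillna(0).astype(int)
enhanced['Avg_Order_Value'] = enhanced['Avg_Order_Value'].fillna(0.0)
print(enhanced)
```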
Performance Considerations of All the Methods
When working with large datasets, performance becomes crucial. Here’s a quick comparison of methods based on my experience:
- For small to medium datasets:
  - merge() and join() are generally the most intuitive and flexible
  - map() is excellent for simple key-based lookups
- For large datasets:
  - merge() with only the columns you need (rather than entire dataframes) helps reduce memory usage
  - Prefer direct .loc assignment over chained indexing to avoid SettingWithCopy surprises
- For very large datasets:
  - Consider libraries like Dask or Vaex for out-of-memory processing
  - Chunking operations can help manage memory constraints
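Rules of thumb aside, the most reliable way to choose is to time the candidates on data shaped like yours. A quick sketch using synthetic 200,000-row dataframes (sizes and column names are arbitrary):

```python
import time

import numpy as np
import pandas as pd

# Synthetic data: 200,000 rows with a unique integer key
n = 200_000
left = pd.DataFrame({'key': np.arange(n), 'val': np.random.rand(n)})
right = pd.DataFrame({'key': np.arange(n), 'extra': np.random.rand(n)})

# Approach 1: merge on the key column
t0 = time.perf_counter()
merged = left.merge(right, on='key', how='left')
t_merge = time.perf_counter() - t0

# Approach 2: map against a key-indexed Series
t0 = time.perf_counter()
lookup = right.set_index('key')['extra']
mapped = left.assign(extra=left['key'].map(lookup))
t_map = time.perf_counter() - t0

print(f"merge: {t_merge:.4f}s, map: {t_map:.4f}s")
# Sanity check: both approaches produce the same column
print(merged['extra'].equals(mapped['extra']))
```

Exact timings vary by machine and pandas version, which is exactly why measuring beats guessing.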
Remember that clean, well-structured data is the foundation of good analysis. Taking the time to properly combine your dataframes will pay dividends when you move to the modeling and visualization stages of your data science workflow.
By mastering these techniques, you’ll be able to integrate data from multiple sources efficiently, creating rich, unified datasets that reveal deeper insights.
Related tutorials:
- Pandas GroupBy Without Aggregation Function in Python
- Pandas Merge Fill NAN with 0 in Python
- Pandas Find Duplicates in Python

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working on Python, machine learning, and artificial intelligence for the last five years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, and Scikit-Learn, serving clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and beyond. Check out my profile.