How to concatenate two Dataframes in Python

In this Python tutorial, we will discuss several methods to concatenate two Dataframes in Python, with an example for each.

Recently, I was working on a machine learning project that takes two DataFrames as input and needs them combined into one. After some research, I found several ways to concatenate two Dataframes in Python with pandas.

Here we will learn

  • How to concatenate two Dataframes in Python using concat()
  • How to concatenate two Dataframes in Python using dataframe.append()
  • How to concatenate two Dataframes in Python using dataframe.merge()

Concatenate two Dataframes in Python

With pandas, there are three commonly used methods for combining two DataFrames: concat(), DataFrame.append(), and merge().

How to concatenate two Dataframes in Python using concat() function

  • In this section, we will discuss how to concatenate two Dataframes in Python using the concat() function.
  • To combine two or more pandas DataFrames along rows or columns, use pandas.concat(). When you concat() two DataFrames along rows, it returns a new DataFrame with all the rows from both; in other words, it appends one DataFrame to the other.
  • The function concatenates along one axis (rows or columns) while applying optional set logic, union or intersection, to the labels on the other axis.
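
The difference between concatenating along rows and along columns can be sketched as follows. This is a minimal illustration; the frames `first` and `second` and their contents are made up for the example and are not part of the tutorial's data.

```python
import pandas as pd

# Two small illustrative frames (hypothetical data).
first = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]})
second = pd.DataFrame({"name": ["c", "d"], "score": [3, 4]})

# axis=0 (the default) stacks the rows of `second` under those of `first`.
by_rows = pd.concat([first, second])

# axis=1 places the frames side by side, aligning rows on the index.
by_columns = pd.concat([first, second], axis=1)

print(by_rows.shape)     # (4, 2)
print(by_columns.shape)  # (2, 4)
```

Note that `by_rows` keeps the original index labels (0, 1, 0, 1), which is why ignore_index is often used together with row-wise concatenation.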

Syntax:

Let’s have a look at the syntax and understand the working of pandas.concat() in Python.

pandas.concat(
    objs,
    axis=0,
    join='outer',
    ignore_index=False,
    keys=None,
    levels=None,
    names=None,
    verify_integrity=False,
    sort=False,
    copy=True
)
  • It consists of a few parameters
    • objs: a sequence (or mapping) of Series or DataFrame objects to concatenate.
    • axis: the axis to concatenate along. By default it is 0 (rows); use 1 for columns.
    • join: how to handle the labels on the other axis. By default it is 'outer' (union of the labels); 'inner' keeps only the labels shared by all objects.
    • ignore_index: By default it is False. If it is True, the source objects' index values are discarded and the result is labeled 0, 1, ..., n - 1.
    • keys: a sequence of labels, one per input object, used as the outermost level of a hierarchical index. It is useful for identifying the source items in the output.
    • levels: a list of the specific levels to use for constructing the MultiIndex; otherwise they are inferred from keys.
    • names: names for the levels of the resulting hierarchical index.
    • verify_integrity: By default it is False. If it is True, a ValueError is raised when the newly concatenated axis contains duplicates.
    • sort: whether to sort the non-concatenation axis when it is not already aligned. By default it is False in recent pandas versions.
    • copy: if it is False, data is not copied unnecessarily.
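
A few of these parameters are easier to understand in action. The sketch below uses two hypothetical frames, `left` and `right` (names and data are assumptions for illustration), to show ignore_index, join, keys, and verify_integrity.

```python
import pandas as pd

# Hypothetical frames that share only the "key" column.
left = pd.DataFrame({"key": ["x", "y"], "a": [1, 2]})
right = pd.DataFrame({"key": ["z"], "b": [3]})

# ignore_index=True discards the source indexes and renumbers from 0.
renumbered = pd.concat([left, right], ignore_index=True)

# join='inner' keeps only the columns present in every input.
shared_only = pd.concat([left, right], join="inner", ignore_index=True)

# keys labels each input block, producing a two-level MultiIndex.
labelled = pd.concat([left, right], keys=["left", "right"])

print(list(renumbered.index))     # [0, 1, 2]
print(list(shared_only.columns))  # ['key']
print(labelled.index.nlevels)     # 2

# verify_integrity=True raises because both frames contain index label 0.
try:
    pd.concat([left, right], verify_integrity=True)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
print(duplicate_detected)         # True
```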

Example:

Here we will take an example and check how to concatenate two Dataframes in Python using concat() function.

Source Code:

import pandas as pd

# First DataFrame: country names with a value
data_frame = pd.DataFrame(
    [['U.S.A', 745],
     ['Australia', 664],
     ['Germany', 178]],
    columns=['Country_name', 'values'])

# Second DataFrame: city names with a value
data_frame_2 = pd.DataFrame(
    [['New York', 342],
     ['Sydney', 145],
     ['Berlin', 980]],
    columns=['Cities_name', 'values'])

new_list = [data_frame, data_frame_2]

# Concatenate the DataFrames row-wise, keeping the column order as is
result = pd.concat(new_list, sort=False)
print(result)

In the code above, we first imported the pandas library and created the first DataFrame with pd.DataFrame(), assigning each country name along with a value.

Next, we created another DataFrame with pd.DataFrame(), this time holding city names and values. After that, we passed the list of both DataFrames to pd.concat() with sort=False. Because the two DataFrames share only the 'values' column, the result has three columns, and cells with no source data are filled with NaN.

Here is the screenshot of the output of the code above.

[Screenshot: How to concatenate two Dataframes in Python using concat function]

This is how to concatenate two Dataframes in Python using concat().


How to concatenate two Dataframes in Python using dataframe.append()

  • Now let us discuss how to concatenate two Dataframes in Python using dataframe.append().
  • The pandas append() method adds the rows of another DataFrame to the end of the calling DataFrame and returns a new DataFrame object. The original DataFrame is not modified. Columns that exist in only one of the two DataFrames are added to the result, and the missing cells are filled with NaN.
  • Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0; in current versions, use pandas.concat() instead.

Syntax:

Let’s have a look at the Syntax and understand the working of dataframe.append() in Python.

DataFrame.append(
    other,
    ignore_index=False,
    verify_integrity=False,
    sort=False
)
  • It consists of a few parameters
    • other: the data to append (a DataFrame, a Series, a dict, or a list of these).
    • ignore_index: By default it is False. If it is True, the index labels are not used and the result is labeled 0, 1, ..., n - 1.
    • verify_integrity: By default it is False. If it is True, a ValueError is raised when the resulting index contains duplicates.
    • sort: whether to sort the columns when the columns of self and other are not aligned.
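
Since DataFrame.append() is no longer available from pandas 2.0 onward, the same result can be obtained with pandas.concat(). The following is a minimal sketch of that replacement; the frame names `bikes` and `cars` are chosen here for illustration.

```python
import pandas as pd

bikes = pd.DataFrame(
    [["BMW", 4562], ["Kawasaki", 4509]],
    columns=["Bikes_name", "values"])

cars = pd.DataFrame(
    [["Volkswagen", 4678]],
    columns=["Cars_name", "values"])

# Equivalent of bikes.append(cars, ignore_index=True) on older pandas:
# rows of `cars` are added after the rows of `bikes`, with a fresh index.
result = pd.concat([bikes, cars], ignore_index=True)
print(result)
```

Columns that exist in only one frame ("Bikes_name" and "Cars_name" here) appear in the result with NaN in the rows that have no value for them.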

Example:

Let’s take an example and check how to concatenate two Dataframes in Python using dataframe.append().

Source Code:

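import pandas as pd

# Note: DataFrame.append() requires pandas older than 2.0

# First DataFrame: bike names with a value
data_frame = pd.DataFrame(
    [['BMW', 4562],
     ['Harley Davidson', 8945],
     ['Kawasaki', 4509]],
    columns=['Bikes_name', 'values'])

# Second DataFrame: car names with a value
data_frame_2 = pd.DataFrame(
    [['Volkswagen', 4678],
     ['Chevrolet Silverado', 3457],
     ['Ford F-150', 1567]],
    columns=['Cars_name', 'values'])

# Append the rows of data_frame_2 and renumber the index from 0
result = data_frame.append(data_frame_2, ignore_index=True)
print(result)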

In the code above, we first imported the pandas library and then used pd.DataFrame() to create two DataFrames, each holding string and integer elements along with the column names. We then appended the second DataFrame to the first with ignore_index=True, so the result is labeled with a new index starting from 0.

Here is the screenshot of the output of the code above.

[Screenshot: How to concatenate two Dataframes in Python using dataframe.append]

This is how to concatenate two Dataframes in Python using dataframe.append().


How to concatenate two Dataframes in Python using dataframe.merge()

  • In this section, we will discuss how to concatenate two Dataframes in Python using dataframe.merge().
  • The pandas merge() function combines two datasets into one by aligning their rows according to shared attributes or columns.
  • It serves as the entry point for all common database-style join operations between DataFrame objects.

Syntax:

Here is the Syntax of the dataframe.merge() function in Python

pd.merge(
    left,
    right,
    how='inner',
    on=None,
    left_on=None,
    right_on=None,
    left_index=False,
    right_index=False,
    sort=False
)
  • It consists of a few parameters
    • left: the first (left) DataFrame to merge.
    • right: the second (right) DataFrame to merge with.
    • how: By default it is 'inner'. This parameter defines the type of merge: 'left', 'right', 'outer', and 'inner' correspond to SQL's left outer join, right outer join, full outer join, and inner join.
    • on: names of the columns or index levels to join on. They must be present in both DataFrames. If on is None and the merge is not on indexes, the intersection of the columns of both DataFrames is used.
    • left_on: names of the left DataFrame columns or index levels to join on.
    • right_on: names of the right DataFrame columns or index levels to join on. By default it is None.
    • left_index: By default it is False. If it is True, the index of the left DataFrame is used as the join key.
    • right_index: By default it is False. If it is True, the index of the right DataFrame is used as the join key.
    • sort: By default it is False. If it is True, the join keys are sorted lexicographically in the result.
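
To see how the how, left_on, and right_on parameters interact, here is a small sketch. The frames `employees` and `salaries` and their values are hypothetical and only serve to illustrate the join types.

```python
import pandas as pd

# Hypothetical data: the key column has a different name in each frame.
employees = pd.DataFrame({
    "emp_id": [1, 2, 3],
    "emp_name": ["John", "Potter", "James"]})
salaries = pd.DataFrame({
    "id": [2, 3, 4],
    "salary": [100, 200, 300]})

# Inner join (default): only keys present in both frames (2 and 3).
inner = pd.merge(employees, salaries, left_on="emp_id", right_on="id")

# Left join: every row of `employees`, NaN where no salary matches.
left_join = pd.merge(employees, salaries, how="left",
                     left_on="emp_id", right_on="id")

# Outer join: every key from either frame (1, 2, 3 and 4).
outer = pd.merge(employees, salaries, how="outer",
                 left_on="emp_id", right_on="id")

print(len(inner))      # 2
print(len(left_join))  # 3
print(len(outer))      # 4
```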

Example:

Let’s take an example and check how to concatenate two Dataframes in Python using dataframe.merge().

Source Code:

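import pandas as pd

# Left DataFrame; note that id 5 appears twice
left = pd.DataFrame({
    'id': [5, 23, 4, 5, 67],
    'emp_id': [782, 785, 542, 908, 156],
    'emp_name': ['John', 'Potter', 'Michael', 'James', 'George']})

# Right DataFrame with the same id values
right = pd.DataFrame({
    'id': [5, 23, 4, 5, 67],
    'emp_id': [856, 434, 290, 167, 894],
    'emp_name': ['Robert', 'Richard', 'Thomas', 'Daniel', 'Mark']})

# Inner join on 'id'; overlapping columns get the suffixes _x and _y
print(pd.merge(left, right, on='id'))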

Here is the screenshot of the output of the code above.

[Screenshot: How to concatenate two Dataframes in Python using dataframe.merge]

This is how to concatenate two Dataframes in Python using dataframe.merge().
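
One thing worth knowing about the merge example is that duplicate key values multiply rows: every matching row on the left is paired with every matching row on the right. The sketch below illustrates this on a smaller, hypothetical dataset and also uses the suffixes parameter of merge() to give the overlapping columns more readable names.

```python
import pandas as pd

# Hypothetical frames in which id 5 occurs twice on each side.
left = pd.DataFrame({
    "id": [5, 5, 4],
    "emp_name": ["John", "James", "Michael"]})
right = pd.DataFrame({
    "id": [5, 5, 4],
    "emp_name": ["Robert", "Daniel", "Thomas"]})

# suffixes replaces the default "_x" / "_y" on overlapping column names.
merged = pd.merge(left, right, on="id", suffixes=("_left", "_right"))

# id 5 yields 2 x 2 = 4 rows, id 4 yields 1 row.
print(len(merged))           # 5
print(list(merged.columns))  # ['id', 'emp_name_left', 'emp_name_right']
```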


In this article, we discussed several methods to concatenate two Dataframes in Python and covered the following topics.

  • How to concatenate two Dataframes in Python using concat()
  • How to concatenate two Dataframes in Python using dataframe.append()
  • How to concatenate two Dataframes in Python using dataframe.merge()