How to Add a Column to a DataFrame in Python Pandas

In this Python Pandas tutorial, we will learn how to add a column to a dataframe. We will work through the following examples:

  • Add a Column to a DataFrame in Python Pandas
  • Add a column to dataframe pandas with a default value
  • Add a column to dataframe pandas from the list
  • Add a column to dataframe pandas with an index
  • Add a column to the dataframe pandas ignore the index
  • Add a column to dataframe pandas based on the condition
  • Add a column to dataframe pandas from the numpy array
  • Add a column from another dataframe pandas
  • Add a column at the beginning of the dataframe pandas
  • Add a Column to a DataFrame in Python With the Same Value
  • Add a Column Name to a DataFrame Pandas
  • Add a Column to a Pandas DataFrame Based on an if-else Condition
  • Add a Column to a DataFrame From Another DataFrame Pandas

All the datasets used are either self-created or downloaded from Kaggle.

Add a Column to a DataFrame in Python Pandas

Python is a popular programming language created by Dutch programmer Guido van Rossum. Pandas is a Python data analysis library that is used to read, clean, analyze, and export datasets.

Popular companies in the United States like Amazon, Tesla, Google, and Microsoft use machine learning with Python to understand their data and build products.

Want to learn how to add a column to a dataframe in Python pandas? There are three popular ways of doing it:

  1. Assignment operator
  2. assign()
  3. insert()

Now we know the ways to add a column to a dataframe in python pandas. Let’s explore each one of them in detail.

Add a column to a dataframe in python pandas using an Assignment operator

The easiest way to add a column to a dataframe in Python pandas is the assignment operator. Write the name of the dataframe followed by the new column name inside square brackets, then the equals sign (=), and then the value for the column.

If you are adding a column to a dataframe that already has some data, keep the following in mind to avoid errors:

  1. A single (scalar) value is repeated for every existing row, so it works as a default value for the new column.
  2. When passing a list of values for the new column, make sure it has as many items as the dataframe has rows; otherwise pandas raises a ValueError:
ValueError: Length of values does not match length of index
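The two rules above can be tried on a tiny dataframe. This is only a sketch: demo_df and its contents are made up for illustration and are not part of the carwash dataset.

```python
import pandas as pd

# Hypothetical three-row dataframe used only for this illustration
demo_df = pd.DataFrame({"Company": ["A", "B", "C"]})

# Rule 1: a scalar is repeated for every existing row
demo_df["Country"] = "United States"

# Rule 2: a list must contain exactly one value per row
try:
    demo_df["Gender"] = ["Male", "Female"]  # only 2 values for 3 rows
    length_error = None
except ValueError as err:
    length_error = str(err)

print(demo_df)
print(length_error)
```

Catching the error here is only for demonstration; in practice the fix is to supply a list of the right length or a single default value.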

The image shows the right way of adding a new column. Gender is the new column here.


Syntax: – The syntax below shows how to add a single column and multiple columns to a dataframe in Python pandas.

# add a single column
dataframe[new_column] = 'Value'

# add multiple columns (note the double brackets: a list of column names)
dataframe[[new_column0, new_column1, new_column2]] = [val1, val2, val3]

Parameters description: –

  • dataframe – the pandas dataframe, could be any name.
  • new_column – the new column, could be any name
  • value – a value of any data type (integer, string, etc.)

Now we have understood how to add a column to a dataframe in Python pandas using the assignment operator (=). Below I have applied this knowledge in an example.

Example: – In the below example, I am using a dataset of popular carwash companies in the USA. The company wants to add new columns to the dataframe. These new columns are: –

  • Gender
  • Street
  • State
  • Country
  • Zip code

The below image shows the current snap of the python pandas dataframe.

Carwash dataset of USA company

I have added a single column, Gender, to the dataframe and set its default value to 'Male'. Alternative default values could be missing values (NaN), None, empty strings (''), or a list of values with the same length as the index.

# add a Gender column to the dataframe
carwash_df['Gender'] = 'Male'

This time I have added multiple columns to the dataframe and assigned default values as follows:

  • Street – 7500 Sawmill Pkwy
  • State – Ohio
  • Country – United States
  • Zip code – 43065

Every row of these columns will have the same values. Alternatively, I could provide empty values or lists of values that match the length of the index.

# add Multiple columns to the dataframe
carwash_df[['Street', 'State', 'Country', 'Zip code']] = ['7500 Sawmill Pkwy', 'Ohio', 'United States', 43065]

Add a column to a dataframe in python pandas using the assign() method

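The assign() method in Python pandas returns a new dataframe with the new column(s) added; the values can be constants, lists, or values derived from existing column(s). The original dataframe is left unchanged, so store the result in a variable to keep it. It is also useful when the requirement is to add a column from one dataframe to another.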

Syntax: – Here is the syntax to add a column to a dataframe in python pandas using the assign() method.

dataframe.assign(**kwargs)

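Here, each keyword in **kwargs is a new column name, and its value is the data for that column (a single value, a list-like, or a callable that receives the dataframe). Any number of columns can be passed.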
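As a small sketch of the syntax above, the example below passes one constant and one callable to assign(). The dataframe sales_df and its column names are assumptions made up for this illustration.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for this sketch
sales_df = pd.DataFrame({"Price": [10.0, 20.0], "Quantity": [3, 5]})

# Each keyword becomes a column name; the value can be a scalar,
# a list-like, or a callable that receives the dataframe
result_df = sales_df.assign(
    Currency="USD",
    Total=lambda d: d["Price"] * d["Quantity"],
)

print(result_df)
print(sales_df.columns.tolist())  # the original dataframe is unchanged
```

Because assign() returns a new dataframe, it chains well with other methods, but the result has to be stored if it is needed later.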

Add a column to a dataframe in python pandas using the insert() method

The insert() method in python pandas allows adding columns to a dataframe at a specific index or position.

Syntax: – This is the syntax of the insert() method in Python pandas, which is used to add a column to a dataframe.

dataframe.insert(
    loc,                     # int: position of the new column
    column,                  # name of the new column
    value,                   # single value or list of values
    allow_duplicates=False,  # True or False
)

Parameters description:

  • loc – accepts integer value, and defines the position of the column in the dataframe.
  • column – column name that you want to add
  • value – single value or list of values can be passed
  • allow_duplicates – if set to False, pandas raises an error instead of creating the column when a column with that name already exists in the dataframe.
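The parameters above can be seen in action in the short sketch below. The dataframe branch_df and its values are hypothetical and only stand in for real data.

```python
import pandas as pd

# Hypothetical two-row dataframe for illustration
branch_df = pd.DataFrame({"Company": ["A", "B"], "City": ["Powell", "Dublin"]})

# insert() modifies the dataframe in place and returns None
branch_df.insert(loc=1, column="Branch Code", value=[1000, 1001])

# Inserting a name that already exists fails unless allow_duplicates=True
try:
    branch_df.insert(loc=0, column="City", value="n/a")
    duplicate_error = None
except ValueError as err:
    duplicate_error = str(err)

print(branch_df.columns.tolist())
print(duplicate_error)
```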

Add a column to dataframe pandas with a default value

A default value fills every row of the new column automatically. This is helpful in the following scenarios:

  • You want to add a new column but do not yet have enough values to match the length of the index of the existing dataframe.
  • For boolean columns, setting True (or False) as the default completes half of the work, because only the rows that differ need to be updated.
  • It is often better to have a value than a missing value. A default value can ease the process of data cleaning.

I explained how to add columns in python pandas in the previous section. We will use that knowledge here to add a column to dataframe pandas with a default value.


Example: – In this example, I have created a new column, Country, in the carwash dataframe and assigned 'United States' as its default value.

carwash_df['Country'] = 'United States'

All the rows in the carwash dataframe will automatically be filled with the same value, United States. The image below shows the output after adding the column to the dataframe.


Default values are not limited to strings; you can set a numeric default value as well. In the example below, I have added Latitude and Longitude columns to the pandas dataframe with default values.

Please note that this also demonstrates adding multiple columns to the dataframe at once, with default values of a numeric (float) data type.

carwash_df[['Latitude', 'Longitude']] = [40.194082, -83.097631]

This example is not realistic, as the branches of the carwash company are at different locations in the United States and abroad, but it is adequate to explain how to add a column to a pandas dataframe with a default value.


With this, I have explained how to add a column to dataframe pandas with a default value.

Add a column to dataframe pandas from the list

Pandas is the data analysis library that provides a wide variety of actions. In this section, we will show you how to add a column to dataframe pandas from the list.

A list in Python is an ordered collection of items. It can be homogeneous (all items of the same data type) or heterogeneous (items of different data types).

Most of the answers on the internet about adding a column to a pandas dataframe from a list create a list of values and then assign that list to the new column name, as shown below.

# list of values
USA_states_list = ['Alabama', 'Alaska', 'Arizona', 'Arkansas','Connecticut', 'Colorado']

# create new column with the name states
df['States'] = USA_states_list

This way, a new column, States, is added with the values Alabama, Alaska, Arizona, Arkansas, Connecticut, and Colorado. This is correct, but it is only one reading of the requirement.

My understanding of the statement "add a column to dataframe pandas from the list" is that I have to create multiple columns from a given list of column names.

Example: – Suppose I have the list of column names given below, and I have to create a dataframe from it in Python pandas.

# List of column names
col_list = ['Company', 'State', 'Country', 'Zip code' ]

I will put the list of column names in the loop and then add a column to the dataframe in python pandas.

import pandas as pd

# create empty dataframe
company_loc = pd.DataFrame({})

# list of columns
col_list = ['Company', 'State', 'Country', 'Zip code']

# add one empty column per name in the list
for col in col_list:
    company_loc[col] = None

# Result
company_loc

There are other ways to add multiple columns to a pandas dataframe, such as assign() and insert(), but since I am starting from a list of names, a loop is a straightforward way to do it.
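For comparison, here is a sketch of two loop-free alternatives that also start from a list of column names. The name existing_df is hypothetical; col_list is the list from the example above.

```python
import pandas as pd

col_list = ['Company', 'State', 'Country', 'Zip code']

# Option 1: build the empty dataframe with all the columns at once
company_loc = pd.DataFrame(columns=col_list)

# Option 2: on an existing dataframe (hypothetical here), reindex adds
# the columns that are missing and fills them with NaN
existing_df = pd.DataFrame({'Company': ['A', 'B']})
existing_df = existing_df.reindex(columns=col_list)

print(company_loc.columns.tolist())
print(existing_df)
```

Both options produce the same column names as the loop; which one reads best depends on whether the dataframe already holds data.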

Add a column to dataframe pandas with an index

An index in pandas is the set of labels that identifies the rows (row index) or the columns (column index). Positions are counted from 0 up to the last row or column. In pandas, axis=1 refers to the columns and axis=0 to the rows.

In this section, I explain how to add a column to a pandas dataframe at a given index. There are two built-in methods with which we can do that.

  • insert() method
  • reindex() method

Add a column to dataframe pandas with an index using the insert() method

Python pandas provides an insert() method that adds a column at a specific position in the dataframe. This method is mostly used while creating a new column in the dataframe.

Syntax: – Below is the syntax to use the insert() method in python pandas.

dataframe.insert(
    loc,                     # integer position
    column,                  # column name
    value,                   # scalar or array of values
    allow_duplicates=False,  # False or True
)

Parameter Description:

  • loc: The position where you want to add the new column; it accepts integer values.
  • column: The name of the new column.
  • value: The value(s) for the column; this can be a scalar or an array.
  • allow_duplicates: Columns with the same name can be created if this parameter is set to True.

Example: –

I have added a new column, Branch Code, and positioned it at index 2 in the dataframe. For the values, I have generated a sequence that assigns an incremented value to each row.

If you are unsure about the number of rows in the dataframe, check it with the shape attribute in Python pandas.

# insert new column at index 2, with one incremented code per row
carwash_df.insert(
    loc=2,
    column='Branch Code',
    value=list(range(1000, 1000 + len(carwash_df)))
)

# display result
carwash_df.head(3)

In the below image, a new column – Branch Code has been added to the dataframe at index 2. By default, new columns are added at the end of existing columns.


Add a column to dataframe pandas with an index using reindex() method

Python pandas provides a reindex() method with which existing columns (or rows) can be reordered in a dataframe. In the previous section, I created a new column, Branch Code. Here, I will move it next to Branch Address, i.e. to index 7.

Syntax: – Below is the syntax to implement reindex() method in python pandas. The syntax has more parameters but I have shown only the necessary ones for this task.

dataframe.reindex(columns=None)

Parameter Description:

  • columns: Pass the list of column names in the new order.

Example: –

In this example, I am going to move the Branch Code column from index 2 to index 7 so that it appears next to Branch Address in the pandas dataframe.
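The original code for this step is not shown, so the following is only a sketch of how the move could be done. demo_df is a small hypothetical stand-in for carwash_df, and apart from Branch Code and Branch Address its column names are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for carwash_df; only 'Branch Code' and
# 'Branch Address' are taken from the text
demo_df = pd.DataFrame({
    "Company": ["A"],
    "Branch Code": [1000],
    "Owner": ["X"],
    "Branch Address": ["7500 Sawmill Pkwy"],
})

# Build the new column order: drop 'Branch Code' from the list,
# then put it back right after 'Branch Address'
cols = [c for c in demo_df.columns if c != "Branch Code"]
cols.insert(cols.index("Branch Address") + 1, "Branch Code")

# reindex returns a new dataframe with the columns in that order
demo_df = demo_df.reindex(columns=cols)
print(demo_df.columns.tolist())
```

Computing the position from the neighbouring column name avoids hard-coding an index such as 7, which would break if the dataframe gained or lost columns.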

Add a column to the dataframe pandas ignore the index

While preparing a dataframe for an assignment, I had to concatenate data from multiple dataframes. Because of this, the index values appear uneven, as shown in the image below.

Uneven index in the pandas dataframe

To resolve this issue, I set the ignore_index parameter to True. This parameter is available in pd.concat() and in some other methods; it was also accepted by DataFrame.append(), which was removed in pandas 2.0. The assign() method does not take it.

Other than that, when plain NumPy arrays (for example, the .values of a Series) are passed to the assign() method, they are placed by position and the index labels are ignored, so this is another option to add a column to the dataframe and ignore the index.

Example 1: In the below example, I have two dataframes and when they are combined together the indexes are uneven, so to fix that I have set the value for the ignore_index parameter as True in python pandas.

# sample dataframes
df1 = pd.DataFrame({
    'First_Name': ['Jarret', 'Rennie', 'Curtice'],
    'Last_Name': ['Nicoli', 'Scrigmour', 'Champ'], 
    'Phone': [9528557099, 3536026551, 9844245106],
    'Country': 'USA'
})

df2 = pd.DataFrame({
    'First_Name': ['Tabatha', 'Etienne', 'Kim'],
    'Last_Name': ['Pennock', 'Kohtler', 'Culter'], 
    'Phone': [8391082413, 9905355612, 1606864298],
    'Country': 'United Kingdom'
})

I have shown this using concat(); other functions that accept ignore_index work the same way.

pd.concat([df1, df2], ignore_index=True)

In the below output, the index is organized in a proper sequence after using ignore index parameter in python pandas.


Example 2: Here is another example, using the assign() method to add columns to the dataframe and ignore the index.

# data
first_name = pd.Series(['Tabatha', 'Etienne', 'Kim', 'Bidget', 'Hannie', 'Esme'])
last_name = pd.Series(['Pennock', 'Kohtler', 'Culter', 'Stivens', 'Treslove', 'Eastbrook'])
country = pd.Series(['USA', 'USA', 'USA', 'USA', 'USA', 'USA'])

# df must have six rows; here it is assumed to be the concat result from Example 1
df = pd.concat([df1, df2], ignore_index=True)

# add columns with assign(); .values passes plain arrays, so index labels are ignored
df.assign(
    First_Name=first_name.values,
    Last_Name=last_name.values,
    Country=country.values
)

The assign() method returns a new dataframe in which existing columns with the same names are overwritten and any new names are added as columns, with the details shown below.
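To see why .values matters, the sketch below compares passing a Series with passing its underlying array. people_df and its index labels are made up for this illustration.

```python
import pandas as pd

# Hypothetical dataframe whose index does not start at 0
people_df = pd.DataFrame(
    {"First_Name": ["Tabatha", "Etienne", "Kim"]},
    index=[10, 11, 12],
)
country = pd.Series(["USA", "USA", "USA"])  # default index 0, 1, 2

# A Series is aligned on index labels: none match, so every row is NaN
aligned = people_df.assign(Country=country)

# A plain array has no labels, so values are placed by position
by_position = people_df.assign(Country=country.values)

print(aligned)
print(by_position)
```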


In this way, I have explained how to add a column to the dataframe and ignore the index in python pandas.

Add a column to dataframe pandas based on the condition

This section covers how to add a column to a pandas dataframe based on a condition.

The phrase "add a column to dataframe pandas based on the condition" can have two meanings here:

  • Add a column based on a condition.
  • Add a column and fill the rows based on the condition.

Add a column based on a condition

Here, I will show how to create a column in a dataframe only if the dataframe meets some condition. There can be any number of conditions depending on the user's requirements; a few of them are:

  • Add a Full Name column if the dataframe has first name and last name columns.
  • Add a column to calculate the average if more than 3 columns have the int data type.

There could be many more scenarios, but I will use the first scenario in my example.

Example 1: The code checks whether the First_Name and Last_Name columns are present in the dataframe. If both are, a new FullName column is created in the dataframe.

   First_Name  Last_Name  Postal Code  Country
0  Curtice     Champ      99950        USA
1  Tabatha     Pennock    00501        USA
2  Etienne     Kohtler    33601        USA
3  Kim         Culter     10004        USA
Python Pandas Dataframe

# pointer is incremented by 1 each time a condition is True
pointer = 0

if 'First_Name' in df.columns:
    pointer += 1

if 'Last_Name' in df.columns:
    pointer += 1

# if the value of pointer is 2 then the full name column will be added
if pointer == 2:
    df['FullName'] = df.First_Name + ' ' + df.Last_Name

In the below output, the full name column is added to the pandas dataframe. This new column has the user’s first and last names concatenated together in python pandas.


Add a column and fill the rows based on the condition

In this section, I explain how to add a column and fill its rows based on a condition. I will set a condition, and the new column will get its values depending on whether each row satisfies it.

This may sound similar to the previous section, but there the focus was only on creating the column; here the focus is on the values of the column.

Example: – In this example, I will create a dataframe of crops and the temperature required for them. The temperature is then categorized as Hot, Moderate, or Low based on the condition.

   Temperature  Crop
0  28.300415    mungbean
1  26.736908    watermelon
2  24.443455    rice
3  24.247796    pomegranate
4  26.335449    banana
5  36.750875    pigeon peas
6  41.656030    grapes
7  18.283622    maize
8  18.782263    kidney beans
The temperature required to grow crops
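The code for this example is not shown, so here is one possible sketch using np.select(). The cut-off temperatures (below 20 is Low, 30 and above is Hot) and the names crop_df and Category are assumptions chosen for illustration, not values from the original example.

```python
import numpy as np
import pandas as pd

crop_df = pd.DataFrame({
    "Temperature": [28.300415, 26.736908, 24.443455, 24.247796, 26.335449,
                    36.750875, 41.656030, 18.283622, 18.782263],
    "Crop": ["mungbean", "watermelon", "rice", "pomegranate", "banana",
             "pigeon peas", "grapes", "maize", "kidney beans"],
})

# Assumed cut-offs (not given in the text): below 20 is Low, 30 and above is Hot
conditions = [crop_df["Temperature"] < 20, crop_df["Temperature"] >= 30]
choices = ["Low", "Hot"]

# Rows that match neither condition fall back to the default
crop_df["Category"] = np.select(conditions, choices, default="Moderate")

print(crop_df)
```

np.select() checks the conditions in order and takes the first match, which keeps a multi-way rule readable without nested if-else statements.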

Add a column to dataframe pandas from the numpy array

NumPy is a Python library for working with arrays. NumPy arrays are typically faster and more memory-efficient for numerical work than plain Python lists. In this section, we will learn how to add a column to a pandas dataframe from a NumPy array.

In the below dataset, we are going to add a new column – postal code using the NumPy array in python.


There are multiple ways to add a column to dataframe pandas from the numpy array, one of them is demonstrated below in python.

import numpy as np

# array of data (one value per row, so df is assumed to have six rows)
postal_code = np.array([99950, 33601, 10004, 97290, 96898, 20108])

# add a new column
df['Postal_Code'] = postal_code

# print dataframe
df

In this section, we have learned how to add a column to dataframe pandas from the numpy array.

Add a column from another dataframe pandas

While working with pandas, I often create multiple dataframes from a dataset. Some of them serve a purpose of their own; others are copies of the dataset made for experiments. Adding a column from one dataframe to another is a common activity.

In my example, I have df1 and df2, of which df1 is the primary dataset, and I will add the isEscalated column from df2 to df1 in Python pandas.

# using insert (modifies df1 in place)
df1.insert(loc=1, column='IsEscalated', value=df2['isEscalated'])

# using join (returns a new dataframe; df1 itself is unchanged)
df1.join(df2['isEscalated'])

Add a column at the beginning of the dataframe pandas

Python pandas provides an insert() method with which columns can be added at a specific position in the dataframe. I have explained insert() in several places in this blog.

In the example below, I have added a new column named Serial to the dataframe. Providing loc=0 adds the new column at the beginning of the dataframe.

df1.insert(loc=0, column='Serial', value=df1.index)

Add a Column to a DataFrame in Python Pandas

In this section, we will learn how to add a column to a dataframe in Python Pandas.

  • While working with a dataset in Python Pandas, creating and deleting columns is an ongoing process. New columns with new data are added, and columns that are not required are removed.
  • Columns can be added to an existing dataframe in three ways.
    • dataframe.assign()
    • dataframe.insert()
    • dataframe['new_column'] = value
  • In the dataframe.assign() method, we pass the name of the new column and its value(s). If only one value is provided, it is assigned to every row; if a list of values is provided, the values are assigned row by row.
  • In the dataframe.insert() method, a user provides the position at which to insert the column, the column name, the value(s) for the column, and a boolean indicating whether duplicate column names are allowed.
  • The third option is direct assignment with square brackets, as written in the list above.
  • Please note that if you provide a list of values, the number of values must equal the number of rows.


Add a Column to a DataFrame in Python With the Same Value

In this section, we will learn how to add a column to a dataframe in Python with the same value.

  • At times, an engineer has to set the same value for every row of a particular column. For example, if the dataset is related to women only, the Gender column would contain only the value Female.
  • In our previous section, we learned how to add a column to a dataframe in pandas.
  • To set the same value, provide a single value instead of a list. The same value will then be assigned to all the rows of the column.
  • In our example, we have added a new column with the name 'Rating' and assigned 5 to all the rows.
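The Rating example described above can be sketched as follows. The dataframe games_df and its contents are hypothetical; only the column name Rating and the value 5 come from the text.

```python
import pandas as pd

# Hypothetical dataframe; only 'Rating' and the value 5 come from the text
games_df = pd.DataFrame({"Game": ["A", "B", "C"]})

# A single value is assigned to every row of the new column
games_df["Rating"] = 5

print(games_df)
```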


Add a Column Name to a DataFrame Pandas

In this section, we will learn how to add column names to a dataframe pandas.

  • The three ways of adding a column described in the previous sections (dataframe.assign(), dataframe.insert(), and dataframe['new_column'] = value) also set the column name: it is the keyword passed to assign(), the column argument of insert(), or the key inside the square brackets.


Add An Empty Column to a DataFrame in Pandas

In this section, we will learn how to add an empty column to a dataframe in Python Pandas.

  • Empty data can also be considered missing data or NaN values.
  • Using np.nan from NumPy, we can add a column filled with missing values.
  • In our example, we have added a new column with the name Rating, and it contains only missing (NaN) values.
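A minimal sketch of the empty Rating column is shown below. The dataframe games_df is hypothetical and stands in for the dataset used in the original notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe standing in for the original dataset
games_df = pd.DataFrame({"Game": ["A", "B", "C"]})

# np.nan is repeated for every row, giving an all-missing float column
games_df["Rating"] = np.nan

print(games_df)
print(games_df["Rating"].dtype)
```

The column can be filled in later, for example with fillna() or by assigning to selected rows with loc.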


Add a Column to a Pandas DataFrame Based on an if-else Condition

In this section, we will learn how to add a column to a pandas dataframe based on an if-else condition.

  • An if-else condition chooses between alternatives and can be chained into a ladder of statements.
  • While working with datasets, engineers often have to filter or clean the data based on some condition.
  • For example, dividing the dataset into two parts based on gender.
  • In our example, we will create a new column with the name state. If the value of peak_player is less than a certain amount, state is set to 1; otherwise it is set to 0.
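One way to express this if-else rule for a whole column is np.where(), sketched below. The dataframe players_df, its values, and the threshold of 1000 are assumptions; the text only says "a certain amount".

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names peak_player and state come from the text
players_df = pd.DataFrame({"peak_player": [1200, 350, 980]})

threshold = 1000  # assumed value; the text does not give the actual amount

# state is 1 where peak_player is below the threshold, otherwise 0
players_df["state"] = np.where(players_df["peak_player"] < threshold, 1, 0)

print(players_df)
```

np.where() evaluates the condition for all rows at once, which is usually preferable to looping over rows with a Python if-else.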


Add a Column to a DataFrame From Another DataFrame Pandas

In this section, we will learn how to add a column to a dataframe from another dataframe in Python Pandas.

  • To demonstrate this, we have created two dataframes in our example.
  • Using the code df1['Address'] = df2['City'], we have added a new column named Address to the first dataframe and filled it with the city names from the second dataframe.
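The sketch below reproduces the assignment with two small made-up dataframes. The names and values are hypothetical; df2 is deliberately one row shorter to show that pandas matches rows by index label.

```python
import pandas as pd

# Hypothetical dataframes; df2 has one row fewer than df1
df1 = pd.DataFrame({"Name": ["Ann", "Bob", "Cy"]})
df2 = pd.DataFrame({"City": ["Columbus", "Dayton"]})

# Assignment aligns on the index: row 2 has no match in df2, so it becomes NaN
df1["Address"] = df2["City"]

print(df1)
```

If the rows should be matched by position instead, passing df2["City"].values works only when both dataframes have the same number of rows.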


In this tutorial, we have learned how to add a Column to a DataFrame in Python Pandas. Also, we have covered these topics.

  • Add a Column to a DataFrame in Python Pandas
  • How to add a column to dataframe pandas with a default value
  • Add a column to dataframe pandas from the list
  • How to add a column to dataframe pandas with an index
  • Add a column to the dataframe pandas ignore the index
  • How to add a column to dataframe pandas based on the condition
  • Add a column to dataframe pandas from the numpy array
  • Add a column from another dataframe pandas
  • How to add a column at the beginning of the dataframe pandas
  • Add a Column to a DataFrame in Python With the Same Value
  • Add a Column to a Pandas DataFrame Based on an if-else Condition
  • Add a Column to a DataFrame From another DataFrame Pandas