How to set first column as index in Pandas Python [5 Methods]

In this Python article, I will explain how to set the first column as the index in Pandas using different methods, with demonstrative examples.

To set the first column as the index in Pandas, several methods can be used: set_index() to explicitly set a column as the index; read_csv() with the index_col parameter to set the index while loading data; iloc[] to select the first column and assign it as the index; set_index() with inplace=True to modify the DataFrame in place; and set_index() combined with rename_axis() for control over the index name.

Set first column as index in Pandas Python

There are five methods to set first column as index in Pandas Python:

  1. Using set_index() Method
  2. Using read_csv() Parameters
  3. Using iloc[]
  4. In-Place Modification
  5. Using rename_axis()

Let’s see them one by one using some illustrative examples:

1. Pandas use first column as index using the set_index() method

This method explicitly sets a DataFrame column as the index. We pass the column's name (its label) to the set_index() method of the DataFrame, which returns a new DataFrame whose index is that column. Note that set_index() expects column labels rather than integer positions; to pick the first column by position, pass df.columns[0].

Here is the code to set the first column as the index using the set_index() method:

import pandas as pd

data = {'State': ['California', 'Texas', 'New York'],
        'Population': [39512223, 28995881, 19453561],
        'GDP': [3000000, 1800000, 1500000]}
df = pd.DataFrame(data)
df = df.set_index('State')
print(df)

Output:

            Population      GDP
State                          
California    39512223  3000000
Texas         28995881  1800000
New York      19453561  1500000
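
The example above names the column explicitly. When the first column's name is not known in advance, a minimal sketch (reusing the same sample data, and assuming the column labels are unique) is to look the label up by position with df.columns[0] and pass that to set_index(). The variable names first_col and indexed are only illustrative.

```python
import pandas as pd

data = {'State': ['California', 'Texas', 'New York'],
        'Population': [39512223, 28995881, 19453561],
        'GDP': [3000000, 1800000, 1500000]}
df = pd.DataFrame(data)

# df.columns[0] is the label of the first column ('State' here),
# so the column is chosen by position instead of a hard-coded name.
first_col = df.columns[0]
indexed = df.set_index(first_col)

print(indexed.index.name)     # State
print(list(indexed.columns))  # ['Population', 'GDP']
```

Because set_index() returns a new DataFrame here, the original df keeps its default integer index.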

2. How to make first column as index in Pandas using read_csv() parameters

When loading data from a CSV file using read_csv(), we can directly set the first column as the index by passing index_col=0. This is a convenient way to define the index while importing the data.

import pandas as pd

df = pd.read_csv('C:/Users/kumar/OneDrive/Desktop/book.csv', index_col=0)
print(df)

Output:

             Population    Area
City                           
New York      84,19,000   789.0
Los Angeles   39,71,000  1214.0
Chicago       27,16,000   606.0
Houston       23,25,500  1651.0
Phoenix       16,60,000  1340.0
Philadelphia  15,84,000   347.0
San Antonio   15,47,000  1194.0
San Diego     14,24,000   842.0
Dallas        13,43,000   882.0
San Jose      10,27,000   460.0
NaN                 NaN     NaN
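
The example above depends on a CSV file on the author's machine. To try the same idea without a file, here is a small self-contained sketch that reads hypothetical CSV text from memory through io.StringIO. The sample rows are made up for illustration and are not the contents of book.csv. It also shows that index_col accepts a column name as well as a position.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = """City,Population,Area
New York,8419000,789.0
Chicago,2716000,606.0
"""

# index_col=0 uses the first column of the file as the index.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# The same column can also be selected by its header name.
by_name = pd.read_csv(io.StringIO(csv_text), index_col='City')

print(df)
print(df.equals(by_name))  # True
```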

3. Set first column as index Pandas using iloc[]

This approach selects the first column by position using iloc[] and assigns it to the DataFrame's index. Because assigning to df.index does not remove the column from the DataFrame, we then drop the original column so its values do not appear twice.

import pandas as pd

data = {'University': ['MIT', 'Stanford', 'Harvard'],
        'Enrollment': [11000, 17000, 21000],
        'Location': ['Massachusetts', 'California', 'Massachusetts']}
df = pd.DataFrame(data)
df.index = df.iloc[:, 0]
df.drop('University', axis=1, inplace=True)
print(df)

Output:

            Enrollment       Location
University                           
MIT              11000  Massachusetts
Stanford         17000     California
Harvard          21000  Massachusetts
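
The code above still refers to the column by name when dropping it. As a variation, the following sketch stays fully positional: it keeps every column except the first with iloc[:, 1:] instead of calling drop(). This is one possible approach under the same sample data, not the only one.

```python
import pandas as pd

data = {'University': ['MIT', 'Stanford', 'Harvard'],
        'Enrollment': [11000, 17000, 21000],
        'Location': ['Massachusetts', 'California', 'Massachusetts']}
df = pd.DataFrame(data)

# Assign the first column (selected by position) as the index.
# The Series name 'University' becomes the index name.
df.index = df.iloc[:, 0]

# Keep all columns except the first, again by position.
df = df.iloc[:, 1:]

print(df)
```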

4. Pandas set first column as index by in-place modification

In this method, we can modify the DataFrame directly without creating a new DataFrame object in Python. By setting the inplace=True parameter in the set_index() method in Pandas, the index change is applied directly to the existing DataFrame.

Here is the code to set the first column as the index using the inplace=True parameter in the set_index() method:

import pandas as pd

data = {'Park Name': ['Yellowstone', 'Yosemite', 'Grand Canyon'],
        'State': ['Wyoming', 'California', 'Arizona'],
        'Area': [2219791, 761747, 1217403]}
df = pd.DataFrame(data)
df.set_index('Park Name', inplace=True)
print(df)

Output:

                   State     Area
Park Name                        
Yellowstone      Wyoming  2219791
Yosemite      California   761747
Grand Canyon     Arizona  1217403
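
One detail worth keeping in mind with inplace=True is that set_index() then returns None, so its result should not be assigned back to df. The short sketch below, using the same sample data, illustrates this; the variable name result is only for demonstration.

```python
import pandas as pd

data = {'Park Name': ['Yellowstone', 'Yosemite', 'Grand Canyon'],
        'State': ['Wyoming', 'California', 'Arizona'],
        'Area': [2219791, 761747, 1217403]}
df = pd.DataFrame(data)

# With inplace=True the DataFrame itself is changed and None is returned.
result = df.set_index('Park Name', inplace=True)

print(result)         # None
print(df.index.name)  # Park Name
```

Writing df = df.set_index('Park Name', inplace=True) would therefore replace df with None.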

5. How to set first column as index in Pandas using rename_axis()

This method combines rename_axis() with set_index() for additional control, especially when we need to rename the index. It’s useful when we want to set the index and also provide a specific name to the index column.

This is the code to set the first column as the index using set_index() together with rename_axis():

import pandas as pd

data = {'Landmark': ['Statue of Liberty', 'Mount Rushmore', 'Golden Gate Bridge'],
        'Year Established': [1886, 1925, 1937],
        'Location': ['New York', 'South Dakota', 'California']}
df = pd.DataFrame(data)
df = df.set_index('Landmark').rename_axis('Historical Landmark')
print(df)

Output:

                     Year Established      Location
Historical Landmark                                
Statue of Liberty                1886      New York
Mount Rushmore                   1925  South Dakota
Golden Gate Bridge               1937    California
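
As a small follow-up sketch on the same data, rename_axis() only changes the name of the index and leaves the index values untouched. Passing None clears the name altogether, which can be handy when the header row above the index is not wanted in the printed output. The variable names renamed and unnamed are illustrative.

```python
import pandas as pd

data = {'Landmark': ['Statue of Liberty', 'Mount Rushmore', 'Golden Gate Bridge'],
        'Year Established': [1886, 1925, 1937],
        'Location': ['New York', 'South Dakota', 'California']}
df = pd.DataFrame(data).set_index('Landmark')

# Give the index a new name; the index values are unchanged.
renamed = df.rename_axis('Historical Landmark')

# Passing None removes the index name.
unnamed = df.rename_axis(None)

print(renamed.index.name)  # Historical Landmark
print(unnamed.index.name)  # None
```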

Conclusion

Here, I have explained five methods to set the first column as the index in Pandas: the set_index() method, the index_col parameter of read_csv(), iloc[], in-place modification with inplace=True, and set_index() combined with rename_axis().

These methods offer a range of options to efficiently manipulate indices in Pandas DataFrames according to specific data handling needs.
