Find the Index of a Value in Pandas (Python)

I was working on a data analysis project where I needed to locate specific values in a large dataset. The challenge was finding the exact position of these values in the Pandas DataFrame efficiently.

Finding the index of a value in Pandas is a common task that can be approached in several ways, depending on your specific requirements.

In this guide, I will share four proven methods to find the index of a value in Pandas, along with practical examples.

1- Use Boolean Indexing to Find an Index

Boolean indexing is one of the simplest ways to find the index of a value in a Pandas DataFrame or Series.

Let me show you how this works with a simple example:

import pandas as pd

# Creating a sample Series
sales_data = pd.Series([5000, 7500, 6200, 8100, 7500, 9300], 
                       index=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'])

# Finding the index where value is 7500
result = sales_data[sales_data == 7500].index.tolist()
print(f"The value 7500 appears at indices: {result}")

In this example, we’re finding all occurrences of the value 7500 in our monthly sales data. The output would show:

The value 7500 appears at indices: ['Feb', 'May']

This method works by creating a boolean mask (True/False values) where the condition is met, then accessing the index of those True values.
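To make the intermediate step visible, here is a small sketch that prints the mask itself before using it to pick out the matching labels. The names mask and matching_labels are my own; the data is the same sales Series used above.

```python
import pandas as pd

sales_data = pd.Series([5000, 7500, 6200, 8100, 7500, 9300],
                       index=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'])

# Step 1: the comparison produces one True/False value per element
mask = sales_data == 7500
print(mask.tolist())  # [False, True, False, False, True, False]

# Step 2: indexing the index with the mask keeps only the labels where it is True
matching_labels = sales_data.index[mask].tolist()
print(matching_labels)  # ['Feb', 'May']
```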

For DataFrames, you can specify the column you want to search in:

# Creating a sample DataFrame
employee_data = pd.DataFrame({
    'Name': ['John Smith', 'Maria Garcia', 'Robert Johnson', 'Sarah Williams', 'David Brown'],
    'Department': ['Sales', 'Marketing', 'Sales', 'HR', 'Finance'],
    'Salary': [75000, 82000, 75000, 68000, 90000]
})

# Finding indices where salary is 75000
salary_indices = employee_data.index[employee_data['Salary'] == 75000].tolist()
print(f"Employees with salary 75000 are at indices: {salary_indices}")

This would output:

Employees with salary 75000 are at indices: [0, 2]
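With the default integer index, the labels returned above happen to equal the row positions. That is not true in general, so here is a minimal sketch of how I would get both, assuming a DataFrame with a custom index. The employee IDs are hypothetical values made up for illustration, and np.flatnonzero() is one of several ways to turn a mask into integer positions.

```python
import numpy as np
import pandas as pd

# Hypothetical employee IDs used as the index (not part of the original data)
staff = pd.DataFrame(
    {'Salary': [75000, 82000, 75000]},
    index=['E101', 'E102', 'E103'],
)

mask = staff['Salary'] == 75000

# Index labels of the matching rows
labels = staff.index[mask].tolist()
print(labels)  # ['E101', 'E103']

# Integer positions of the matching rows, independent of the labels
positions = np.flatnonzero(mask.to_numpy()).tolist()
print(positions)  # [0, 2]
```

Use the labels with .loc and the positions with .iloc.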

2- Use .get_loc() Method for Exact Matches

The pandas Index.get_loc() method is a good fit when the value you want is a label in the index and you need its integer position.

It is most convenient when the label appears exactly once in the index, because it then returns a single integer:

import pandas as pd

# Creating a DataFrame with states as index
state_data = pd.DataFrame({
    'Population': [39538223, 29145505, 21538187, 20201249, 13002700],
    'Area (sq mi)': [163696, 268596, 65758, 54555, 57914]
}, index=['California', 'Texas', 'Florida', 'New York', 'Illinois'])

# Finding the index position of 'Florida'
try:
    florida_index = state_data.index.get_loc('Florida')
    print(f"Florida is at index position: {florida_index}")
except KeyError:
    print("Florida not found in the index")

The output would be:

Florida is at index position: 2

For a unique index, this lookup is typically faster than Boolean indexing because pandas uses a hash-based lookup instead of comparing every element. It raises a KeyError if the value isn’t found, so it’s good practice to wrap the call in a try-except block.
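One caveat worth knowing: get_loc() only returns a plain integer when the label is unique. The sketch below follows the behaviour described in the pandas documentation for Index.get_loc(); the variable names are my own.

```python
import pandas as pd

# Unique label: a single integer position
unique = pd.Index(list('abc')).get_loc('b')
print(unique)  # 1

# Duplicate label in a sorted (monotonic) index: a slice
sorted_dupes = pd.Index(list('abbc')).get_loc('b')
print(sorted_dupes)  # slice(1, 3, None)

# Duplicate label in an unsorted index: a boolean mask
unsorted_dupes = pd.Index(list('abcb')).get_loc('b')
print(unsorted_dupes.tolist())  # [False, True, False, True]
```

If your index may contain duplicates, check the type of the result before treating it as an integer.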

3- Use .idxmax() for Finding First Occurrence

When you need to find the index of the first occurrence of a value, the combination of boolean masking with .idxmax() provides an elegant solution in Python:

import pandas as pd

# Creating a sample DataFrame of customer purchases
purchases = pd.DataFrame({
    'Customer': ['Alice', 'Bob', 'Charlie', 'David', 'Emma', 'Frank'],
    'Product': ['Laptop', 'Phone', 'Tablet', 'Phone', 'Laptop', 'Phone'],
    'Amount': [1200, 800, 500, 850, 1100, 780]
})

# Finding the first occurrence of 'Phone' in the Product column
first_phone = (purchases['Product'] == 'Phone').idxmax()
print(f"First 'Phone' purchase is at index: {first_phone}")

This would output:

First 'Phone' purchase is at index: 1

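The .idxmax() method returns the index label of the first True value in a boolean Series, which makes it a concise way to find the first match. Be careful when nothing matches: the mask is then all False and .idxmax() still returns the first label, so confirm that a match exists (for example with .any()) before trusting the result.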

For finding all occurrences, we would still use the Boolean indexing approach from Method 1.
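Because .idxmax() on an all-False mask still returns the first label, I like to guard the call. Here is a minimal sketch; first_index_of is a hypothetical helper name of my own, not a pandas function.

```python
import pandas as pd

products = pd.Series(['Laptop', 'Phone', 'Tablet'])

def first_index_of(series, value):
    """Return the label of the first match, or None if there is no match."""
    mask = series == value
    # Only trust idxmax() when at least one element is True
    return mask.idxmax() if mask.any() else None

print(first_index_of(products, 'Phone'))   # 1
print(first_index_of(products, 'Camera'))  # None

# Without the guard, a missing value silently yields the first label
print((products == 'Camera').idxmax())     # 0
```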

4- Use .index Property with Conditions

Another approach that provides flexibility when searching for complex conditions is using the .index property directly with conditions:

import pandas as pd

# Sample DataFrame of stock prices
stock_data = pd.DataFrame({
    'Date': pd.date_range(start='2023-01-01', periods=10, freq='B'),
    'Price': [152.5, 153.8, 151.2, 150.7, 155.2, 157.3, 158.1, 156.9, 159.2, 160.5],
    'Volume': [5000000, 6200000, 4800000, 5500000, 7300000, 6800000, 5900000, 6100000, 7500000, 8200000]
})

# Find indices where price is above 155 and volume is above 6000000
high_activity_indices = stock_data.index[
    (stock_data['Price'] > 155) & 
    (stock_data['Volume'] > 6000000)
].tolist()

print(f"High activity trading days at indices: {high_activity_indices}")

Output:

High activity trading days at indices: [4, 5, 7, 8, 9]

This method is particularly useful when you need to find indices based on multiple conditions across different columns.

You can get the index of rows in a Pandas DataFrame using similar techniques, allowing you to pinpoint exactly which data points meet your criteria.

Find Index in MultiIndex DataFrames

Working with MultiIndex (hierarchical) DataFrames requires a slightly different approach:

import pandas as pd

# Creating a MultiIndex DataFrame
multi_idx = pd.MultiIndex.from_tuples([
    ('New York', 'Manhattan'), 
    ('New York', 'Brooklyn'), 
    ('Los Angeles', 'Downtown'), 
    ('Chicago', 'Loop'),
    ('Chicago', 'Hyde Park')
], names=['City', 'District'])

property_values = pd.DataFrame({
    'Avg_Price': [1200000, 900000, 850000, 750000, 600000],
    'Sq_Ft': [1000, 1200, 1500, 1100, 1300]
}, index=multi_idx)

# Finding index of specific location
ny_brooklyn_idx = property_values.index.get_loc(('New York', 'Brooklyn'))
print(f"New York, Brooklyn is at index position: {ny_brooklyn_idx}")

# Finding all properties in Chicago
chicago_indices = [i for i, idx in enumerate(property_values.index) if idx[0] == 'Chicago']
print(f"Chicago properties are at indices: {chicago_indices}")

Output:

New York, Brooklyn is at index position: 1
Chicago properties are at indices: [3, 4]

When working with MultiIndex DataFrames, you can also use .xs() (cross-section) to select data at a specific level before finding indices.
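As a sketch of the level-based approach, the example below rebuilds the same property data and finds the Chicago rows in two ways: by comparing one level with get_level_values(), and by taking a cross-section with .xs(). Using np.flatnonzero() to convert the mask into positions is my own choice here, not something the original example relies on.

```python
import numpy as np
import pandas as pd

multi_idx = pd.MultiIndex.from_tuples([
    ('New York', 'Manhattan'),
    ('New York', 'Brooklyn'),
    ('Los Angeles', 'Downtown'),
    ('Chicago', 'Loop'),
    ('Chicago', 'Hyde Park'),
], names=['City', 'District'])

property_values = pd.DataFrame({
    'Avg_Price': [1200000, 900000, 850000, 750000, 600000],
    'Sq_Ft': [1000, 1200, 1500, 1100, 1300],
}, index=multi_idx)

# Compare a single level of the index, then convert the mask to positions
city_mask = property_values.index.get_level_values('City') == 'Chicago'
chicago_positions = np.flatnonzero(city_mask).tolist()
print(chicago_positions)  # [3, 4]

# Cross-section: select the Chicago rows; the 'City' level is dropped
chicago_rows = property_values.xs('Chicago', level='City')
chicago_districts = chicago_rows.index.tolist()
print(chicago_districts)  # ['Loop', 'Hyde Park']
```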

When to Use Each Method

Here’s my recommendation on which method to use based on your specific scenario:

  1. Boolean Indexing: Best for finding all occurrences of a value in a column or Series
  2. get_loc(): Ideal for finding the position of a single label in the index; typically the fastest option when the index is unique
  3. idxmax() with boolean mask: Handy for finding just the first occurrence of a value, provided you check that a match exists
  4. Index with conditions: Most flexible for complex filtering scenarios

Performance Considerations

When working with large datasets, performance becomes crucial. In my experience:

  • For small to medium DataFrames (up to ~100,000 rows), all methods perform reasonably well
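  • For large DataFrames, .get_loc() is usually fastest when the value you need is an index label, since it avoids scanning every row
  • Boolean indexing scans the whole column on every lookup, which adds up on very large DataFrames, but it is still the most versatile
  • For extremely large datasets, NumPy’s np.where() on the underlying array can sometimes be faster; benchmark on your own data before switching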
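For reference, here is a minimal sketch of the np.where() approach applied to the sales Series from Method 1. I make no claim about how much faster it is on your data; it simply shows the pattern of searching the NumPy array and mapping the positions back to labels.

```python
import numpy as np
import pandas as pd

sales_data = pd.Series([5000, 7500, 6200, 8100, 7500, 9300],
                       index=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'])

# np.where() with a single condition returns a tuple of arrays;
# element [0] holds the integer positions where the condition is True
positions = np.where(sales_data.to_numpy() == 7500)[0]
print(positions.tolist())  # [1, 4]

# Map the positions back to index labels
labels = sales_data.index[positions].tolist()
print(labels)  # ['Feb', 'May']
```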

I hope you found this guide helpful for finding index positions in Pandas.

The methods that I have explained in this tutorial are: use boolean indexing to find an index, .get_loc() method for exact matches, .idxmax() for finding the first occurrence, and .index property with conditions.
