Pandas is a Python library for working with tabular data in dataframes. One of the everyday tasks we perform on dataframes is accessing index values. The index in a Pandas dataframe provides a label that identifies each row.
In this Python Blog, I will tell you about various methods to get index values from dataframes in Pandas Python. We generally use the df.index property to get the index values from dataframes in Pandas.
We can also use the following methods to get index values from a dataframe in Python Pandas:
- Using a for loop
- Using the index.values attribute
- Using the get_level_values() method
- Using the get_loc() function
- Using the np.where() function
Let’s see them all one by one with some illustrative examples:
1. Pandas get index values in Python using df.index property
We can use the df.index property to get the index of a Pandas dataframe. When no index is specified at creation, Pandas assigns a default RangeIndex, so df.index returns a RangeIndex object rather than a list of values.
Let’s see an example to get index values in Pandas dataframes using the .index property:
import pandas as pd
Product_sales = {'State': ['New York', 'California', 'Texas', 'Florida'],
'Sales': [50000, 75000, 60000, 45000]}
sales_df = pd.DataFrame(Product_sales)
states_index = sales_df.index
print(states_index)
Output:
RangeIndex(start=0, stop=4, step=1)
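Since a RangeIndex only stores its start, stop, and step, we may want the individual values instead. Here is a minimal sketch, reusing the sales data from above, that converts the index to a plain Python list with tolist():

```python
import pandas as pd

sales_df = pd.DataFrame({
    "State": ["New York", "California", "Texas", "Florida"],
    "Sales": [50000, 75000, 60000, 45000],
})

# Expand the RangeIndex into its individual values
index_labels = sales_df.index.tolist()
print(index_labels)  # [0, 1, 2, 3]
```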
2. Python dataframe to get index value using for loop
The df.index property returns an Index object, which is iterable, so we can use a for loop to iterate over the index values of a given dataframe in Python.
Here is an example that gets the index values of a dataframe using a for loop:
import pandas as pd
data = {'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
'Population': [8500000, 4000000, 2700000, 2300000]}
df_population = pd.DataFrame(data)
df_population = df_population.set_index('City')
for city_index in df_population.index:
    print(city_index)
Here, we use the set_index() method to set the 'City' column as the index of the df_population dataframe:
df_population.set_index('City')
Below is the output of the code when we run the for loop:
New York
Los Angeles
Chicago
Houston
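If we need the value that belongs to each index label while looping, one option is to iterate over a column with Series.items(), which yields (index label, value) pairs. The sketch below reuses the population data from above; the list name labels_and_values is only illustrative:

```python
import pandas as pd

df_population = pd.DataFrame({
    "City": ["New York", "Los Angeles", "Chicago", "Houston"],
    "Population": [8500000, 4000000, 2700000, 2300000],
}).set_index("City")

# Series.items() yields one (index label, value) pair per row
labels_and_values = []
for city, population in df_population["Population"].items():
    labels_and_values.append((city, population))
    print(city, population)
```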
3. Get index values from dataframes in Pandas Python using index.values
The index.values attribute returns a NumPy array containing the values of the given Index object.
This is how we can use the index.values attribute to get index values from dataframes in Pandas Python:
import pandas as pd
Covid_cases = {'Cases': [10000, 15000, 8000, 12000]}
covid_cases_df = pd.DataFrame(Covid_cases, index=['New York', 'California', 'Texas', 'Florida'])
state_names = covid_cases_df.index.values
print(state_names)
Here, we set the index of the dataframe using the index parameter of the pd.DataFrame() constructor:
pd.DataFrame(Covid_cases, index=['New York', 'California', 'Texas', 'Florida'])
Here is the output we get after executing the code:
['New York' 'California' 'Texas' 'Florida']
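The Pandas documentation recommends Index.to_numpy() over .values when a NumPy array is needed, and tolist() is available when a plain Python list is more convenient. Here is a short sketch of both, using the same Covid data as above:

```python
import numpy as np
import pandas as pd

covid_cases_df = pd.DataFrame(
    {"Cases": [10000, 15000, 8000, 12000]},
    index=["New York", "California", "Texas", "Florida"],
)

# NumPy array of the index values
state_array = covid_cases_df.index.to_numpy()

# Plain Python list of the index values
state_list = covid_cases_df.index.tolist()

print(state_array)
print(state_list)
```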
4. Get indexes of dataframe Pandas in Python using the get_level_values() method
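We can use the index.get_level_values() method to get the index values in Pandas dataframes. It returns an Index of the values at the requested level. The method is mainly useful for a MultiIndex, but it also works on a single-level index, where the only level is 0.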
For example:
import pandas as pd
Employee = {'Names': ['Amy', 'Joey', 'Lucifer', 'Claus'], 'Salary': [10000, 15000, 8000, 12000]}
Employee_salary_df = pd.DataFrame(Employee, index=['New York', 'California', 'Texas', 'Florida'])
state_names = Employee_salary_df.index.get_level_values(0)
print(state_names)
Output:
Index(['New York', 'California', 'Texas', 'Florida'], dtype='object')
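The method becomes more useful when the dataframe has a MultiIndex. The sketch below assumes a hypothetical two-level index with the level names State and Year (these names and values are made up for illustration) and selects a level by name and by position:

```python
import pandas as pd

# Hypothetical two-level index: (State, Year)
multi_index = pd.MultiIndex.from_tuples(
    [("New York", 2022), ("New York", 2023), ("Texas", 2022), ("Texas", 2023)],
    names=["State", "Year"],
)
salary_df = pd.DataFrame({"Salary": [10000, 15000, 8000, 12000]}, index=multi_index)

# Select a level by its name
states = salary_df.index.get_level_values("State")

# Select a level by its integer position
years = salary_df.index.get_level_values(1)

print(states)
print(years)
```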
5. Pandas get index from dataframes using the get_loc() function
We can also get the position of an index label in a Pandas dataframe using the index.get_loc() function. We pass the row label to get_loc(), and it returns the integer position of that label in the index.
Here is an example that uses the get_loc() function:
import pandas as pd
Sales_stats = {'Rate (%)': [85, 80, 90, 75]}
df_sales_stats = pd.DataFrame(Sales_stats, index=['New York', 'California', 'Texas', 'Florida'])
index_location = df_sales_stats.index.get_loc('Texas')
print(index_location)
Output:
2
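The position returned by get_loc() can be passed to iloc to fetch the matching row, and a label that is not in the index raises a KeyError. Here is a minimal sketch of both cases; the label 'Ohio' and the flag label_missing are only illustrative:

```python
import pandas as pd

df_sales_stats = pd.DataFrame(
    {"Rate (%)": [85, 80, 90, 75]},
    index=["New York", "California", "Texas", "Florida"],
)

# Integer position of the label, then the row at that position
position = df_sales_stats.index.get_loc("Texas")
texas_row = df_sales_stats.iloc[position]
print(position, texas_row["Rate (%)"])

# A label that does not exist in the index raises KeyError
label_missing = False
try:
    df_sales_stats.index.get_loc("Ohio")
except KeyError:
    label_missing = True
print(label_missing)
```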
6. Python dataframe to get the index value in Pandas using the np.where() function
We can also get index values by passing a condition to the np.where() function from the NumPy library in Python. The np.where() function returns the integer positions of the elements in an input array where the given condition is satisfied. With the default RangeIndex, as in this example, these positions are the same as the index labels.
Here is the code that uses the np.where() function with Pandas to get the index by value in Python:
import pandas as pd
import numpy as np
data = {'State': ['New York', 'California', 'Texas', 'Florida'],
'Population': [20000000, 15000000, 25000000, 18000000]}
population_data = pd.DataFrame(data)
indices_greater_than_threshold = np.where(population_data['Population'] > 15000000)[0]
print(indices_greater_than_threshold)
Output: Here is the array of all the indices satisfying the condition, i.e., population_data['Population'] > 15000000:
[0 2 3]
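When the dataframe has a custom index, np.where() still returns positions, not labels. The sketch below assumes we set 'State' as the index of the same data, and compares the positions with the labels obtained by applying the boolean mask to the index:

```python
import numpy as np
import pandas as pd

population_data = pd.DataFrame({
    "State": ["New York", "California", "Texas", "Florida"],
    "Population": [20000000, 15000000, 25000000, 18000000],
}).set_index("State")

mask = population_data["Population"] > 15000000

# Integer positions of the rows that satisfy the condition
positions = np.where(mask)[0]

# Index labels of the same rows, selected with the boolean mask
labels = population_data.index[mask.to_numpy()]

print(positions)        # [0 2 3]
print(labels.tolist())  # ['New York', 'Texas', 'Florida']
```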
Conclusion
We have seen how to get index values from dataframes in Pandas Python using six different methods: the df.index property, a for loop, the index.values attribute, the get_level_values() method, the get_loc() function, and the np.where() function.
You can choose the method that best fits your code requirements in Python. I have tried to illustrate each one with a suitable example for better understanding.
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time, I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, Scikit-Learn, etc., for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, etc.