In this Python tutorial, I will explain what Pandas iterrows() is, walk through its syntax with some examples, and show how to use iterrows() to add a new column to a Pandas DataFrame.
To build a thorough understanding of Pandas iterrows in Python, this article presents two key examples: a basic usage scenario demonstrating row iteration, and a more advanced case illustrating the addition of a new column to a DataFrame. These examples highlight iterrows() as a useful tool for tailored data manipulation and row-specific calculations in Pandas.
Pandas iterrows in Python
iterrows() is a Pandas DataFrame method that returns a generator yielding pairs of index and row data. Each row is returned as a Pandas Series object, so it provides a way to iterate over DataFrame rows as (index, Series) pairs.
The typical syntax of the iterrows() function in Pandas is:
for index, row in dataframe.iterrows():
    # operations using 'row'
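To make the (index, Series) pairing concrete, here is a minimal sketch that pulls only the first pair out of the generator and inspects it. The two-row DataFrame and the variable names are assumptions made up for this illustration.

```python
import pandas as pd

# Hypothetical two-row DataFrame, used only for this illustration.
df = pd.DataFrame({"City": ["Chicago", "Houston"], "Sales": [15000, 16500]})

# iterrows() returns a generator; next() pulls the first (index, row) pair.
first_index, first_row = next(df.iterrows())

print(first_index)             # index label of the first row
print(type(first_row))         # each row arrives as a pandas Series
print(list(first_row.index))   # the Series is labelled by the column names
print(first_row["City"], first_row["Sales"])
```

Because the row is a Series, individual values are read with the column name, as in row['City'].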
The Pandas iterrows() function in Python is particularly useful when we need to perform operations on individual rows.
For example, we might want to calculate a value based on the data in each row or conditionally update row values.
import pandas as pd
# Sample store data
data = {
    'StoreID': [101, 102, 103, 104],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
    'State': ['NY', 'CA', 'IL', 'TX'],
    'Sales': [12000, 18500, 15000, 16500],
    'Date': ['2023-12-01', '2023-12-01', '2023-12-01', '2023-12-01']
}
df = pd.DataFrame(data)
# Visit each row and congratulate stores with sales above 15000
for index, row in df.iterrows():
    if row['Sales'] > 15000:
        print(f"Congratulations to store {row['StoreID']} in {row['City']} for outstanding sales performance!")
Output:
Congratulations to store 102 in Los Angeles for outstanding sales performance!
Congratulations to store 104 in Houston for outstanding sales performance!
The screenshot below shows the result of running the code in the PyCharm editor.
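The same loop can also update values conditionally. The sketch below assumes a hypothetical 'Status' column that is not part of the original example, and writes through df.at rather than assigning to the row, because depending on the data types the row can be a copy and changes to it may not reach the DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "StoreID": [101, 102, 103, 104],
    "Sales": [12000, 18500, 15000, 16500],
    "Status": ["Normal"] * 4,  # hypothetical column added for this sketch
})

for index, row in df.iterrows():
    if row["Sales"] > 15000:
        # Write to the DataFrame itself; assigning to row[...] would not
        # reliably update df because the row may be a copy.
        df.at[index, "Status"] = "Top performer"

print(df)
```

Only the rows that satisfy the condition are changed; the others keep their original 'Status' value.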
Add a new column to a Pandas DataFrame using iterrows
Pandas iterrows() function in Python can be used to add a new column to a DataFrame by iterating over each row and performing calculations or applying logic based on existing column values.
During iteration, we can assign a new value to a new column for each row, effectively extending the DataFrame with additional information.
For instance:
import pandas as pd
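data = {
    'StoreID': [101, 102, 103, 104],
    'Sales': [12000, 18500, 15000, 16500]
}
df = pd.DataFrame(data)
tax_rate = 0.08
# Create the new column first so every row has a float placeholder
df['SalesTax'] = 0.0
for index, row in df.iterrows():
    sales_tax = row['Sales'] * tax_rate
    # Write the computed value into the DataFrame at this row's index
    df.at[index, 'SalesTax'] = sales_tax
print(df)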
Output:
   StoreID  Sales  SalesTax
0      101  12000     960.0
1      102  18500    1480.0
2      103  15000    1200.0
3      104  16500    1320.0
The following screenshot shows the output after running the code in the PyCharm editor.
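A possible variation, sketched here as an alternative rather than as the method used above, is to collect the computed values in a plain Python list and assign the whole list as the new column once the loop has finished. This avoids writing to the DataFrame while iterating over it. The list name sales_tax_values is an assumption for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "StoreID": [101, 102, 103, 104],
    "Sales": [12000, 18500, 15000, 16500],
})
tax_rate = 0.08

# Collect one computed value per row (hypothetical helper list).
sales_tax_values = []
for index, row in df.iterrows():
    sales_tax_values.append(row["Sales"] * tax_rate)

# Assign the full list as a new column after the loop has finished.
df["SalesTax"] = sales_tax_values
print(df)
```

The list must contain exactly one value per row, in row order, for the assignment to line up with the DataFrame.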
Advantages of Using iterrows in Python
The iterrows() function in Pandas lets you write row-specific logic, such as conditions that branch on several columns at once, which can be harder to express with vectorized operations.
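As an illustration of that kind of row-wise logic, the sketch below applies a branching rule that looks at more than one column per row. The review_note function and its thresholds are hypothetical and exist only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "StoreID": [101, 102, 103, 104],
    "State": ["NY", "CA", "IL", "TX"],
    "Sales": [12000, 18500, 15000, 16500],
})

def review_note(row):
    """Hypothetical business rule with several branches, for illustration only."""
    if row["State"] == "CA" and row["Sales"] > 18000:
        return "flagship"
    if row["Sales"] >= 15000:
        return "on target"
    return "needs review"

# Build a StoreID -> note mapping by visiting each row in turn.
notes = {row["StoreID"]: review_note(row) for _, row in df.iterrows()}
print(notes)
```

Each call receives the whole row, so the rule can combine any columns it needs without reshaping the data first.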
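Limitations of the Pandas iterrows function in Python
Scalability: iterrows() processes one row at a time in a Python loop, so on significantly larger datasets it can be much slower than vectorized column operations.
Resource Intensive: iterrows() builds a new Series object for every row, which adds overhead on large datasets. Because each row is returned as a single Series, dtypes are also not preserved across the row (for example, integers can be converted to floats when the row also contains float values).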
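When these limitations matter, the same sales tax calculation can usually be written without iterrows(). The sketch below compares three routes on the small sample data: a vectorized column operation, itertuples(), and iterrows(). It only checks that the results agree and does not measure speed; the variable names are assumptions for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "StoreID": [101, 102, 103, 104],
    "Sales": [12000, 18500, 15000, 16500],
})
tax_rate = 0.08

# Vectorized: one operation over the whole column, no Python-level loop.
vectorized = (df["Sales"] * tax_rate).tolist()

# itertuples(): yields lightweight namedtuples instead of Series objects.
from_tuples = [t.Sales * tax_rate for t in df.itertuples(index=False)]

# iterrows(): the row-by-row approach, shown for comparison.
from_rows = [row["Sales"] * tax_rate for _, row in df.iterrows()]

print(vectorized)
print(from_tuples)
print(from_rows)
```

For simple arithmetic like this, the vectorized form is generally the one to prefer; iterrows() is better kept for logic that genuinely needs to inspect each row.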
Conclusion
Here, I have explained how Pandas iterrows works in Python through two examples: a basic illustration of iterating over DataFrame rows and a practical demonstration of using iterrows() to add a new column to a DataFrame.
These examples showcase iterrows() as a powerful tool for custom row-wise operations, emphasizing its adaptability in handling diverse data manipulation tasks in Pandas.
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, and scikit-learn, working for clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and elsewhere.