Python Generator vs Iterator Performance [Complete guide]

Python generators and iterators both let you step through a sequence of values one item at a time. Although they serve the same purpose, they are written differently and have different performance implications. In this tutorial, we’ll go through what generators and iterators are, how to create them, and analyze their performance.

Python generator vs iterator

Iterators

  • An iterator is an object that implements the iterator protocol, which consists of the methods __iter__() and __next__().
  • Iterators allow lazy evaluation, which means the next value in the sequence is computed on demand.
  • They are used for traversing through containers like lists, tuples, etc.
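
The protocol described above can be exercised by hand with the built-in iter() and next() functions, which call __iter__() and __next__() behind the scenes. The following is a minimal sketch; the list `letters` is only an illustrative value.

```python
letters = ["a", "b", "c"]

# iter() calls letters.__iter__() and returns a list iterator
it = iter(letters)

# An iterator returns itself from __iter__()
returns_self = iter(it) is it

first = next(it)   # calls it.__next__()
second = next(it)
third = next(it)

# Once the values run out, __next__() raises StopIteration
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, returns_self, exhausted)
```

A for loop performs these same steps for you and stops quietly when StopIteration is raised.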

Generators

  • Generators are a simple way of creating iterators.
  • They also allow lazy evaluation but use a function with the yield statement instead of implementing the __iter__() and __next__() methods by hand.
  • Generators automatically maintain the state between successive calls.
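
To make the state-keeping point concrete, here is a small sketch that drives a generator with next() instead of a for loop. The function name `countdown` is a made-up example, not something from a library.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    current = start
    while current > 0:
        yield current   # execution pauses here until the next next() call
        current -= 1    # and resumes here with `current` still intact

gen = countdown(3)

# A generator object already provides both protocol methods
has_protocol = hasattr(gen, "__iter__") and hasattr(gen, "__next__")

# Each next() call resumes the function body where it left off
values = [next(gen), next(gen), next(gen)]

print(has_protocol, values)
```

No class or instance attribute is needed: the local variable `current` survives between calls because the function is suspended rather than exited.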

Python Generators

How to Create a Generator

You can create a generator by defining a function that uses the yield keyword. Calling the function returns a generator object, and its body runs only as values are requested.

def count_up_to(max):
    count = 1
    while count <= max:
        yield count
        count += 1

# Using the generator
for number in count_up_to(5):
    print(number)

Running this prints the numbers 1 through 5, one per line.

Performance Analysis

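Python generators are memory efficient: they do not hold all of their values in memory but produce each one on the fly. Note that sys.getsizeof reports only the size of the object itself, so for the list it counts the array of references but not the integer objects they point to.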

import sys

# List comprehension uses a lot of memory
numbers_list = [x for x in range(1000000)]
print("Size of the list:", sys.getsizeof(numbers_list), "bytes")

# Generator expression uses less memory
numbers_generator = (x for x in range(1000000))
print("Size of the generator:", sys.getsizeof(numbers_generator), "bytes")
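
Because sys.getsizeof only looks at the outer object, a fuller picture comes from measuring peak allocation while the data is actually consumed. The sketch below uses the standard tracemalloc module for that. The helper `peak_bytes` and the smaller `N` are assumptions made for this illustration, and the exact byte counts will differ between machines and Python versions.

```python
import tracemalloc

def peak_bytes(make_iterable):
    """Sum the iterable built by make_iterable; return (total, peak traced bytes)."""
    tracemalloc.start()
    total = sum(make_iterable())
    _, peak = tracemalloc.get_traced_memory()  # returns (current, peak)
    tracemalloc.stop()
    return total, peak

N = 100_000  # smaller than the 1,000,000 used above, to keep the run short

list_total, list_peak = peak_bytes(lambda: [x for x in range(N)])
gen_total, gen_peak = peak_bytes(lambda: (x for x in range(N)))

print("Peak with list:     ", list_peak, "bytes")
print("Peak with generator:", gen_peak, "bytes")
```

Both versions compute the same sum, but the list version has to keep every element alive at once, while the generator version only needs one value at a time.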

Python Iterators

How to Create an Iterator

To create an iterator, you need to define a class that implements the __iter__() and __next__() methods.

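class CountUpTo:
    def __init__(self, max):
        self.max = max
        self.count = 1  # start at 1 so the output matches count_up_to()

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        # Signal the end of iteration once the limit is passed
        if self.count > self.max:
            raise StopIteration
        value = self.count
        self.count += 1
        return value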

# Using the iterator
counter = CountUpTo(5)
for number in counter:
    print(number)

Performance Analysis

Class-based iterators also don’t store all the values in memory; they compute each one on demand. The trade-off is verbosity: you have to define a class and keep track of the state yourself, which a generator does for you.

import sys

# Iterator
class RangeSquared:
    def __init__(self, n):
        self.n = n
        self.i = 0
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        else:
            result = self.i ** 2
            self.i += 1
            return result

# Memory consumption of the iterator object itself (objects it refers to are not counted)
numbers_iterator = RangeSquared(1000000)
print("Size of the iterator:", sys.getsizeof(numbers_iterator), "bytes")
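
Memory is only half of the performance picture, so it can also be worth timing the two approaches. The sketch below repeats the RangeSquared class so that it runs on its own and compares it with an equivalent generator function. The name `range_squared`, the value of `N`, and the repeat count are assumptions for this illustration. On CPython the generator version is often faster, because every __next__() call on the class is an ordinary Python method call, but results depend on the interpreter and machine, so measure on your own setup.

```python
import timeit

class RangeSquared:
    """Class-based iterator over 0**2, 1**2, ..., (n - 1)**2."""

    def __init__(self, n):
        self.n = n
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        result = self.i ** 2
        self.i += 1
        return result

def range_squared(n):
    """Generator producing the same values as RangeSquared(n)."""
    for i in range(n):
        yield i ** 2

N = 100_000

# Time five full passes over each kind of iterator
class_time = timeit.timeit(lambda: sum(RangeSquared(N)), number=5)
gen_time = timeit.timeit(lambda: sum(range_squared(N)), number=5)

print(f"Class-based iterator: {class_time:.3f} s")
print(f"Generator function:   {gen_time:.3f} s")
```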

Generators vs Iterators: Differences

Attribute              | Generator                              | Iterator
-----------------------|----------------------------------------|--------------------------------
Creation               | Function with yield keyword            | Class with __iter__ & __next__
Memory Consumption     | Low (generates on-the-fly)             | Low (generates on-the-fly)
Code Verbosity         | Less verbose                           | More verbose
Use of State           | Automatic state saving between yields  | Must manage state manually
Use of Lazy Evaluation | Yes                                    | Yes
Use Case               | Simple and memory-efficient iteration  | Customized iteration control
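
As one illustration of what "customized iteration control" can mean, a class-based iterator can expose extra methods alongside the protocol. The sketch below adds a peek() method that looks at the next value without consuming it. The class name `PeekableCounter` and the peek() method are hypothetical and are not part of Python's iterator protocol.

```python
class PeekableCounter:
    """Counts from 1 up to limit and lets the caller look ahead."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.current > self.limit:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

    def peek(self):
        """Return the next value without consuming it, or None when exhausted."""
        return self.current if self.current <= self.limit else None

counter = PeekableCounter(3)
peeked = counter.peek()        # look ahead; nothing is consumed
first = next(counter)          # same value that peek() reported
rest = list(counter)           # consume the remaining values
after_end = counter.peek()     # None once the counter is exhausted

print(peeked, first, rest, after_end)
```

Because the state lives in ordinary attributes, adding this kind of helper is straightforward in a class, whereas a plain generator function exposes no such hook.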

Conclusion

When you need to iterate through a sequence of data without loading all the data into memory, both generators and iterators are valuable. Generators are simpler to write and should be your choice for most use cases. However, if you need more control over the iteration process, implementing a custom iterator could be the better option.

I hope this gave you a clear picture of how Python generators and iterators compare on performance.
