When building an iterator in Python, we have to do a lot of work: implement a class with __iter__() and __next__() methods, keep track of the internal state, and raise StopIteration when there are no more values to return. This is both lengthy and counterintuitive. Generators come to the rescue in such situations. Python generators are a simple way of creating iterators: they handle all of the work mentioned above implicitly. Simply put, a generator is a function that returns an object (an iterator) which we can iterate over, one value at a time.
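To make the comparison concrete, here is a minimal sketch of a hand-written iterator class next to a generator that produces the same values. The names CountUpTo and count_up_to are hypothetical and chosen only for illustration.

```python
class CountUpTo:
    """Hand-written iterator that produces 1, 2, ..., limit."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0  # internal state we have to track ourselves

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        if self.current >= self.limit:
            # Signal that there are no more values.
            raise StopIteration
        self.current += 1
        return self.current


def count_up_to(limit):
    """Generator version: state and StopIteration are handled for us."""
    current = 0
    while current < limit:
        current += 1
        yield current


print(list(CountUpTo(3)))    # [1, 2, 3]
print(list(count_up_to(3)))  # [1, 2, 3]
```

Both versions can be used in the same places; the generator simply needs far less code.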
Create Generators In Python
Making a generator in Python is not difficult. It is as simple as defining a normal function, but with a yield statement instead of a return statement. A function becomes a generator function if it contains at least one yield statement (it may also include other yield or return statements). Both yield and return hand a value back from the function. The difference is that a return statement terminates the function completely, whereas a yield statement only pauses it, saving all of its state, and execution continues from that point on subsequent calls.
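As a small illustration of yield and return living in the same function, the sketch below uses a hypothetical generator, first_positive_run, that yields values until it meets a non-positive number and then returns. The function name and the sample data are assumptions made for this example.

```python
def first_positive_run(numbers):
    """Yield numbers until a non-positive value is seen."""
    for number in numbers:
        if number <= 0:
            # return ends the generator; the next next() call raises StopIteration.
            return
        # yield pauses here and resumes from this point on the next call.
        yield number


gen = first_positive_run([4, 7, -1, 9])
print(next(gen))  # 4
print(next(gen))  # 7
print(list(gen))  # [] because the return statement ended the generator at -1
```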
Differences between a Generator Function and a Normal Function
Here is how a generator function differs from a normal function.
- A generator function contains one or more yield statements.
- When called, it returns an object (iterator) but does not start execution immediately.
- Methods like __iter__() and __next__() are implemented automatically, so we can iterate through the items using next().
- Once the function yields, the function is paused and control is transferred to the caller.
- Local variables and their states are remembered between successive calls.
- Finally, when the function terminates, StopIteration is raised automatically on further calls.
Example:
def my_gen():
    n = 1
    print('This is printed first')
    # Generator function contains yield statements
    yield n
    n += 1
    print('This is printed second')
    yield n
    n += 1
    print('This is printed at last')
    yield n
# It returns an object but does not start execution immediately.
a = my_gen()
# We can iterate through the items using next().
print(next(a))
Output:
This is printed first
1
print(next(a))
Output:
This is printed second
2
print(next(a))
Output:
This is printed at last
3
# Finally, when the function terminates, StopIteration is raised automatically
print(next(a))
Output:
Traceback (most recent call last):
...
StopIteration
>>> next(a)
Traceback (most recent call last):
...
StopIteration
In the example above, it’s interesting to note that variable n’s value is retained between calls. In contrast to regular functions, when the function yields, the local variables are preserved. The generator object can also only be iterated once.
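The one-pass behaviour can be checked with a short sketch. It uses a trimmed copy of my_gen without the print calls (an assumption made to keep the output short) and consumes the same generator object twice.

```python
def my_gen():
    n = 1
    yield n
    n += 1
    yield n
    n += 1
    yield n


a = my_gen()
first_pass = list(a)   # consumes every value
second_pass = list(a)  # the same object is now exhausted
print(first_pass)      # [1, 2, 3]
print(second_pass)     # []

# Calling the generator function again creates a fresh generator object.
print(list(my_gen()))  # [1, 2, 3]
```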
We can also use generators with for loops directly. This is because a for loop takes an iterable, obtains an iterator from it, and repeatedly calls next() on that iterator. The loop ends automatically when StopIteration is raised.
# A simple generator function
def my_gen():
    n = 1
    print('This is printed first')
    # Generator function contains yield statements
    yield n
    n += 1
    print('This is printed second')
    yield n
    n += 1
    print('This is printed at last')
    yield n

# Using for loop
for item in my_gen():
    print(item)
Output:
This is printed first
1
This is printed second
2
This is printed at last
3
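The for loop above behaves roughly like the hand-written loop sketched below, which calls iter() and next() explicitly and stops on StopIteration. The generator count_to_three and the list collected are hypothetical names used only for this illustration.

```python
def count_to_three():
    yield 1
    yield 2
    yield 3


iterator = iter(count_to_three())  # a generator is its own iterator
collected = []
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break  # a for loop ends silently at this point
    collected.append(item)
    print(item)
```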
Python Generators with a Loop
Normally, generator functions are implemented with a loop that has a suitable termination condition. Example:
def rev_string(my_str):
    length = len(my_str)
    for i in range(length - 1, -1, -1):
        yield my_str[i]

# For loop to reverse the string
for char in rev_string('hello'):
    print(char)
Output:
o
l
l
e
h
Python Generator Expression
Generator expressions make it easy to create simple generators on the fly. Just as lambda functions create anonymous functions, generator expressions create anonymous generator functions.
The syntax of a generator expression is similar to that of a list comprehension in Python, but the square brackets are replaced with round parentheses.
The main difference between the two is that a list comprehension creates the full list, whereas a generator expression produces one item at a time.
Generator expressions are evaluated lazily (items are produced only when asked for). Because of this, a generator expression uses substantially less memory than an equivalent list comprehension.
# Initialize the list
my_list = [1, 3, 6, 10]
# Square each term using a list comprehension
list_ = [x**2 for x in my_list]
# The same thing can be done with a generator expression, which is surrounded by parentheses ()
generator = (x**2 for x in my_list)
print(list_)
Output: [1, 9, 36, 100]
print(generator)
Output: <generator object <genexpr> at 0x7f5d4eb4bf50>
We can see above that the generator expression did not produce the required result immediately. Instead, it returned a generator object, which produces items only on demand.
We can get items from a generator object like this:
# Initialize the list
my_list = [1,3,6,10]
a = (x**2 for x in my_list)
print(next(a))
print(next(a))
print(next(a))
print(next(a))
next(a)
Output:
1
9
36
100
Traceback (most recent call last):
File "<string>", line 15, in <module>
StopIteration
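The memory difference between the two forms can be observed with sys.getsizeof, as in the rough sketch below. The variable names and the input size of 100,000 are assumptions for illustration, and sys.getsizeof reports only the size of the container object itself, not of the integers it refers to.

```python
import sys

numbers = range(100_000)

squares_list = [x**2 for x in numbers]  # builds all 100,000 items immediately
squares_gen = (x**2 for x in numbers)   # builds nothing until items are requested

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # True

# A generator expression can be passed straight to a function such as sum().
print(sum(x**2 for x in numbers) == sum(squares_list))  # True
```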
Why Use Python Generators?
Several features make generators a powerful tool.
- Easy to Implement: Generators are easier to write than their iterator class counterparts. Example:
def PowTwoGen(max=0):
    n = 0
    while n < max:
        yield 2**n
        n += 1
- Memory Efficient: In a typical function, the full sequence is created in memory before the result is returned. If the sequence has a lot of items, this would be overkill. Since it only produces one item at a time, a generator implementation of such sequences is advantageous and memory friendly.
- Represent Infinite Stream: Generators are excellent mediums to represent an infinite stream of data. Infinite streams cannot be stored in memory, and since generators produce only one item at a time, they can represent an infinite stream of data.
- Pipelining Generators: Multiple generators can be used to pipeline a series of operations. Example: Suppose we have a generator that produces the Fibonacci numbers and another generator that squares numbers. To find the sum of squares of the numbers in the Fibonacci series, we can connect the output of one generator function to the input of the other, as shown below.
def fibonacci_numbers(nums):
    x, y = 0, 1
    for _ in range(nums):
        x, y = y, x + y
        yield x

def square(nums):
    for num in nums:
        yield num**2

print(sum(square(fibonacci_numbers(10))))
Output: 4895
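The infinite-stream point from the list above can be sketched in the same spirit. The generator all_even below is a hypothetical example that never terminates on its own, so itertools.islice is used to take a finite number of items from it.

```python
from itertools import islice


def all_even():
    """Yield 0, 2, 4, ... without end."""
    n = 0
    while True:
        yield n
        n += 2


# islice takes a finite slice from the endless stream.
first_five = list(islice(all_even(), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```

Only the requested items are ever computed, which is why the endless loop inside the generator is safe here.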