Generators in Python are cool. They help developers deal with long sequences, make producing the next item cheap, and are really simple to use. Let's dive in.
Definition of Generators
Generator functions let you declare a function that behaves like an iterator, but a lazy one. What does that mean? A generator returns an object that does not contain any of the items at all. You may ask: how can that be used in a for loop? Simple: the generator stores the definition and the current state of the generating function and produces items only when the application asks for them. Magic! I'm not great at definitions, so let me jump straight to the examples.
Simple
To turn a regular function into a generator function, you have to add at least one yield keyword to its body. The interpreter will then return a lazy, list-like object called a generator instead of a list (assuming we are building something iterable).
def simple_generator(n):
    i = 0
    while i < n:
        yield i
        i += 1

LIMIT = 5
gen = simple_generator(LIMIT)
print(gen)
for i in gen:
    print(i)
$ python3 ./gen.py
<generator object simple_generator at 0x7ff932ad5b30>
0
1
2
3
4
In comparison, a plain function behaves a bit differently:
def simple_function(n):
    i = 0
    a = list()
    while i < n:
        a.append(i)
        i += 1
    return a

LIMIT = 5
fun = simple_function(LIMIT)
print(fun)
for i in fun:
    print(i)
$ python3 ./fun.py
[0, 1, 2, 3, 4]
0
1
2
3
4
As you probably noticed, in the first case simple_generator returns a generator object; the rest of the items are the same.
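If you want to see the laziness explicitly, you can drive a generator by hand with next(). This is a minimal sketch of my own, reusing the simple_generator defined above:
gen = simple_generator(3)
print(next(gen))  # 0 -- computed only now, on request
print(next(gen))  # 1
print(next(gen))  # 2
# one more next(gen) would raise StopIteration, which is exactly how a for loop knows when to stop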
Expression
In addition, generators can be created on the fly using generator expressions, which are, to be honest, anonymous generator functions. To create a generator this way, take a list comprehension and replace the square brackets with parentheses. Quite simple, isn't it?
base = [2, 4, 6]
comprehension = [x**2 for x in base]
generator = (x**2 for x in base)
print(comprehension)
print(generator)
$ python3 ./expr.py
[4, 16, 36]
<generator object <genexpr> at 0x7f52b80f1b30>
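One extra note of mine, not shown in the snippet above: a generator expression can be consumed only once, and you often feed it straight into a function such as sum() without ever building a list.
base = [2, 4, 6]
generator = (x**2 for x in base)
print(sum(generator))  # 56 -- consumes the generator
print(sum(generator))  # 0  -- it is already exhausted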
Use cases in the Python world
Generators are used quite often in the Python world. Let's break it down.
Save memory
There are many reasons to use generators. Above all, they save memory. Keeping only one item plus the state of the function/expression in memory saves a lot of RAM. This is especially visible with large amounts of data, for instance large files or lists.
The example uses the wordlist.10000 file downloaded from here. I was inspired by https://stackoverflow.com/a/45679009/7678529 and finally came up with the piece of code below.
from collections import Counter
from time import sleep
import tracemalloc

def regular_func(fname):
    # Reads the whole file into memory at once.
    with open(fname) as fp:
        return fp.readlines()

def generator_func(fname):
    # Yields one line at a time; only one line is kept in memory.
    with open(fname) as fp:
        for line in fp:
            yield line

def workload_reg():
    sleep(3)
    counts = Counter()
    fname = 'wordlist.10000'
    words = regular_func(fname)
    for word in words:
        prefix = word[:4]
        counts[prefix] += 1
        sleep(0.0001)
    most_common = counts.most_common(3)
    sleep(4)
    return most_common

def workload_gen():
    sleep(3)
    counts = Counter()
    fname = 'wordlist.10000'
    for word in generator_func(fname):
        prefix = word[:4]
        counts[prefix] += 1
        sleep(0.0001)
    most_common = counts.most_common(3)
    sleep(4)
    return most_common

tracemalloc.start()
res = workload_reg()
current, peak = tracemalloc.get_traced_memory()
print(f"[Regular] Current memory usage: {current / 10**3}KB; Peak: {peak / 10**3}KB")
tracemalloc.stop()

tracemalloc.start()
res = workload_gen()
current, peak = tracemalloc.get_traced_memory()
print(f"[Generator] Current memory usage: {current / 10**3}KB; Peak: {peak / 10**3}KB")
tracemalloc.stop()
And the output:
$ python3 mem.py
[Regular] Current memory usage: 3.043KB; Peak: 990.585KB
[Generator] Current memory usage: 1.542KB; Peak: 22.997KB
As you can see, for this example (file size: 70 KB) the generator saves roughly 970 KB of peak memory.
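If you only want to compare the size of the objects themselves (a rough sketch of mine, not the tracemalloc measurement above), sys.getsizeof shows the same idea: the list grows with the number of items, while the generator object stays tiny. The exact byte counts depend on your Python build.
import sys

as_list = [x**2 for x in range(100_000)]
as_gen = (x**2 for x in range(100_000))

print(sys.getsizeof(as_list))  # hundreds of kilobytes for the list object
print(sys.getsizeof(as_gen))   # a couple of hundred bytes, no matter how long the range is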
Speed up
For a large list, computing all of the elements can take a lot of CPU cycles and pauses the execution of the rest of the code. In such circumstances stream processing is desirable: the pipeline moves forward as soon as a new item is ready. Generators give you exactly that. In short, generators keep your application moving.
# fun.py
from time import sleep

def simple_function(n):
    i = 0
    a = list()
    while i < n:
        sleep(3)
        a.append(i)
        i += 1
    return a

LIMIT = 5
fun = simple_function(LIMIT)
print(fun)
for i in fun:
    print(i)
    break
# gen.py
from time import sleep

def simple_generator(n):
    i = 0
    while i < n:
        sleep(3)
        yield i
        i += 1

LIMIT = 5
gen = simple_generator(LIMIT)
print(gen)
for i in gen:
    print(i)
    break
Output
$ time python3 gen.py
<generator object simple_generator at 0x7fd54a3a9b30>
0
real 0m3.041s
user 0m0.022s
sys 0m0.007s
$ time python3 fun.py
[0, 1, 2, 3, 4]
0
real 0m15.043s
user 0m0.022s
sys 0m0.000s
In the example above, the generator moves on to the next line of the script almost immediately. In contrast, the regular function computes the whole list first and only then moves on, so the application stalls for several seconds.
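Generators also chain nicely into pipelines, which is another way to keep things streaming: each stage pulls one item at a time from the previous one. Here is a small sketch of my own illustrating the same idea (the function names are just for illustration):
from time import sleep

def produce(n):
    for i in range(n):
        sleep(1)          # pretend each item is expensive to make
        yield i

def square(items):
    for item in items:
        yield item ** 2

# The first squared value is printed after roughly one second,
# without waiting for the whole sequence to be produced.
for value in square(produce(5)):
    print(value)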
Infinite loop generators
Although an infinite loop can be created in several ways, this one is really flexible.
import requests

def infinite_simple():
    num = 0
    while True:
        yield num
        num += 1

def infinite_more_complex():
    num = 0
    while True:
        yield num
        res = requests.get("https://google.com")
        duration = res.elapsed.total_seconds()
        num += duration * 2

i = 0
for val in infinite_more_complex():
    print(val)
    if i > 5:
        break
    i += 1
# output
$ python3 inf.py
0
0.356432
0.69979
1.020456
1.305722
1.624428
2.042626
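If the manual counter around the break feels clunky, itertools.islice from the standard library takes a bounded slice of an infinite generator. A small sketch of mine, using infinite_simple from above:
from itertools import islice

# Take the first 5 values from an otherwise endless generator.
for val in islice(infinite_simple(), 5):
    print(val)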
Conclusion
Generators are a big part of the Python world. They are really useful, flexible and easy to use, so don't be afraid to use them.