Python Generators

Generators in Python are cool. They help developers deal with long lists, produce each item only when it is needed, and are really simple to use. Let's dive in.

Definition of generators

Generator functions let you declare a function that behaves like an iterator, but a lazy one. What does that mean? Calling a generator function returns an object that does not hold any of the items. You may ask: how can it be used in for loops? It's really simple. A generator stores the definition and the state of the generating function and produces an item only when the application asks for one. Magic! I'm not good at definitions, so let me jump straight to the examples.
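To make the laziness concrete, here is a minimal sketch that steps through a generator by hand with the built-in next(). The names countdown and log are made up for this illustration; the log list only records when the function body actually runs.

```python
log = []  # records when the generator body actually runs

def countdown(n):
    # hypothetical example generator, used only for this illustration
    log.append("started")
    while n > 0:
        yield n
        n -= 1
    log.append("finished")

gen = countdown(2)
print(log)                # [] : calling the function runs none of its body yet
print(next(gen))          # 2 : the body runs up to the first yield
print(next(gen))          # 1 : execution resumes right after the previous yield
print(next(gen, "done"))  # done : the generator is exhausted, the default is returned
print(log)                # ['started', 'finished']
```

A for loop does the same thing behind the scenes: it calls next() until the generator signals that it has no more items.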

Simple

To turn a regular function into a generator function, add at least one yield keyword to its body. When the function is called, the interpreter automatically returns a lazy iterator object called a generator instead of a list (assuming the function would otherwise build and return a list).

def simple_generator(n):
    i = 0
    while i < n:
        yield i
        i += 1

LIMIT = 5
gen = simple_generator(LIMIT)
print(gen)
for i in gen:
    print(i)


$ python3 ./gen.py 
<generator object simple_generator at 0x7ff932ad5b30>
0
1
2
3
4

In comparison, a regular function behaves a bit differently:

def simple_function(n):
    i = 0
    a = list()
    while i < n:
        a.append(i)
        i += 1
    return a

LIMIT = 5
fun = simple_function(LIMIT)
print(fun)
for i in fun:
    print(i)

$ python3 ./fun.py 
[0, 1, 2, 3, 4]
0
1
2
3
4

As you probably noticed, in the first case simple_generator returns a generator object; the rest of the output is the same.
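One practical difference is not visible in the output above: a generator can be consumed only once, while a list can be iterated repeatedly. A minimal sketch, reusing simple_generator from the example (the variable names are chosen just for this illustration):

```python
def simple_generator(n):
    i = 0
    while i < n:
        yield i
        i += 1

gen = simple_generator(3)
first_pass = list(gen)   # consumes every item
second_pass = list(gen)  # nothing left: the generator is exhausted

numbers = [0, 1, 2]
list_first = list(numbers)
list_second = list(numbers)  # a list can be iterated again and again

print(first_pass, second_pass)  # [0, 1, 2] []
print(list_first, list_second)  # [0, 1, 2] [0, 1, 2]
```

If the items are needed a second time, call the generator function again to get a fresh generator.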

Expression

In addition, generators can be created on the fly using generator expressions, which are essentially anonymous generator functions. To create a generator this way, take a list comprehension and replace the square brackets with round parentheses. Quite simple, isn't it?

base = [2, 4, 6]

comprehension = [x**2 for x in base]
generator = (x**2 for x in base)

print(comprehension)
print(generator)

$ python3 ./expr.py 
[4, 16, 36]
<generator object <genexpr> at 0x7f52b80f1b30>
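The sketch below shows two things about generator expressions: the generator object stays small no matter how many items it will produce, and it can be handed directly to a consuming function such as sum(). The sample size of 100,000 is an arbitrary choice for this illustration, and sys.getsizeof reports only the size of the container object itself, not of the items it refers to.

```python
import sys

base = range(100_000)  # assumed sample size, picked only for illustration

squares_list = [x**2 for x in base]  # all results stored at once
squares_gen = (x**2 for x in base)   # results produced on demand

list_size = sys.getsizeof(squares_list)  # size of the list object itself
gen_size = sys.getsizeof(squares_gen)    # small and independent of len(base)
print(list_size > gen_size)  # True

# A generator expression can be passed straight to a consuming function;
# the extra parentheses are optional when it is the only argument.
total = sum(x**2 for x in [2, 4, 6])
print(total)  # 56
```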

Use cases in the Python world

Generators are used quite often in the Python world. Let's break it down.

Save memory

There are many reasons to use generators. Above all, they save memory. Keeping only one item and the state of the function/expression in memory saves a lot of RAM. This is especially visible when working with large amounts of data, for instance large files or lists.

The example uses a file downloaded from here. I was inspired by https://stackoverflow.com/a/45679009/7678529 and ended up with the piece of code below.

from collections import Counter
from time import sleep
import tracemalloc


def regular_func(fname):
    # Reads the whole file into a list of lines at once.
    with open(fname) as fp:
        return fp.readlines()

def generator_func(fname):
    # Yields the file line by line; a single `yield fp.readline()`
    # would produce only the first line and then stop.
    with open(fname) as fp:
        for line in fp:
            yield line

def workload_reg():
    sleep(3)
    counts = Counter()
    fname = 'wordlist.10000'
    words = regular_func(fname)
    for word in words:
        prefix = word[:4]
        counts[prefix] += 1
        sleep(0.0001)
    most_common = counts.most_common(3)
    sleep(4)
    return most_common

def workload_gen():
    sleep(3)
    counts = Counter()
    fname = 'wordlist.10000'
    for word in generator_func(fname):
        prefix = word[:4]
        counts[prefix] += 1
        sleep(0.0001)
    most_common = counts.most_common(3)
    sleep(4)
    return most_common


# Measure the list-based version.
tracemalloc.start()
res = workload_reg()
current, peak = tracemalloc.get_traced_memory()
print(f"[Regular] Current memory usage: {current / 10**3}KB; Peak: {peak / 10**3}KB")
tracemalloc.stop()


# Measure the generator-based version (tracing restarts from zero).
tracemalloc.start()
res = workload_gen()
current, peak = tracemalloc.get_traced_memory()
print(f"[Generator] Current memory usage: {current / 10**3}KB; Peak: {peak / 10**3}KB")
tracemalloc.stop()

And the output:

$ python3 mem.py 
[Regular] Current memory usage: 3.043KB; Peak: 990.585KB
[Generator] Current memory usage: 1.542KB; Peak: 22.997KB

As you can see, for this example (file size: 70 KB) the generator saves circa 970 KB of peak memory.
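The same effect can be reproduced without any input file. The following is a minimal sketch, not the measurement above: it sums doubled numbers once from a list and once from a generator expression and compares the peak memory reported by tracemalloc. The helper peak_memory and the item count N are assumptions made up for this illustration.

```python
import tracemalloc

N = 100_000  # assumed item count, chosen only for the demonstration

def peak_memory(make_iterable):
    """Return the sum of the iterable and the peak traced memory in bytes."""
    tracemalloc.start()
    total = sum(make_iterable())
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return total, peak

list_total, list_peak = peak_memory(lambda: [x * 2 for x in range(N)])
gen_total, gen_peak = peak_memory(lambda: (x * 2 for x in range(N)))

print(list_total == gen_total)  # True: both compute the same sum
print(list_peak > gen_peak)     # True: the list holds all N items at once
```

The exact numbers depend on the Python version and platform, which is why the sketch prints only the comparison.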

Speed up

For a large list, calculating all the elements up front can take a lot of CPU cycles, and it pauses the rest of the code until the whole list is ready. In such circumstances stream processing is desirable: the pipeline moves forward as soon as a new item is ready. Generators give you this behaviour naturally. In short, generators keep your application moving.

# fun.py
from time import sleep
def simple_function(n):
    i = 0
    a = list()
    while i < n:
        sleep(3)
        a.append(i)
        i += 1
    return a

LIMIT = 5
fun = simple_function(LIMIT)
print(fun)
for i in fun:
    print(i)
    break
# gen.py
from time import sleep
def simple_generator(n):
    i = 0
    while i < n:
        sleep(3)
        yield i
        i += 1

LIMIT = 5
gen = simple_generator(LIMIT)
print(gen)
for i in gen:
    print(i)
    break

Output

$ time python3 gen.py 
<generator object simple_generator at 0x7fd54a3a9b30>
0

real    0m3.041s
user    0m0.022s
sys     0m0.007s

$ time python3 fun.py 
[0, 1, 2, 3, 4]
0

real    0m15.043s
user    0m0.022s
sys     0m0.000s

In the example above, the generator version reaches the next line as soon as the first item is ready (about 3 seconds). In contrast, the regular function calculates the whole list first and only then moves on, so the application is stuck for about 15 seconds.
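The stream-processing idea can be sketched by chaining generators into a small pipeline. The stage names produce and square and the events list are hypothetical, introduced only to show the order in which the work happens.

```python
events = []  # records the order in which work happens

def produce(n):
    # hypothetical source stage
    for i in range(n):
        events.append(f"produce {i}")
        yield i

def square(items):
    # hypothetical transform stage; pulls one item at a time from upstream
    for item in items:
        events.append(f"square {item}")
        yield item * item

pipeline = square(produce(3))
first = next(pipeline)  # only the work needed for the first item is done
print(first)   # 0
print(events)  # ['produce 0', 'square 0']

rest = list(pipeline)  # the remaining items, still interleaved stage by stage
print(rest)    # [1, 4]
```

Each item travels through every stage before the next one is produced, so the first result is available long before the whole input has been processed.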

Infinite loop generators

Although an infinite loop can be created in different ways, this one is really flexible.

import requests

def infinite_simple():
    num = 0
    while True:
        yield num
        num += 1

def infinite_more_complex():
    num = 0
    while True:
        yield num
        res = requests.get("https://google.com")
        duration = res.elapsed.total_seconds()
        num += duration * 2

i = 0
for val in infinite_more_complex():
    print(val)
    if i > 5:
        break
    i += 1

# output
$ python3 inf.py 
0
0.356432
0.69979
1.020456
1.305722
1.624428
2.042626
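Instead of a manual counter and break, the standard library's itertools.islice can take a bounded slice of an infinite generator. A minimal sketch using infinite_simple from above (the variable name first_seven is chosen just for this illustration):

```python
from itertools import islice

def infinite_simple():
    num = 0
    while True:
        yield num
        num += 1

# islice stops after 7 items, so the loop needs no manual counter or break
first_seven = list(islice(infinite_simple(), 7))
print(first_seven)  # [0, 1, 2, 3, 4, 5, 6]
```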

Conclusion

Generators are a big part of the Python world. They are really useful, flexible and easy to use, so don't be afraid to use them.
