python基础之四

本博客所有文章仅仅是博主做笔记之用,博客内容并不详细(以后有空会修改完善),思维也有跳跃之处,想详细学习博客内容可参考文章后面的参考链接,祝学习快乐。


生成器和迭代器

生成器

通过列表生成式,我们可以直接创建一个列表。但是,受到内存限制,列表容量肯定是有限的。而且,创建一个包含100万个元素的列表,不仅占用很大的存储空间,如果我们仅仅需要访问前面几个元素,那后面绝大多数元素占用的空间都白白浪费了。

所以,如果列表元素可以按照某种算法推算出来,那我们是否可以在循环的过程中不断推算出后续的元素呢?这样就不必创建完整的list,从而节省大量的空间。在Python中,这种一边循环一边计算的机制,称为生成器:generator。

要创建一个generator,有很多种方法。第一种方法很简单,只要把一个列表生成式的[]改成(),就创建了一个generator:

>>> L = [x*x for x in range(10)]
>>> L
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> g=(x*x for x in range(10))
>>> g
<generator object <genexpr> at 0x036B28D0>
>>> 

创建L和g的区别仅在于最外层的[]和(),L是一个list,而g是一个generator。可以通过next()函数获得generator的下一个返回值:

>>> next(g)
0
>>> next(g)
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
16
>>> next(g)
25
>>> next(g)
36
>>> next(g)
49
>>> next(g)
64
>>> next(g)
81
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

generator保存的是算法,每次调用next(g),就计算出g的下一个元素的值,直到计算到最后一个元素,没有更多的元素时,抛出StopIteration的错误。

yield

关于yield

The yield statement is only used when defining a generator function, and is only used in the body of the generator function. Using a yield statement in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.
When a generator function is called, it returns an iterator known as a generator iterator, or more commonly, a generator. The body of the generator function is executed by calling the generator’s next() method repeatedly until it raises an exception.
When a yield statement is executed, the state of the generator is frozen and the value of expression_list is returned to next()’s caller. By “frozen” we mean that all local state is retained, including the current bindings of local variables, the instruction pointer, and the internal evaluation stack: enough information is saved so that the next time next() is invoked, the function can proceed exactly as if the yield statement were just another external call.
The yield statement is not allowed in the try clause of a try … finally construct. The difficulty is that there’s no guarantee the generator will ever be resumed, hence no guarantee that the finally block will ever get executed.

generator和函数的执行流程不一样。函数是顺序执行,遇到return语句或者最后一行函数语句就返回。而变成generator的函数,在每次调用next()的时候执行,遇到yield语句返回,再次执行时从上次返回的yield语句处继续执行。

还可通过yield实现在单线程的情况下实现并发运算的效果 

#_*_coding:utf-8_*_
__author__ = 'Alex Li'

import time
def consumer(name):
    print("%s 准备吃包子啦!" %name)
    while True:
       baozi = yield

       print("包子[%s]来了,被[%s]吃了!" %(baozi,name))


def producer(name):
    c = consumer('A')
    c2 = consumer('B')
    c.__next__()
    c2.__next__()
    print("老子开始准备做包子啦!")
    for i in range(10):
        time.sleep(1)
        print("做了2个包子!")
        c.send(i)
        c2.send(i)

producer("alex")

yield from

python 3新加内容

yield from expression, allowing a generator to delegate part of its operations to another generator. This allows a section of code containing yield to be factored out and placed in another generator. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator.

While designed primarily for use in delegating to a subgenerator, the yield from expression actually allows delegation to arbitrary subiterators.

For simple iterators, yield from iterable is essentially just a shortened form of for item in iterable: yield item

>>> def g(x):
    yield from range(x, 0, -1)
    yield from range(x)
... ... ... 
>>> list(g(5))
[5, 4, 3, 2, 1, 0, 1, 2, 3, 4]

However, unlike an ordinary loop, yield from allows subgenerators to receive sent and thrown values directly from the calling scope, and return a final value to the outer generator

>>> def accumulate():
...     tally = 0
...     while 1:
...         next = yield
...         if next is None:
...             return tally
...         tally += next
...
>>> def gather_tallies(tallies):
...     while 1:
...         tally = yield from accumulate()
...         tallies.append(tally)
...
>>> tallies = []
>>> acc = gather_tallies(tallies)
>>> next(acc)  # Ensure the accumulator is ready to accept values
>>> for i in range(4):
...     acc.send(i)
...
>>> acc.send(None)  # Finish the first tally
>>> for i in range(5):
...     acc.send(i)
...
>>> acc.send(None)  # Finish the second tally
>>> tallies
[6, 10]

The main principle driving this change is to allow even generators that are designed to be used with the send and throw methods to be split into multiple subgenerators as easily as a single large function can be split into multiple subfunctions.

迭代器

可以直接作用于for循环的数据类型有以下几种:

一类是集合数据类型,如list、tuple、dict、set、str等;

一类是generator,包括生成器和带yield的generator function。

这些可以直接作用于for循环的对象统称为可迭代对象:Iterable。

可以使用isinstance()判断一个对象是否是Iterable对象:

>>> from collections import Iterable
>>> isinstance([], Iterable)
True
>>> isinstance({}, Iterable)
True
>>> isinstance('abc', Iterable)
True
>>> isinstance((x for x in range(10)), Iterable)
True
>>> isinstance(100, Iterable)
False

而生成器不但可以作用于for循环,还可以被next()函数不断调用并返回下一个值,直到最后抛出StopIteration错误表示无法继续返回下一个值了。

*可以被next()函数调用并不断返回下一个值的对象称为迭代器:Iterator。

>>> from collections import Iterator
>>> isinstance((x for x in range(10)), Iterator)
True
>>> isinstance([], Iterator)
False
>>> isinstance({}, Iterator)
False
>>> isinstance('abc', Iterator)
False

生成器都是Iterator对象,但list、dict、str虽然是Iterable,却不是Iterator。

把list、dict、str等Iterable变成Iterator可以使用iter()函数:

>>> isinstance(iter([]), Iterator)
True
>>> isinstance(iter('abc'), Iterator)
True

为什么list、dict、str等数据类型不是Iterator?

这是因为Python的Iterator对象表示的是一个数据流,Iterator对象可以被next()函数调用并不断返回下一个数据,直到没有数据时抛出StopIteration错误。可以把这个数据流看做是一个有序序列,但我们却不能提前知道序列的长度,只能不断通过next()函数实现按需计算下一个数据,所以Iterator的计算是惰性的,只有在需要返回下一个数据时它才会计算。

Iterator甚至可以表示一个无限大的数据流,例如全体自然数。而使用list是永远不可能存储全体自然数的。

小结

凡是可作用于for循环的对象都是Iterable类型;

凡是可作用于next()函数的对象都是Iterator类型,它们表示一个惰性计算的序列;

集合数据类型如list、dict、str等是Iterable但不是Iterator,不过可以通过iter()函数获得一个Iterator对象。

Python的for循环本质上就是通过不断调用next()函数实现的,例如:

for x in [1, 2, 3, 4, 5]:
    pass

实际上完全等价于:

# 首先获得Iterator对象:
it = iter([1, 2, 3, 4, 5])
# 循环:
while True:
    try:
        # 获得下一个值:
        x = next(it)
    except StopIteration:
        # 遇到StopIteration就退出循环
        break

参考资料
1. http://www.cnblogs.com/alex3714/articles/5765046.html
2. https://docs.python.org/3/whatsnew/3.3.html#pep-380
3. http://www.cnblogs.com/wupeiqi/articles/4980620.html

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章