Understanding Python's yield Keyword and Generators

许毅
2023-12-01
  • yield from

    yield from iterator
    # is roughly equivalent to (ignoring send(), throw(), and return values):
    for x in iterator:
        yield x
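
As a small illustration of the equivalence above, here is a minimal sketch that uses yield from to flatten several iterables into one stream. The helper name chain_all is made up for this example and is not part of the standard library.

```python
def chain_all(*iterables):
    """Yield every item of each iterable in turn (hypothetical helper)."""
    for iterable in iterables:
        # Delegates to the iterable, same as: for x in iterable: yield x
        yield from iterable


flattened = list(chain_all([1, 2], "ab", range(3)))
print(flattened)  # [1, 2, 'a', 'b', 0, 1, 2]
```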
    
  • generator

    Generators were first introduced in Python 2.2 by PEP 255. They are also called generator iterators because they implement the iterator protocol. Because values are produced lazily, one at a time, generators also make infinite sequences possible.

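    A generator is used much like any other iterator. The difference shows up with large amounts of data: an eager function has to build the entire list first, which can take a lot of memory, whereas a generator produces only one value at a time, computing the next value from where it left off, so it uses far fewer resources.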
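
To make the memory claim concrete, the sketch below compares a fully built list with a generator expression over the same range. It assumes CPython; sys.getsizeof reports only the size of the container object itself, and the exact byte counts vary by version and platform, so only the comparison is shown.

```python
import sys

n = 100_000
eager = [i for i in range(n)]   # all n integers are stored at once
lazy = (i for i in range(n))    # generator expression: nothing computed yet

eager_size = sys.getsizeof(eager)  # grows with n
lazy_size = sys.getsizeof(lazy)    # small and independent of n

print(eager_size > lazy_size)  # True
total = sum(lazy)              # values are produced one at a time here
print(total == sum(eager))     # True
```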

    # Eager version: builds and returns the whole list
    def eager_range(up_to):
        """Create a list of integers, from 0 to up_to, exclusive."""
        sequence = []
        index = 0
        while index < up_to:
            sequence.append(index)
            index += 1
        return sequence
    # Generator version: yields one value at a time
    def lazy_range(up_to):
        """Generator to return the sequence of integers from 0 to up_to, exclusive."""
        index = 0
        while index < up_to:
            yield index
            index += 1
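
The following sketch shows what implementing the iterator protocol means in practice for lazy_range: the generator object is its own iterator, next() resumes it, and StopIteration signals the end. The function is repeated here only so the snippet runs on its own.

```python
def lazy_range(up_to):
    """Same generator as above, repeated to keep this snippet self-contained."""
    index = 0
    while index < up_to:
        yield index
        index += 1


gen = lazy_range(2)
print(iter(gen) is gen)  # True: a generator is its own iterator
first = next(gen)        # runs the body up to the first yield
second = next(gen)       # resumes right after the yield
print(first, second)     # 0 1

try:
    next(gen)            # the while loop ends, so the function returns
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)         # True
```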
    

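    Put another way, the eager version produces all of the data up front, while a generator produces one value at a time: the function pauses at each yield and resumes when the next value is requested. For infinite sequences a generator is clearly the only option. At this stage, though, a generator could only act as an iterator. That changed in Python 2.5, when PEP 342 made it possible not only to pause a generator but also to send values back into it.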
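
A short sketch of the infinite-sequence case: the generator below never finishes on its own, so the caller decides how many values to take. The name naturals is made up for this example; itertools.islice is from the standard library.

```python
from itertools import islice


def naturals(start=0):
    """Yield start, start + 1, start + 2, ... without end (illustrative)."""
    n = start
    while True:
        yield n
        n += 1


# An eager list of this sequence could never be built; islice takes a finite prefix.
first_five = list(islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```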

    def jumping_range(up_to):
        """Generator for the sequence of integers from 0 to up_to, exclusive.
    
        Sending a value into the generator will shift the sequence by that amount.
        """
        index = 0
        while index < up_to:
            jump = yield index
            if jump is None:
                jump = 1
            index += jump
    
    
    if __name__ == '__main__':
        iterator = jumping_range(5)
        print(next(iterator))  # 0
        print(iterator.send(2))  # 2
        print(next(iterator))  # 3
        print(iterator.send(-1))  # 2
        for x in iterator:
            print(x)  # 3, 4
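    # In this example, iterator is a generator.
    # 1. A generator remembers its context (local variables and position) between calls.
    # 2. Both iterator.send() and next(iterator) resume the function at the paused `jump = yield index` expression. send(2) makes 2 the value of `yield index`, so it is assigned to jump; with next() that value is None.
    # 3. Each resumption runs the rest of the current loop iteration after `jump = yield index`, goes back to the top of the while loop, and runs until `jump = yield index` pauses again and hands back the current value of index.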
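
One consequence of the notes above is that a generator has to reach its first yield before it can receive a value, which is why the example starts with next(iterator). The sketch below shows this with a made-up generator named counter; it is an illustration, not code from the original example.

```python
def counter():
    """Yield how many values have been received so far (illustrative)."""
    count = 0
    while True:
        yield count   # any sent value becomes the result of this expression
        count += 1


fresh = counter()
try:
    fresh.send("too early")   # no yield is waiting for a value yet
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)   # TypeError

primed = counter()
print(next(primed))           # 0: run to the first yield ("priming")
print(primed.send("a"))       # 1
print(primed.send(None))      # 2: send(None) behaves like next()
```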
    

    Python 3.3 added yield from through PEP 380 to improve generators, letting one generator delegate to a subgenerator:

    def lazy_range(up_to):
        """Generator to return the sequence of integers from 0 to up_to, exclusive."""
        index = 0
        def gratuitous_refactor():
            nonlocal index
            while index < up_to:
                yield index
                index += 1
        yield from gratuitous_refactor()
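
Beyond refactoring, yield from also forwards send() to the subgenerator and evaluates to the subgenerator's return value, which a plain for loop does not do. The sketch below illustrates this; the names inner and outer and the results list are assumptions made for the example.

```python
def inner():
    """Accumulate sent numbers; sending None ends it and returns the total."""
    total = 0
    while True:
        value = yield total
        if value is None:
            return total      # becomes the value of `yield from inner()`
        total += value


def outer(results):
    """Delegate to inner() and record its return value."""
    result = yield from inner()
    results.append(result)


results = []
gen = outer(results)
print(next(gen))      # 0: runs through outer into inner's first yield
print(gen.send(3))    # 3: forwarded to inner by yield from
print(gen.send(4))    # 7
try:
    gen.send(None)    # inner returns 7, then outer finishes
except StopIteration:
    pass
print(results)        # [7]
```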
    
  • Generator summary

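    1. Python 2.2 introduced generators (PEP 255), which let a function pause partway through its execution;
    2. Python 2.5 added PEP 342, so a paused generator can also receive data, which makes coroutines possible;
    3. Python 3.3 added PEP 380, which improves generators with yield from and makes it possible to chain them together.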
  • Reference

  1. How the heck does async/await work in Python 3.5?