Why does itertools.takewhile in a generator function only evaluate once?

def readParag(fileObj):
    currentParag = []
    for line in fileObj:
        stripped = line.rstrip()
        if len(stripped) > 0:
            currentParag.append(stripped)
        elif len(currentParag) > 0:
            yield currentParag
            currentParag = []

3 Answers

User

Answer 1 · edited 2024-10-01 15:44:32

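This is exactly how .takewhile() behaves. While the condition is true, it returns elements from the underlying iterable; as soon as the condition is false, it switches permanently to the "iteration finished" state.

Note that this is how iterators behave in general; raising StopIteration means: stop iterating over me, I am done.

From the Python glossary entry on "iterator":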

An object representing a stream of data. Repeated calls to the iterator’s next() method return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its next() method just raise StopIteration again.
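The behaviour described above can be checked with a small sketch. The sample data here is hypothetical (a list of strings with an empty string standing in for a blank line); it is only meant to show that an exhausted takewhile stays exhausted, while a fresh takewhile over the same underlying iterator picks up where the previous one stopped.

```python
from itertools import takewhile

# Hypothetical sample data: two "paragraphs" separated by an empty string.
lines = iter(["a", "b", "", "c", "d"])

first = takewhile(bool, lines)    # stops at the first falsy item
first_items = list(first)         # items before the empty string
after_exhaustion = list(first)    # the same takewhile object is now exhausted

# The empty string was consumed by the first takewhile, so a new
# takewhile over the same underlying iterator resumes at "c".
second_items = list(takewhile(bool, lines))

print(first_items, after_exhaustion, second_items)
```

Note that the element that fails the predicate is consumed and discarded, which is why the second call does not see the empty string.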

You can combine takewhile with tee to peek at whether the next batch contains any more results:

import itertools

def readParag(filename):
    with open(filename) as f:
        while True:
            paras = itertools.takewhile(lambda l: l.strip(), f)
            test, paras = itertools.tee(paras)
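            try:
                next(test)  # peek; raises StopIteration when the file is done
            except StopIteration:
                return  # Python 3.7+ (PEP 479): StopIteration must not escape a generator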
            yield (l.strip() for l in paras)

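This yields a generator of generators: each item produced is itself a generator. You must consume every element of each inner generator before moving on to the next one; the same is true of the groupby approach listed in another answer.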
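The peek step in this answer can be isolated into a small helper. This is only a sketch: `has_more` is a hypothetical name, not part of itertools, and it assumes the caller continues with the returned iterator rather than the original one.

```python
from itertools import tee

def has_more(iterator):
    """Return (True, it) if at least one item remains, else (False, it).

    The returned iterator still yields every item, including the one
    that was peeked at, because tee buffers it for the second copy.
    """
    probe, rest = tee(iterator)
    try:
        next(probe)          # advance only the probe copy
    except StopIteration:
        return False, rest
    return True, rest

non_empty, items = has_more(iter([1, 2, 3]))
empty, nothing = has_more(iter([]))
remaining = list(items)
leftover = list(nothing)
```

After calling tee, the original iterator should not be used directly any more, since advancing it would make the tee copies skip items.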

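User

Answer 2 · edited 2024-10-01 15:44:32

The other answers explain well what is going on here: you need to call takewhile more than once, and your current generator does not do that. Here is a fairly concise way to get the desired behaviour, using the built-in iter() function with a sentinel argument: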

from itertools import takewhile

def readParag(fileObj):
    cond = lambda line: line != "\n"
    return iter(lambda: [ln.rstrip() for ln in takewhile(cond, fileObj)], [])
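To see the two-argument form of iter() at work, here is a sketch that runs the same idea against in-memory text. The name `read_paragraphs` and the sample strings are assumptions made for illustration; io.StringIO stands in for a real file.

```python
import io
from itertools import takewhile

def read_paragraphs(file_obj):
    """Return an iterator of paragraphs, each a list of stripped lines."""
    cond = lambda line: line != "\n"
    # iter(callable, sentinel) keeps calling the callable and stops
    # as soon as it returns a value equal to the sentinel ([] here).
    return iter(lambda: [ln.rstrip() for ln in takewhile(cond, file_obj)], [])

sample = io.StringIO("one\ntwo\n\nthree\n")
paragraphs = list(read_paragraphs(sample))

# Caveat: two blank lines in a row produce an empty chunk, which equals
# the sentinel, so iteration ends before the rest of the file is read.
truncated = list(read_paragraphs(io.StringIO("one\n\n\ntwo\n")))
```

The second call shows a limitation of this approach: it assumes paragraphs are separated by exactly one blank line.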

User

Answer 3 · edited 2024-10-01 15:44:32

What you are trying to do is a perfect job for itertools.groupby:

from itertools import groupby

def read_parag(filename):
    with open(filename) as f:
        for k,g in groupby((line.strip() for line in f), bool):
            if k:
                yield list(g)

This yields one list of stripped lines per paragraph.

Or in one line:

[list(g) for k,g in groupby((line.strip() for line in open('myfile.txt')), bool) if k]
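As a quick check of the groupby approach, the following sketch applies it to in-memory text. The name `paragraphs_from` and the sample string are hypothetical; io.StringIO is used in place of a file on disk.

```python
import io
from itertools import groupby

def paragraphs_from(file_obj):
    """Yield each run of non-blank lines as a list of stripped strings."""
    stripped = (line.strip() for line in file_obj)
    # bool() is False for "" and True otherwise, so groupby alternates
    # between runs of text lines and runs of blank lines.
    for has_text, group in groupby(stripped, bool):
        if has_text:
            yield list(group)

# Two consecutive blank lines separate the paragraphs here.
sample = io.StringIO("one\ntwo\n\n\nthree\n")
result = list(paragraphs_from(sample))
```

Because consecutive blank lines collapse into a single group, this version does not stop early when paragraphs are separated by more than one blank line.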
