<p>Your code, slightly modified so that it does not require unittest:</p>
<pre><code>import re

def extract_data(body):
    for i in body:
        # Strip HTML tags, then the unwanted text and stray characters
        a = re.sub('<[^<]+?>', '', str(i))
        b = re.sub('view\xc2\xa0book\xc2\xa0info', '', str(a))
        c = re.sub('key', '', str(b))
        d = re.sub('\xc2', ' ', str(c))
        e = re.sub('\xa0', '', str(d))
        yield e

def test_extract_data():
    sample_input = ['<tr><h1>keyThis</h1><h2>\xc2</h2><h3>\xa0</h3><h4>view\xc2\xa0book\xc2\xa0info</h4><h5>Test Passes</h5></tr>']
    expected_res = 'This Test Passes'
    res = extract_data(sample_input)
    # res is a generator object here, not a str
    return expected_res == res

print(test_extract_data())
</code></pre>
<p>This prints <code>False</code>.</p>
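<p>The problem is that when a function uses <code>return</code>, it returns (in your case) a <code>str</code>. When it uses <code>yield</code>, however, calling it returns an object of type <code>generator</code>, and a generator never compares equal to a string. To compare against the first yielded value, pull it out of the generator first, for example by changing the last line of the test to <code>return expected_res == next(res)</code>. With that one change the test prints <code>True</code>.</p>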
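<p>A minimal sketch of the <code>next()</code> approach, reusing the function from the first listing unchanged. The only difference is in the test, where the built-in <code>next()</code> takes the first value from the generator before the comparison:</p>

```python
import re

def extract_data(body):
    for i in body:
        a = re.sub('<[^<]+?>', '', str(i))
        b = re.sub('view\xc2\xa0book\xc2\xa0info', '', str(a))
        c = re.sub('key', '', str(b))
        d = re.sub('\xc2', ' ', str(c))
        e = re.sub('\xa0', '', str(d))
        yield e

def test_extract_data():
    sample_input = ['<tr><h1>keyThis</h1><h2>\xc2</h2><h3>\xa0</h3><h4>view\xc2\xa0book\xc2\xa0info</h4><h5>Test Passes</h5></tr>']
    expected_res = 'This Test Passes'
    res = extract_data(sample_input)
    # next() advances the generator once and returns the yielded str
    return expected_res == next(res)

print(test_extract_data())
```

<p>Note that this only checks the first item the generator produces; any further items are left unexamined.</p>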
<p>To illustrate, at the <a href="https://docs.python.org/2/using/cmdline.html" rel="nofollow">Python command prompt</a>:</p>
<pre><code>>>> type("hello")
<class 'str'>
>>> def gen():
...     yield "hello"
...
>>> type(gen())
<class 'generator'>
</code></pre>
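<p>Your other option (possibly the better one, depending on your use case) is to check that all of the results from the <code>generator</code> are correct: convert the output of the <code>generator</code> object into a <code>list</code> (or another concrete sequence) and then compare for equality:</p>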
<pre><code>import re

def extract_data(body):
    for i in body:
        # Strip HTML tags, then the unwanted text and stray characters
        a = re.sub('<[^<]+?>', '', str(i))
        b = re.sub('view\xc2\xa0book\xc2\xa0info', '', str(a))
        c = re.sub('key', '', str(b))
        d = re.sub('\xc2', ' ', str(c))
        e = re.sub('\xa0', '', str(d))
        yield e

def test_extract_data():
    sample_input = ['<tr><h1>keyThis</h1><h2>\xc2</h2><h3>\xa0</h3><h4>view\xc2\xa0book\xc2\xa0info</h4><h5>Test Passes</h5></tr>', '<tr><h1>keyThis</h1><h2>\xc2</h2><h3>\xa0</h3><h4>view\xc2\xa0book\xc2\xa0info</h4><h5>Test Passes Too!</h5></tr>']
    expected_res = ['This Test Passes', 'This Test Passes Too!']
    res = extract_data(sample_input)
    # list() consumes the generator, so every yielded item is compared
    return expected_res == list(res)

print(test_extract_data())
</code></pre>
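<p>If you later want to move this back into unittest, the same list comparison could be sketched roughly as follows. The class name <code>ExtractDataTest</code> and the method name are hypothetical, chosen only for illustration; <code>extract_data</code> is the function from the listing above.</p>

```python
import re
import unittest

def extract_data(body):
    for i in body:
        a = re.sub('<[^<]+?>', '', str(i))
        b = re.sub('view\xc2\xa0book\xc2\xa0info', '', str(a))
        c = re.sub('key', '', str(b))
        d = re.sub('\xc2', ' ', str(c))
        e = re.sub('\xa0', '', str(d))
        yield e

class ExtractDataTest(unittest.TestCase):  # hypothetical test case name
    def test_all_items(self):
        sample_input = [
            '<tr><h1>keyThis</h1><h2>\xc2</h2><h3>\xa0</h3><h4>view\xc2\xa0book\xc2\xa0info</h4><h5>Test Passes</h5></tr>',
            '<tr><h1>keyThis</h1><h2>\xc2</h2><h3>\xa0</h3><h4>view\xc2\xa0book\xc2\xa0info</h4><h5>Test Passes Too!</h5></tr>',
        ]
        expected_res = ['This Test Passes', 'This Test Passes Too!']
        # Materialize the generator so assertEqual compares real values
        self.assertEqual(list(extract_data(sample_input)), expected_res)

# Run the test case explicitly instead of unittest.main(), which would exit the interpreter
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExtractDataTest)
result = unittest.TextTestRunner().run(suite)
```

<p>Comparing against <code>list(...)</code> inside <code>assertEqual</code> also gives a readable diff of the two lists when the test fails.</p>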