<p><strong>TL;DR</strong>:</p>
<pre><code>>>> import nltk
>>> hypothesis = ['This', 'is', 'cat']
>>> reference = ['This', 'is', 'a', 'cat']
>>> references = [reference] # list of references for 1 sentence.
>>> list_of_references = [references] # list of references for all sentences in corpus.
>>> list_of_hypotheses = [hypothesis] # list of hypotheses that corresponds to list of references.
>>> nltk.translate.bleu_score.corpus_bleu(list_of_references, list_of_hypotheses)
0.6025286104785453
>>> nltk.translate.bleu_score.sentence_bleu(references, hypothesis)
0.6025286104785453
</code></pre>
<p>(Note: you have to pull the latest version of NLTK from the <code>develop</code> branch to get a stable version of the BLEU score implementation.)</p>
<hr/>
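<p>In long:</p>
<p>Actually, if there is only one reference and one hypothesis in your whole corpus, both <code>corpus_bleu()</code> and <code>sentence_bleu()</code> should return the same value, as shown in the example above.</p>
<p>In the code, we see that <a href="https://github.com/nltk/nltk/blob/develop/nltk/translate/bleu_score.py#L26" rel="noreferrer"><code>sentence_bleu</code> is actually a thin wrapper around <code>corpus_bleu</code></a>:</p>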
<pre><code>def sentence_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25),
                  smoothing_function=None):
    return corpus_bleu([references], [hypothesis], weights, smoothing_function)
</code></pre>
<p>And if we look at the parameters for <code>sentence_bleu</code>:</p>
<pre><code>def sentence_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25),
                  smoothing_function=None):
    """
    :param references: reference sentences
    :type references: list(list(str))
    :param hypothesis: a hypothesis sentence
    :type hypothesis: list(str)
    :param weights: weights for unigrams, bigrams, trigrams and so on
    :type weights: list(float)
    :return: The sentence-level BLEU score.
    :rtype: float
    """
</code></pre>
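<p>The input for the <code>references</code> parameter of <code>sentence_bleu</code> is a <code>list(list(str))</code>.</p>
<p>So if you have a sentence string, e.g. <code>"This is a cat"</code>, you have to tokenize it to get a list of strings, <code>["This", "is", "a", "cat"]</code>. Since multiple references are allowed, the references have to be a list of lists of strings. For example, if you have a second reference, "This is a feline", your input to <code>sentence_bleu()</code> would be:</p>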
<pre><code>from nltk.translate.bleu_score import sentence_bleu

references = [ ["This", "is", "a", "cat"], ["This", "is", "a", "feline"] ]
hypothesis = ["This", "is", "cat"]
sentence_bleu(references, hypothesis)
</code></pre>
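<p>As a minimal sketch of getting from raw strings to that shape: the <code>tokenize</code> helper below is hypothetical and simply splits on whitespace, which is an assumption made for illustration. Real data will usually need a proper tokenizer.</p>

```python
# Hypothetical helper, not part of NLTK: whitespace splitting is an
# assumption for illustration; use whatever tokenizer suits your data.
def tokenize(sentence):
    return sentence.split()

reference_strings = ["This is a cat", "This is a feline"]
hypothesis_string = "This is cat"

# references must be list(list(str)): one token list per reference sentence.
references = [tokenize(s) for s in reference_strings]
# hypothesis must be list(str): a single token list.
hypothesis = tokenize(hypothesis_string)

print(references)
print(hypothesis)
```

<p>The resulting <code>references</code> and <code>hypothesis</code> have the types that the <code>sentence_bleu()</code> docstring asks for.</p>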
<p>When it comes to the <code>list_of_references</code> parameter of <code>corpus_bleu()</code>, it is basically <a href="https://github.com/nltk/nltk/blob/develop/nltk/translate/bleu_score.py#L82" rel="noreferrer">a list of whatever <code>sentence_bleu()</code> takes as references</a>:</p>
<pre><code>def corpus_bleu(list_of_references, hypotheses, weights=(0.25, 0.25, 0.25, 0.25),
                smoothing_function=None):
    """
    :param references: a corpus of lists of reference sentences, w.r.t. hypotheses
    :type references: list(list(list(str)))
    :param hypotheses: a list of hypothesis sentences
    :type hypotheses: list(list(str))
    :param weights: weights for unigrams, bigrams, trigrams and so on
    :type weights: list(float)
    :return: The corpus-level BLEU score.
    :rtype: float
    """
</code></pre>
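<p>To make the extra level of nesting concrete, here is a rough sketch that assembles corpus-level inputs from per-sentence pairs. The <code>build_corpus_inputs</code> helper is hypothetical (it is not part of NLTK), and the second sentence pair is made-up example data.</p>

```python
# Hypothetical helper, not part of NLTK: collects already-tokenized
# (hypothesis, references) pairs into the two parallel lists that
# corpus_bleu() expects.
def build_corpus_inputs(pairs):
    list_of_references = []
    hypotheses = []
    for hypothesis, references in pairs:
        # Each entry is exactly what sentence_bleu() takes as `references`.
        list_of_references.append(references)
        hypotheses.append(hypothesis)
    return list_of_references, hypotheses

# Made-up example corpus: sentence 1 has two references, sentence 2 has one.
pairs = [
    (["This", "is", "cat"],
     [["This", "is", "a", "cat"], ["This", "is", "a", "feline"]]),
    (["He", "reads", "book"],
     [["He", "reads", "a", "book"]]),
]

list_of_references, hypotheses = build_corpus_inputs(pairs)

# The two lists must line up: one list of references per hypothesis.
assert len(list_of_references) == len(hypotheses)
print(list_of_references)
print(hypotheses)
```

<p>Note that <code>list_of_references</code> comes first in the <code>corpus_bleu()</code> signature, and that the i-th entry of each list must refer to the same sentence.</p>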
<p>Other than looking at the doctest within <a href="https://github.com/nltk/nltk/blob/develop/nltk/translate/bleu_score.py" rel="noreferrer"><code>nltk/translate/bleu_score.py</code></a>, you can also take a look at the unittest at <a href="https://github.com/nltk/nltk/blob/develop/nltk/test/unit/translate/test_bleu.py" rel="noreferrer"><code>nltk/test/unit/translate/test_bleu.py</code></a> to see how to use each of the components within <code>bleu_score.py</code>.</p>
<p>By the way, since <code>sentence_bleu</code> is imported as <code>bleu</code> in <a href="https://github.com/nltk/nltk/blob/develop/nltk/translate/__init__.py#L21" rel="noreferrer"><code>nltk/translate/__init__.py</code></a>, using</p>
<pre><code>from nltk.translate import bleu
</code></pre>
<p>would be the same as:</p>
<pre><code>from nltk.translate.bleu_score import sentence_bleu
</code></pre>
<p>and in code:</p>
<pre><code>>>> from nltk.translate import bleu
>>> from nltk.translate.bleu_score import sentence_bleu
>>> from nltk.translate.bleu_score import corpus_bleu
>>> bleu == sentence_bleu
True
>>> bleu == corpus_bleu
False
</code></pre>