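<p>Looking more carefully at the C code, I found that this apparent contradiction arises because <code>ratio</code> treats the "replace" edit operation differently from the other operations (it gives it a cost of 2), whereas <code>distance</code> treats all operations the same, with a cost of 1.</p>
<p>This can be seen in the call to the internal <code>levenshtein_common</code> function made inside <code>ratio_py</code>:</p>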
<hr/>
<p><a href="https://github.com/miohtama/python-Levenshtein/blob/master/Levenshtein.c#L727">https://github.com/miohtama/python-Levenshtein/blob/master/Levenshtein.c#L727</a></p>
<pre><code>static PyObject*
ratio_py(PyObject *self, PyObject *args)
{
  size_t lensum;
  long int ldist;

  if ((ldist = levenshtein_common(args, "ratio", 1, &lensum)) < 0) // Call
    return NULL;

  if (lensum == 0)
    return PyFloat_FromDouble(1.0);

  return PyFloat_FromDouble((double)(lensum - ldist)/(lensum));
}
</code></pre>
<hr/>
<p>Compare this with the <code>distance_py</code> function:</p>
<p><a href="https://github.com/miohtama/python-Levenshtein/blob/master/Levenshtein.c#L715">https://github.com/miohtama/python-Levenshtein/blob/master/Levenshtein.c#L715</a></p>
<pre><code>static PyObject*
distance_py(PyObject *self, PyObject *args)
{
  size_t lensum;
  long int ldist;

  if ((ldist = levenshtein_common(args, "distance", 0, &lensum)) < 0)
    return NULL;

  return PyInt_FromLong((long)ldist);
}
</code></pre>
<hr/>
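<p>As a result, the two functions end up passing different cost arguments to another internal function, <code>lev_edit_distance</code>, whose documentation contains the following snippet:</p>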
<pre><code>@xcost: If nonzero, the replace operation has weight 2, otherwise all
edit operations have equal weights of 1.
</code></pre>
<p>The beginning of <code>lev_edit_distance()</code>:</p>
<pre><code>/**
 * lev_edit_distance:
 * @len1: The length of @string1.
 * @string1: A sequence of bytes of length @len1, may contain NUL characters.
 * @len2: The length of @string2.
 * @string2: A sequence of bytes of length @len2, may contain NUL characters.
 * @xcost: If nonzero, the replace operation has weight 2, otherwise all
 *         edit operations have equal weights of 1.
 *
 * Computes Levenshtein edit distance of two strings.
 *
 * Returns: The edit distance.
 **/
_LEV_STATIC_PY size_t
lev_edit_distance(size_t len1, const lev_byte *string1,
                  size_t len2, const lev_byte *string2,
                  int xcost)
{
  size_t i;
</code></pre>
<hr/>
<p><strong>[Answer]</strong></p>
<p>So in my example:</p>
<p><code>ratio('ab', 'ac')</code> involves a single replace operation (cost 2) measured against the total length of the two strings (4), so the result is <code>(4 - 2)/4 = 0.5</code>.</p>
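<p>To make the arithmetic concrete, here is a minimal pure-Python sketch of the same idea. It is not the library's implementation: <code>edit_distance_sketch</code> and <code>ratio_sketch</code> are hypothetical names, and the code only assumes the behavior described above (a replace costs 2 when <code>xcost</code> is nonzero, and the ratio is <code>(lensum - ldist) / lensum</code>).</p>

```python
def edit_distance_sketch(s1, s2, xcost=False):
    """Illustrative Levenshtein distance (hypothetical helper, not the C code).

    A replace costs 2 when xcost is true, otherwise 1; insert and delete
    always cost 1.
    """
    sub_cost = 2 if xcost else 1
    # prev[j] holds the distance between the processed prefix of s1 and s2[:j].
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        curr = [i]  # distance from s1[:i] to the empty string
        for j, c2 in enumerate(s2, start=1):
            curr.append(min(
                prev[j] + 1,                                  # delete c1
                curr[j - 1] + 1,                              # insert c2
                prev[j - 1] + (0 if c1 == c2 else sub_cost),  # keep or replace
            ))
        prev = curr
    return prev[-1]


def ratio_sketch(s1, s2):
    """Mirror the formula in ratio_py: (lensum - ldist) / lensum, with xcost on."""
    lensum = len(s1) + len(s2)
    if lensum == 0:
        return 1.0
    ldist = edit_distance_sketch(s1, s2, xcost=True)
    return (lensum - ldist) / lensum


print(edit_distance_sketch("ab", "ac"))              # equal weights
print(edit_distance_sketch("ab", "ac", xcost=True))  # replace weighted 2
print(ratio_sketch("ab", "ac"))
```

<p>With equal weights the distance between <code>'ab'</code> and <code>'ac'</code> is 1, while with the replace weighted at 2 it is 2, which is what yields the ratio of 0.5.</p>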
<p>That explains the "how". I suppose the only remaining question is the "why", but for now I am satisfied with this understanding.</p>