<p>You can compute sentence similarity with this Python 3 library: <a href="https://github.com/UKPLab/sentence-transformers" rel="nofollow noreferrer">https://github.com/UKPLab/sentence-transformers</a></p>
<p>Code example from <a href="https://www.sbert.net/docs/usage/semantic_textual_similarity.html" rel="nofollow noreferrer">https://www.sbert.net/docs/usage/semantic_textual_similarity.html</a>:</p>
<pre><code>from sentence_transformers import SentenceTransformer, util
model = SentenceTransformer('paraphrase-MiniLM-L12-v2')
# Two lists of sentences
sentences1 = ['The cat sits outside',
              'A man is playing guitar',
              'The new movie is awesome']
sentences2 = ['The dog plays in the garden',
              'A woman watches TV',
              'The new movie is so great']
# Compute embeddings for both lists
embeddings1 = model.encode(sentences1, convert_to_tensor=True)
embeddings2 = model.encode(sentences2, convert_to_tensor=True)
# Compute cosine similarities; cosine_scores[i][j] compares sentences1[i] with sentences2[j]
cosine_scores = util.pytorch_cos_sim(embeddings1, embeddings2)
# Output the pairs with their scores
for i in range(len(sentences1)):
    print("{} \t\t {} \t\t Score: {:.4f}".format(sentences1[i], sentences2[i], cosine_scores[i][i]))
</code></pre>
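<p>To make the score itself concrete, here is a minimal plain-Python sketch of a pairwise cosine-similarity matrix, which is the kind of quantity the example above prints. The helper names <code>cosine_similarity</code> and <code>cosine_matrix</code> and the toy vectors are made up for illustration; they are not part of sentence-transformers, and real embeddings have many more dimensions.</p>

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_matrix(rows_a, rows_b):
    """Entry [i][j] compares rows_a[i] with rows_b[j]."""
    return [[cosine_similarity(u, v) for v in rows_b] for u in rows_a]


# Toy 2-dimensional "embeddings" (hypothetical values, for illustration only)
toy_a = [[1.0, 0.0], [1.0, 1.0]]
toy_b = [[2.0, 0.0], [0.0, 3.0]]

scores = cosine_matrix(toy_a, toy_b)
for row in scores:
    print(["{:.4f}".format(s) for s in row])
```

<p>Vectors pointing in the same direction score 1 regardless of their length, and orthogonal vectors score 0, so a higher score means the two sentences are closer in embedding space.</p>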
<p>The library includes state-of-the-art sentence embedding models.</p>
<p>See <a href="https://stackoverflow.com/a/68728666/395857">https://stackoverflow.com/a/68728666/395857</a> for how to perform sentence clustering.</p>