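<p>Detect the language of a page, unless the html tag already declares the current language</p>
<p>If you are using Selenium from Python, you can use the function below. It requires nltk and its stopwords corpus to be installed (word_tokenize also relies on NLTK's Punkt tokenizer data):</p>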
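<pre><code>from nltk import word_tokenize
from nltk.corpus import stopwords


def detect_lang(text):
    """Return the NLTK stopwords language that shares the most words with text."""
    lang_ratios = {}
    tokens = word_tokenize(text)
    # Lowercase the tokens so they match the lowercase stopword lists.
    words_set = set(word.lower() for word in tokens)
    for language in stopwords.fileids():
        stopwords_set = set(stopwords.words(language))
        # Score each language by how many distinct stopwords occur in the text.
        common_elements = words_set.intersection(stopwords_set)
        lang_ratios[language] = len(common_elements)
    # The language with the highest overlap count wins.
    return max(lang_ratios, key=lang_ratios.get)
</code></pre>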
<p>With this function you can query which language is in use by passing it the text of the page.</p>
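<p>The stopword-overlap idea behind detect_lang can be tried without NLTK. The following is only a minimal sketch under stated assumptions: SAMPLE_STOPWORDS and detect_lang_sketch are hypothetical names, the three tiny word lists are hand-written stand-ins for the much larger NLTK stopwords corpus, and a simple regular expression replaces word_tokenize.</p>

```python
import re

# Hypothetical, deliberately tiny stopword lists (an assumption for this
# sketch); the NLTK stopwords corpus is far larger and covers more languages.
SAMPLE_STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "a", "that"},
    "german": {"der", "die", "das", "und", "ist", "nicht", "ein", "zu"},
    "french": {"le", "la", "les", "et", "est", "un", "une", "que"},
}


def detect_lang_sketch(text, stopword_sets=SAMPLE_STOPWORDS):
    """Return the language whose stopword set overlaps most with text."""
    # Split on runs of word characters instead of using word_tokenize.
    words = set(re.findall(r"\w+", text.lower()))
    # Count the distinct stopwords of each language that occur in the text.
    scores = {lang: len(words & sw) for lang, sw in stopword_sets.items()}
    # Pick the language with the highest overlap count.
    return max(scores, key=scores.get)


print(detect_lang_sketch("The cat is in the garden and that is a fact"))
print(detect_lang_sketch("Der Hund ist nicht in dem Haus und das ist gut"))
```

<p>Because only overlap counts are compared, very short texts or texts without any stopwords give unreliable results: in this sketch a tie, including an all-zero score, falls to whichever language comes first in the dictionary.</p>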