NLTK can't find the java file! Stanford POS Tagger

Published 2024-10-01 15:37:44

I have been trying to get the Stanford POS tagger to work for a while. From an old SO post I found the following (slightly modified) code:

stanford_dir = 'C:/Users/.../stanford-postagger-2017-06-09/'

from nltk.tag import StanfordPOSTagger
#from nltk.tag.stanford import StanfordPOSTagger # I tried it both ways
from nltk import word_tokenize

# Add the jar and model via their path (instead of setting environment variables):
jar = stanford_dir + 'stanford-postagger.jar'
model = stanford_dir + 'models/english-left3words-distsim.tagger'

pos_tagger = StanfordPOSTagger(model, jar, encoding='utf8')

text = pos_tagger.tag(word_tokenize("What's the airspeed of an unladen swallow ?"))
print(text)

However, I get the following error:

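[error output missing from the source; per the title, NLTK reports that it was unable to find the java file]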

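I don't know which java file it is talking about. I'm sure it is finding the right jar and model files, because if I change the path to something incorrect, I get a different error: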

LookupError: Could not find stanford-postagger.jar jar file at C:/Users/.../stanford-postagger-2017-06-09/sstanford-postagger.jar

Which java file is missing? How can I get the Stanford POS tagger to work?
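One possible reading of the error is that it refers to the `java` executable itself rather than to any file in the Stanford download. The sketch below, which uses only the standard library, checks whether a Java binary is reachable. The helper name `locate_java` is hypothetical, and the assumption that the `JAVAHOME` / `JAVA_HOME` variables and `PATH` are the places worth checking is mine, not something the error message confirms.

```python
import os
import shutil


def locate_java(env=None):
    """Return the path to a java executable, or None if none is found.

    Hypothetical helper: looks under JAVAHOME / JAVA_HOME first, then on PATH.
    """
    env = os.environ if env is None else env
    exe = "java.exe" if os.name == "nt" else "java"

    # Check the home-style variables, accepting either <home>/bin/java or <home>/java.
    for var in ("JAVAHOME", "JAVA_HOME"):
        home = env.get(var)
        if not home:
            continue
        for candidate in (os.path.join(home, "bin", exe), os.path.join(home, exe)):
            if os.path.isfile(candidate):
                return candidate

    # Fall back to a normal PATH search.
    return shutil.which("java", path=env.get("PATH", ""))


if __name__ == "__main__":
    found = locate_java()
    print(found if found else "No java executable found; is a JRE/JDK installed?")
```

If this prints a path, one option to try is passing that path to `nltk.internals.config_java()` before constructing the tagger, or setting `JAVAHOME` so that it points at the Java installation.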

Edit:

I went to this link for Stanford NLP on Windows and tried:

(Second edit: added the installation steps):

import urllib.request
import zipfile
urllib.request.urlretrieve(r'http://nlp.stanford.edu/software/stanford-postagger-full-2015-04-20.zip', r'C:/Users/HMISYS/Downloads/stanford-postagger-full-2015-04-20.zip')
zfile = zipfile.ZipFile(r'C:/Users/HMISYS/Downloads/stanford-postagger-full-2015-04-20.zip')
zfile.extractall(r'C:/Users/HMISYS/Downloads/')
# End second edit

import nltk  # needed for nltk.word_tokenize below
from nltk.tag.stanford import StanfordPOSTagger
# Trying on an older version
_model_filename = r'C:/Users/HMISYS/Downloads/stanford-postagger-full-2015-04-20/models/english-bidirectional-distsim.tagger'
_path_to_jar = r'C:/Users/HMISYS/Downloads/stanford-postagger-full-2015-04-20/stanford-postagger.jar'
st = StanfordPOSTagger(model_filename=_model_filename, path_to_jar=_path_to_jar)
text = st.tag(nltk.word_tokenize("What's the airspeed of an unladen swallow ?"))
print(text)

But I got the same error. Based on this post, I set the path variables using the following:

set STANFORDTOOLSDIR=$HOME
set CLASSPATH=$STANFORDTOOLSDIR/stanford-postagger-full-2015-04-20/stanford-postagger.jar
set export STANFORD_MODELS=$STANFORDTOOLSDIR/stanford-postagger-full-2015-04-20/models

But I get an error:

NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH environment variable.
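The `set` lines above mix Windows `set` with Unix-style `$HOME` and `export`, and the Windows command prompt does not expand `$HOME`, so the variables may not hold the intended paths. A minimal sketch of setting the same two variables from inside Python instead is shown below. The helper name `stanford_env` is hypothetical, and it assumes the archive was extracted directly under the directory passed in.

```python
import os


def stanford_env(tools_dir, release="stanford-postagger-full-2015-04-20"):
    """Build the CLASSPATH / STANFORD_MODELS values for one extracted release.

    Hypothetical helper; assumes <tools_dir>/<release>/ contains the jar and models/.
    """
    base = os.path.join(tools_dir, release)
    return {
        "CLASSPATH": os.path.join(base, "stanford-postagger.jar"),
        "STANFORD_MODELS": os.path.join(base, "models"),
    }


if __name__ == "__main__":
    # Assumption: the zip was extracted into the user's home directory.
    os.environ.update(stanford_env(os.path.expanduser("~")))
    print(os.environ["CLASSPATH"])
    print(os.environ["STANFORD_MODELS"])
```

Setting the variables this way only affects the current Python process, so it has to run before the tagger is constructed.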
