Java: Stanford NLP text annotation is very slow


import java.util.Properties;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref, regexner");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
Annotation document = new Annotation("Text to be annotated. This text is very long!");
pipeline.annotate(document); // this line takes a long time

Annotating the text takes quite a long time. For about 60 words, this line takes roughly 16 seconds, which is far too long.

Is there any way to speed this up, or am I missing something? Please let me know what I can do. Thanks in advance :-)
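One way to narrow this down is to time pipeline construction and the annotate call separately, since both can be expensive. The following is only a minimal sketch using the JDK: the `TimingSketch` class, the `timeMillis` helper, and the sleep-based workloads are hypothetical stand-ins, not CoreNLP calls.

```java
import java.util.concurrent.TimeUnit;

final class TimingSketch {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Stand-in workload: sleeps for the given time, restoring the interrupt flag if interrupted. */
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Placeholder for: pipeline = new StanfordCoreNLP(props);
        long buildMs = timeMillis(() -> sleepQuietly(20));
        // Placeholder for: pipeline.annotate(document);
        long annotateMs = timeMillis(() -> sleepQuietly(5));
        System.out.println("construction: " + buildMs + " ms, annotation: " + annotateMs + " ms");
    }
}
```

In real code the two placeholders would wrap the `new StanfordCoreNLP(props)` and `pipeline.annotate(document)` calls from the snippet above, which shows whether the time goes into loading models or into the annotation itself.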



    public TextReader() {
        props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, regexner");
        pipeline = new StanfordCoreNLP(props);
        extractor = CoreMapExpressionExtractor.createExtractorFromFiles(
                TokenSequencePattern.getNewEnv(),
                "Stanford NLP\\stanford-corenlp-full-2015-01-29\\stanford-corenlp-full-2015-01-30\\tokensregex\\color.rules.txt");
        text = "Barak Obama was born on August 4, 1961,at Kapiolani Maternity & Gynecological Hospital "
                + " in Honolulu, Hawaii, and would become the first President to have been born in Hawaii. His mother, Stanley Ann Dunham,"
                + " was born in Wichita, Kansas, and was of mostly English ancestry. His father, Barack Obama, Sr., was a Luo from Nyang’oma"
                + " Kogelo, Kenya. He studied at the University of Westminster. His favourite colour is red.";
        Logger.getLogger(TextReader.class.getName()).log(Level.INFO, "Annotator starting...", text); // LOG 1
        Annotation document = new Annotation(text);
        // Run the annotators; without this call SentencesAnnotation is never set on the document.
        pipeline.annotate(document);
        Logger.getLogger(TextReader.class.getName()).log(Level.INFO, "Annotator finished...", props); // LOG 2
        sentences = document.get(SentencesAnnotation.class);
        for (CoreMap sentence : sentences) {
            // the tokens of the sentence are taken and iterated over
            // the NER, POS and lemma of the tokens are stored iteratively
        }
    }



1 answer

  1. # Answer 1

