<p>Just in case this helps someone:</p>
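<p>I was struggling with the same problem (same code, from <a href="https://github.com/tylin/coco-caption" rel="nofollow noreferrer">https://github.com/tylin/coco-caption</a>). It may be related to the fact that I run the code with Python 3.7 through <code>qsub</code> on CentOS. So I changed</p>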
<pre><code>cmd = ['java', '-cp', 'stanford-corenlp-3.4.1.jar', 'edu.stanford.nlp.process.PTBTokenizer', '-preserveLines', '-lowerCase', 'tmpWS5p0Z']
</code></pre>
<p>to</p>
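<pre><code>cmd = ['/abs/path/to/java', '-cp', '/abs/path/to/stanford-corenlp-3.4.1.jar', 'edu.stanford.nlp.process.PTBTokenizer', '-preserveLines', '-lowerCase', '/abs/path/to/temporary_file']
</code></pre>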
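<p>Using absolute paths fixed the <code>OSError: [Errno 2] No such file or directory</code>. Note that I show <code>'/abs/path/to/temporary_file'</code> as part of the <code>cmd</code> list here even though it is actually appended later. After that, however, something went wrong in the tokenizer Java subprocess. I don't know why or how, I only observed it, because with:</p>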
<pre><code>p_tokenizer = subprocess.Popen(cmd, cwd=path_to_jar_dirname, stdout=subprocess.PIPE, shell=True)
token_lines = p_tokenizer.communicate(input=sentences.rstrip())[0]
</code></pre>
<p><code>token_lines</code> comes back empty (which is not the desired behavior). Running this also produced the following error:</p>
<pre><code>Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: java.io.IOException: Input/output error
at edu.stanford.nlp.process.PTBTokenizer.getNext(PTBTokenizer.java:278)
at edu.stanford.nlp.process.PTBTokenizer.getNext(PTBTokenizer.java:163)
at edu.stanford.nlp.process.AbstractTokenizer.hasNext(AbstractTokenizer.java:55)
at edu.stanford.nlp.process.PTBTokenizer.tokReader(PTBTokenizer.java:444)
at edu.stanford.nlp.process.PTBTokenizer.tok(PTBTokenizer.java:416)
at edu.stanford.nlp.process.PTBTokenizer.main(PTBTokenizer.java:760)
Caused by: java.io.IOException: Input/output error
at java.base/java.io.FileInputStream.readBytes(Native Method)
at java.base/java.io.FileInputStream.read(FileInputStream.java:279)
at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:290)
at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:351)
at java.base/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.base/java.io.InputStreamReader.read(InputStreamReader.java:185)
at java.base/java.io.BufferedReader.read1(BufferedReader.java:210)
at java.base/java.io.BufferedReader.read(BufferedReader.java:287)
at edu.stanford.nlp.process.PTBLexer.zzRefill(PTBLexer.java:24511)
at edu.stanford.nlp.process.PTBLexer.next(PTBLexer.java:24718)
at edu.stanford.nlp.process.PTBTokenizer.getNext(PTBTokenizer.java:276)
... 5 more
</code></pre>
<p>Again, I don't know the why or the what, but I just want to share that doing this fixed it:</p>
<pre><code>cmd = ['/abs/path/to/java -cp /abs/path/to/stanford-corenlp-3.4.1.jar edu.stanford.nlp.process.PTBTokenizer -preserveLines -lowerCase /abs/path/to/temporary_file']
</code></pre>
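<p>and changing <code>cmd.append(os.path.join(path_to_jar_dirname, os.path.basename(tmp_file.name)))</code> so that the temporary file path ends up inside that single command string instead of being appended as a separate list element.</p>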
<p>So, put <code>cmd</code> into a list with only one element that contains the entire command, with absolute paths. Thanks for your help!</p>
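<p>A possible explanation for why the single-string form helps, offered as a sketch and not as a confirmed diagnosis of the Java error above: on POSIX, when <code>subprocess.Popen</code> gets a list together with <code>shell=True</code>, only the first item is used as the command string and the remaining items are passed as arguments to the shell itself. The example below uses <code>echo</code> as a stand-in for the Java command, and <code>run_with_shell</code> is a hypothetical helper name, not part of coco-caption.</p>

```python
import subprocess


def run_with_shell(cmd):
    """Run cmd with shell=True and return its stdout decoded as text."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
    out, _ = proc.communicate()
    return out.decode()


# List form: on POSIX the shell runs only 'echo'; 'hello' is handed to the
# shell as a positional argument, so echo never sees it.
list_output = run_with_shell(['echo', 'hello'])

# Single-string form: the whole command line reaches the shell intact.
string_output = run_with_shell(['echo hello'])

print(repr(list_output))
print(repr(string_output))
```

<p>If that is what happened here, an alternative would be to keep the multi-element list and drop <code>shell=True</code>, but I have not tested that in the <code>qsub</code> setup described above.</p>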