<p>You can't pass a <code>regex</code> (or a <code>glob</code> pattern) to <a href="https://docs.python.org/2/library/codecs.html#codecs.open" rel="nofollow">codecs.open</a>. It expects a single filename, which is why you get an error.</p>
<p>So you can't do this:</p>
<pre><code>txt_files = [(codecs.open('/the/path/ofthedirectory/*.txt','r','utf8')).readlines()]
</code></pre>
<p>Instead, use something like <a href="https://docs.python.org/2/library/os.html#os.listdir" rel="nofollow">os.listdir</a>, <a href="https://docs.python.org/2/library/os.html#os.walk" rel="nofollow">os.walk</a>, or <a href="https://docs.python.org/2/library/glob.html#glob.iglob" rel="nofollow">glob.iglob</a> (the iterator variant of <a href="https://docs.python.org/2/library/glob.html#glob.glob" rel="nofollow">glob.glob</a>) to list the files, filter the results, and then open each file.</p>
<p>So you'd end up with something like this:</p>
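<pre><code>import codecs, os, re
# keep only the .txt files
txt_files = [p for p in os.listdir('/path/to/dir') if p.endswith('.txt')]
# read each file and apply the regex to its text, not to the list of names
important_stuff = []
for name in txt_files:
    with codecs.open(os.path.join('/path/to/dir', name), 'r', 'utf8') as f:
        important_stuff += re.findall(r"(\S+)\s+(NC\S+).*\n.*\s(\S+)\s+(AQ\S+)", f.read())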
</code></pre>
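<p>If you would rather keep the wildcard, the same idea can be sketched with <code>glob.iglob</code>, which expands the pattern into real paths before anything is opened. This is only a minimal sketch: the function name <code>find_in_txt_files</code> and the module-level <code>PATTERN</code> are hypothetical names introduced for illustration, and it assumes every file is UTF-8 and small enough to read into memory.</p>

```python
import codecs
import glob
import os
import re

# Regex copied from the question; it is applied to each file's text.
PATTERN = re.compile(r"(\S+)\s+(NC\S+).*\n.*\s(\S+)\s+(AQ\S+)")


def find_in_txt_files(directory, pattern=PATTERN):
    """Return every match of `pattern` found in the .txt files of `directory`.

    Hypothetical helper, not part of any library.
    """
    matches = []
    # glob.iglob expands the wildcard into paths; sorted() only makes the
    # processing order deterministic.
    for path in sorted(glob.iglob(os.path.join(directory, '*.txt'))):
        # codecs.open gets one concrete filename per call.
        with codecs.open(path, 'r', 'utf8') as f:
            matches.extend(pattern.findall(f.read()))
    return matches
```

<p>Calling <code>find_in_txt_files('/path/to/dir')</code> returns a flat list of tuples, one per match, with one element per capture group in the regex.</p>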