<p>If you have to keep using the CSV (tab-separated) file, try reading it in chunks and filtering each chunk:</p>
<pre><code>import pandas as pd

fn = '/Volumes/big-flash-drive/asdf/FaceImageCroppedWithAlignment.tsv'
cols = ["Freebase MID", "EntityNameString", "ImageURL", "FaceID",
        "FaceRectangle_Base64Encoded", "FaceData_Base64Encoded"]
# Read 100,000 rows at a time so the whole file never has to fit in memory
chunks = pd.read_csv(fn, sep='\t', chunksize=10**5, names=cols)
# No index_col is set, so the MID is an ordinary column; filter on it directly
df = pd.concat([x[x["Freebase MID"] == 'm.0107_f'] for x in chunks], ignore_index=True)
</code></pre>
<p>If you can store the data in a different format, I would strongly recommend HDF5, or loading the data into an RDBMS.</p>
<p>Demo (HDF5):</p>
<pre><code>df = pd.read_hdf('/path/to/file.h5', 'hdf_key', where="index == 'm.0107_f'")
</code></pre>
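<p>This reads only the rows that satisfy the <code>where</code> clause. Note that <code>where</code> filtering works only when the file was written in the PyTables <code>table</code> format (for example with <code>format='table'</code> in <code>to_hdf</code>), not in the default <code>fixed</code> format.</p>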
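<p>For the RDBMS route, here is a rough sketch using SQLite from the Python standard library. The table name <code>faces</code>, the shortened column names, and the helpers <code>load_tsv</code> and <code>rows_for_mid</code> are all made up for illustration, and it assumes a headerless TSV with exactly six fields per line:</p>

```python
import sqlite3

# Hypothetical, shortened column names for the six TSV fields (illustration only).
COLS = ["mid", "name", "image_url", "face_id", "face_rect_b64", "face_data_b64"]


def load_tsv(conn, tsv_path, batch_size=100_000):
    """Stream a headerless TSV into a SQLite table in batches, then index the key."""
    col_defs = ", ".join(c + " TEXT" for c in COLS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS faces ({col_defs})")
    placeholders = ", ".join("?" * len(COLS))
    insert = f"INSERT INTO faces VALUES ({placeholders})"

    batch = []
    with open(tsv_path, encoding="utf-8") as fh:
        for line in fh:
            # Assumes every line has exactly len(COLS) tab-separated fields.
            batch.append(line.rstrip("\n").split("\t"))
            if len(batch) >= batch_size:
                conn.executemany(insert, batch)
                batch.clear()
    if batch:
        conn.executemany(insert, batch)

    # The index is what makes later lookups by MID fast.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_faces_mid ON faces (mid)")
    conn.commit()


def rows_for_mid(conn, mid):
    """Return all rows for one Freebase MID, using the index on `mid`."""
    return conn.execute("SELECT * FROM faces WHERE mid = ?", (mid,)).fetchall()
```

<p>The import is a one-time cost. After that, <code>rows_for_mid(conn, 'm.0107_f')</code> is an indexed lookup instead of a full scan of the file, and if you want a DataFrame you can run the same query through <code>pd.read_sql_query</code>.</p>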