<p><code>df</code> is not a <code>DataFrame</code> but a <code>TextFileReader</code>. I think you need to concatenate all the chunks into a single DataFrame with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html" rel="nofollow"><code>concat</code></a> and then apply the function:</p>
<pre><code>import pandas as pd

# With chunksize set, read_csv returns a TextFileReader (an iterator of DataFrames)
df = pd.read_csv('verylargefile.csv', chunksize=10000)
# Concatenate all chunks into one DataFrame
df_chunk = pd.concat(df, ignore_index=True)
df_chunk['new_column'] = df_chunk['old_column'].apply(my_func)
# do other operations and filters...
df_chunk.to_csv('processed.csv', mode='a')
</code></pre>
<p><a href="http://pandas.pydata.org/pandas-docs/stable/io.html#io-chunking" rel="nofollow">More info</a>.</p>
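<p>Note that concatenating every chunk loads the whole file into memory. If that is a problem, a possible alternative is to transform each chunk and append it to the output as you go. The following is only a minimal sketch: <code>process_in_chunks</code> and <code>add_new_column</code> are hypothetical helper names, the doubling function is a stand-in for your own <code>my_func</code>, and it assumes the transformation works row by row and does not need to see the whole file at once.</p>

```python
import pandas as pd


def my_func(value):
    # Hypothetical stand-in for the asker's my_func: doubles the value.
    return value * 2


def add_new_column(chunk):
    # Applies my_func to one chunk (a regular DataFrame).
    chunk["new_column"] = chunk["old_column"].apply(my_func)
    return chunk


def process_in_chunks(src_path, dst_path, transform, chunksize=10000):
    """Read src_path chunk by chunk, transform each chunk, append it to dst_path.

    Returns the total number of rows written.
    """
    rows_written = 0
    # With chunksize set, read_csv returns an iterable TextFileReader.
    reader = pd.read_csv(src_path, chunksize=chunksize)
    for i, chunk in enumerate(reader):
        chunk = transform(chunk)
        chunk.to_csv(
            dst_path,
            mode="w" if i == 0 else "a",  # overwrite on first chunk, then append
            header=(i == 0),              # write the header row only once
            index=False,
        )
        rows_written += len(chunk)
    return rows_written
```

<p>You would call it as <code>process_in_chunks('verylargefile.csv', 'processed.csv', add_new_column)</code>. Writing the header only for the first chunk keeps it from being repeated in the appended output.</p>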
<p>EDIT:</p>
<p>This approach may also help: processing a large DataFrame by groups.</p>
<p>Example:</p>
^{pr2}$