<p>Are you trying to do something like this?</p>
<pre><code>import pandas as pd

# modified the data to make it read_clipboard friendly:
# copy the block between the triple quotes, then run the code
'''
name IS_030_EBITDA IS_09_PostTaxResult
0 EISMA_MEDIA_GROEP_B.V. NaN 1292.0
1 EISMA_MEDIA_GROEP_B.V. 2280.0 1324.0
2 DUNLOP_B.V. 43433.0 1243392.0
3 DUNLOP_B.V. 2243480.0 1324.0
'''
df = pd.read_clipboard()
print(df)
df_sample = df.sample(2)  # refer to the 'Note' section below
# the string 'NaN' is used on purpose: df.update() skips real NaN values
df_sample[['IS_09_PostTaxResult', 'IS_030_EBITDA']] = 'NaN'
df.update(df_sample)  # write the sampled rows back into df by index
print(df)
</code></pre>
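<p>One caveat: the snippet above writes the string 'NaN' rather than a real missing value, so the two numeric columns end up with object dtype. If you need real missing values, a minimal sketch (assuming label-based assignment with <code>.loc</code> fits your case) could look like the following. The data is parsed from a string instead of the clipboard, and <code>random_state=0</code> is only an assumption to make the run repeatable.</p>

```python
import io

import numpy as np
import pandas as pd

# Same sample data as above, parsed from a string instead of the clipboard.
raw = """name IS_030_EBITDA IS_09_PostTaxResult
EISMA_MEDIA_GROEP_B.V. NaN 1292.0
EISMA_MEDIA_GROEP_B.V. 2280.0 1324.0
DUNLOP_B.V. 43433.0 1243392.0
DUNLOP_B.V. 2243480.0 1324.0
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

cols = ["IS_030_EBITDA", "IS_09_PostTaxResult"]

# Pick 2 random rows; random_state is fixed only to make the run repeatable.
sampled_index = df.sample(2, random_state=0).index

# Assign real missing values in place; both columns stay float64.
df.loc[sampled_index, cols] = np.nan

print(df)
print(df.dtypes)
```

<p>Because the assignment goes straight into df, there is no need for df.update() here, and functions such as isna() or fillna() treat the blanked cells as missing.</p>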
<p>Original df:</p>
<pre><code>                     name  IS_030_EBITDA  IS_09_PostTaxResult
0  EISMA_MEDIA_GROEP_B.V.            NaN               1292.0
1  EISMA_MEDIA_GROEP_B.V.         2280.0               1324.0
2             DUNLOP_B.V.        43433.0            1243392.0
3             DUNLOP_B.V.      2243480.0               1324.0
</code></pre>
<p>Modified df:</p>
<pre><code>                     name IS_030_EBITDA IS_09_PostTaxResult
0  EISMA_MEDIA_GROEP_B.V.           NaN                 NaN
1  EISMA_MEDIA_GROEP_B.V.          2280                1324
2             DUNLOP_B.V.         43433         1.24339e+06
3             DUNLOP_B.V.           NaN                 NaN
</code></pre>
<p>Note:</p>
<p>"df_sample=df.sample(2)": instead of the hard-coded value 2, you can add logic that selects 25% of the total records. For example:</p>
<pre><code># sample 25% of the rows instead of a fixed count
x = 25.0
factor = int((len(df) * x) / 100)  # factor=1 in the example above (4 rows)
df_sample = df.sample(factor)
</code></pre>
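<p>As an alternative to computing the row count by hand, <code>DataFrame.sample</code> also accepts a <code>frac</code> argument. The sketch below uses a hypothetical 4-row stand-in frame (the column names are made up for illustration) to show the idea.</p>

```python
import pandas as pd

# Hypothetical stand-in frame with 4 rows, the same size as the example above.
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "value": [1.0, 2.0, 3.0, 4.0]})

# frac=0.25 asks pandas for 25% of the rows.
df_sample = df.sample(frac=0.25)

print(len(df_sample))
```

<p>For 4 rows this selects a single row, the same as the int(...) computation above. For other lengths the two approaches may differ by one row, since pandas applies its own rounding to the row count while int() truncates.</p>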