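<p>I have two identical DataFrames, <code>new</code> and <code>old</code>. The <code>new</code> DataFrame is updated at random times throughout the day. The code below checks whether anything has changed:</p>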
<pre><code>import pandas as pd
import numpy as np
new = {'name': ['Sheldon', 'Penny', 'Amy', 'Bernadette', 'Raj', 'Howard'],
       'episodes': [42, 24, 31, 29, 37, 40],
       'gender': ['male', 'female', 'female', 'female', 'male', 'male']}
old = {'name': ['Sheldon', 'Penny', 'Amy', 'Bernadette', 'Raj', 'Howard'],
       'episodes': [12, 32, 31, 32, 37, 40],
       'gender': ['male', 'female', 'female', 'female', 'male', 'male']}
df1 = pd.DataFrame(new, columns=['name', 'episodes', 'gender'])
df = pd.DataFrame(old, columns=['name', 'episodes', 'gender'])
while True:
    df1 = pd.DataFrame(new, columns=['name', 'episodes', 'gender'])
    print(df[~df.episodes.eq(df1.episodes)])
    df1 = df
</code></pre>
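<p>I need a condition inside the <code>while</code> loop so that <code>df[~df.episodes.eq(df1.episodes)]</code> is printed only when a change is detected. Once the new data has been printed, both DataFrames should be set to the same values (the old data is no longer needed) and the check for changes should start again. The code above prints:</p>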
<pre><code>Columns: [name, episodes, gender]
Index: []
Empty DataFrame
Columns: [name, episodes, gender]
Index: []
Empty DataFrame
Columns: [name, episodes, gender]
Index: []
Empty DataFrame
</code></pre>
<p>As a result, when a change actually is printed, it is easy to miss. Can you suggest a more efficient way to do this?</p>
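<p>To make the requirement concrete, here is a minimal sketch of the behaviour described above: print only when the comparison is non-empty, then keep a copy of the new data as the baseline. The helper <code>changed_rows</code> and the <code>snapshots</code> list are hypothetical; the list stands in for the real data source and replaces the infinite <code>while</code> loop so the example terminates. It assumes both frames share the same index and row order.</p>

```python
import pandas as pd


def changed_rows(previous: pd.DataFrame, current: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `current` whose `episodes` value differs from `previous`.

    Assumes both frames have the same index and row order.
    """
    mask = current["episodes"].ne(previous["episodes"])
    return current[mask]


old = pd.DataFrame(
    {
        "name": ["Sheldon", "Penny", "Amy", "Bernadette", "Raj", "Howard"],
        "episodes": [12, 32, 31, 32, 37, 40],
        "gender": ["male", "female", "female", "female", "male", "male"],
    }
)

# Hypothetical stand-in for the real data source: three successive polls.
snapshots = [
    old.assign(episodes=[12, 32, 31, 32, 37, 40]),  # nothing changed
    old.assign(episodes=[42, 24, 31, 29, 37, 40]),  # three rows changed
    old.assign(episodes=[42, 24, 31, 29, 37, 40]),  # nothing changed since last poll
]

previous = old.copy()
reports = []  # collected only so the behaviour can be inspected afterwards
for current in snapshots:
    diff = changed_rows(previous, current)
    if not diff.empty:
        print(diff)                 # printed only when something changed
        reports.append(diff)
        previous = current.copy()   # new data becomes the baseline
```

<p>Using <code>.copy()</code> rather than plain assignment keeps the baseline independent of the frame that is updated later.</p>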
<p>==Edit==</p>
<p>Based on @BENY's answer, if I do this:</p>
<pre><code>import pandas as pd
import numpy as np
new = {'name': ['Sheldon', 'Penny', 'Amy', 'Bernadette', 'Raj', 'Sheldon'],
       'episodes': [42, 24, 31, 29, 37, 40],
       'gender': ['male', 'female', 'female', 'female', 'male', 'male']}
old = {'name': ['Sheldon', 'Penny', 'Amy', 'Bernadette', 'Raj', 'Sheldon'],
       'episodes': [12, 32, 31, 32, 37, 40],
       'gender': ['male', 'female', 'female', 'female', 'male', 'male']}
df1 = pd.DataFrame(new, columns=['name', 'episodes', 'gender'])
df = pd.DataFrame(old, columns=['name', 'episodes', 'gender'])
while True:
    df1 = pd.DataFrame(new, columns=['name', 'episodes', 'gender'])
    out = (
        df.merge(df1[['name', 'episodes']], on=['name', 'episodes'],
                 how='left', indicator=True)
          .loc[lambda x: x['_merge'] == 'left_only']
    )
    print(out)
    df = df1
</code></pre>
<p>it prints the following on every pass through the <code>while</code> loop:</p>
<pre><code> name episodes gender _merge
0 Sheldon 12 male left_only
1 Penny 32 female left_only
3 Bernadette 32 female left_only
name episodes gender _merge
0 Sheldon 12 male left_only
1 Penny 32 female left_only
3 Bernadette 32 female left_only
name episodes gender _merge
0 Sheldon 12 male left_only
1 Penny 32 female left_only
3 Bernadette 32 female left_only
</code></pre>
<p>Is there a way to print this only once, until another change occurs? If I set <code>df = df1</code>, it prints the following instead and I miss the change:</p>
<pre><code>Columns: [name, episodes, gender, _merge]
Index: []
Empty DataFrame
Columns: [name, episodes, gender, _merge]
</code></pre>
<p>I need a clean way to get this data at the point where a change is detected.</p>
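<p>For reference, this is roughly the kind of output I am after, sketched with <code>DataFrame.equals</code> as a gate and <code>DataFrame.compare</code> (available in pandas 1.1 and later) to list the changed cells. The helper name <code>report_changes</code> is hypothetical, and the sketch assumes both frames have identical index and column labels. Because it compares rows by position instead of merging on values, the duplicate name <code>Sheldon</code> should not cause spurious matches.</p>

```python
import pandas as pd


def report_changes(previous: pd.DataFrame, current: pd.DataFrame):
    """Return None when nothing changed, else the changed cells side by side.

    Assumes both frames have identical index and column labels, which
    DataFrame.compare requires.
    """
    if previous.equals(current):
        return None
    # 'self' holds the values from `previous`, 'other' the values from `current`.
    return previous.compare(current)


old = pd.DataFrame(
    {
        "name": ["Sheldon", "Penny", "Amy", "Bernadette", "Raj", "Sheldon"],
        "episodes": [12, 32, 31, 32, 37, 40],
        "gender": ["male", "female", "female", "female", "male", "male"],
    }
)
new = old.assign(episodes=[42, 24, 31, 29, 37, 40])

first = report_changes(old, new)          # a change happened: returns a frame
second = report_changes(new, new.copy())  # nothing changed since: returns None

if first is not None:
    print(first)
```

<p>Inside the polling loop, the baseline would be replaced with <code>current.copy()</code> only after a non-<code>None</code> result has been printed, so each change is reported once.</p>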