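I have two different dataframes, A and B. Their event columns hold similar data, which I use to compare the two dataframes. I want to give dataframe A a new column, dfA.newContext#.
To do this I need to use the event column. I want to iterate over dataframe A to find matching events and assign dfB.context# to dfA.newContext#.
I think a loop is the best approach, because I have a few conditions to check.
This may be asking a lot, but I'm really stuck. I'd like to do something like this: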
offset = 0
Iterate through dfA:
    extract event
    extract context#
    Iterate through dfB:
        if dfB.event == dfA.event:
            dfA.newContext# = dfB.context#
            offset = dfA.newContext# - dfA.context#
        if dfB.event == "Special":
            dfA.newContext# = dfA.context# - offset
DataFrame A
+-------------+---------+------+
|dfA.context# |dfA.event| Name |
+-------------+---------+------+
| 0 | Special | Bob |
| 2 | Special | Joan |
| 4 | Bird | Susie|
| 5 | Special | Alice|
| 6 | Special | Tom |
| 7 | Special | Luis |
| 8 | Parrot | Jill |
| 9 | Special | Reed |
| 10 | Special | Lucas|
| 11 | Snake | Kat |
| 12 | Special | Bill |
| 13 | Special | Leo |
| 14 | Special | Peter|
| 15 | Special | Mark |
| 16 | Special | Joe |
| 17 | Special | Lora |
| 18 | Special | Care |
| 19 |Elephant | David|
| 20 | Special | Ann |
| 21 | Special | Larry|
| 22 | Skunk | Tony |
+-------------+---------+------+
DataFrame B
+-------------+---------+
|dfB.context# |dfB.event|
+-------------+---------+
| 0 | Special |
| 0 | Special |
| 0 | Special |
| 1 | Special |
| 1 | Special |
| 1 | Special |
| 1 | Special |
| 2 | Bird |
| 2 | Bird |
| 3 | Special |
| 6 | Parrot |
| 6 | Parrot |
| 6 | Parrot |
| 6 | Parrot |
| 7 | Special |
| 7 | Special |
| 9 | Snake |
| 9 | Snake |
| 9 | Snake |
| 10 | Special |
| 17 |Elephant |
| 17 |Elephant |
| 17 |Elephant |
| 18 | Special |
| 18 | Special |
| 20 | Skunk |
| 20 | Skunk |
| 21 | Special |
| 26 | Antelope|
+-------------+---------+
Desired df
+-------------+---------+------+-------------+
|dfA.context# |dfA.event| Name |dfA.newContext#|
+-------------+---------+------+-------------+
| 0 | Special | Bob | 0 |
| 2 | Special | Joan | 1 |
| 4 | Bird | Susie| 2 |
| 5 | Special | Alice| 3 |
| 6 | Special | Tom | |
| 7 | Special | Luis | |
| 8 | Parrot | Jill | 6 |
| 9 | Special | Reed | 7 |
| 10 | Special | Lucas| |
| 11 | Snake | Kat | 9 |
| 12 | Special | Bill | 10 |
| 13 | Special | Leo | |
| 14 | Special | Peter| |
| 15 | Special | Mark | |
| 16 | Special | Joe | |
| 17 | Special | Lora | |
| 18 | Special | Care | |
| 19 |Elephant | David| 17 |
| 20 | Special | Ann | 18 |
| 21 | Special | Larry| |
| 22 | Skunk | Tony | 20 |
+-------------+---------+------+-------------+
How to iterate over two dataframes at once and access their information
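95% of the time you can use vectorized methods, which removes the need for a loop. In this case, you can use pd.merge as a simple, clean, and efficient alternative to a long loop.
Edit (Answer #1): In fact, you can do a more advanced merge with left_on=dfA.index, right_on='context' and do it in a single line together with the other cleanup operations after the merge, but see the more complete answer below, which takes a similar approach.
Answer #2: After manipulating the two dataframes to prepare them for the merge, you can merge them together: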
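dfA - set the context column in dfA equal to the index, but before changing it, save it as a series s for later use.
dfB - in preparation for the merge, drop duplicates, reset the index, and rename the index to newContext.
Merge on event and context, and replace newContext values with context values where they are null.
df['context'] = s changes context back to its original data.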
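The merge-based idea is described only briefly, so here is a minimal sketch of what a loop-free version could look like on the data from the question. It is not necessarily the answerer's exact procedure. The helper columns seg and pos and the function add_keys are hypothetical names introduced for this illustration. The sketch assumes that the non-"Special" events appear in the same order in both dataframes, and that "Special" rows are paired in order within each stretch between two non-"Special" events.

```python
import pandas as pd

# Data copied from the tables in the question (column names simplified).
dfA = pd.DataFrame({
    "context": [0, 2] + list(range(4, 23)),
    "event": ["Special", "Special", "Bird", "Special", "Special", "Special",
              "Parrot", "Special", "Special", "Snake"] + ["Special"] * 7
             + ["Elephant", "Special", "Special", "Skunk"],
    "Name": ["Bob", "Joan", "Susie", "Alice", "Tom", "Luis", "Jill", "Reed",
             "Lucas", "Kat", "Bill", "Leo", "Peter", "Mark", "Joe", "Lora",
             "Care", "David", "Ann", "Larry", "Tony"],
})

# (context, event, number of repeated rows) for dataframe B.
b_rows = [(0, "Special", 3), (1, "Special", 4), (2, "Bird", 2),
          (3, "Special", 1), (6, "Parrot", 4), (7, "Special", 2),
          (9, "Snake", 3), (10, "Special", 1), (17, "Elephant", 3),
          (18, "Special", 2), (20, "Skunk", 2), (21, "Special", 1),
          (26, "Antelope", 1)]
dfB = pd.DataFrame([(c, e) for c, e, n in b_rows for _ in range(n)],
                   columns=["context", "event"])


def add_keys(df):
    """Add hypothetical merge keys: seg and pos (assumption, not from the answer)."""
    out = df.copy()
    # seg increases by one at every non-"Special" event.
    out["seg"] = (out["event"] != "Special").cumsum()
    # pos numbers the rows that share the same segment and event.
    out["pos"] = out.groupby(["seg", "event"]).cumcount()
    return out


# Drop duplicate rows in dfB, then rename its context so it becomes the new column.
b = add_keys(dfB.drop_duplicates().reset_index(drop=True))
b = b.rename(columns={"context": "newContext"})
a = add_keys(dfA)

# A left merge keeps every row of dfA; unmatched rows get a missing newContext.
result = a.merge(b, on=["event", "seg", "pos"], how="left")
result = result.drop(columns=["seg", "pos"])
# Use the nullable integer dtype so matched values stay integers next to <NA>.
result["newContext"] = result["newContext"].astype("Int64")

print(result)
```

Rows without a match are left as <NA>, which corresponds to the blank cells in the desired table. If you want to fill them from context instead, as the last steps of the answer suggest, result["newContext"].fillna(result["context"]) is one way to do it.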