Joining two DataFrames on a timestamp

Published 2024-06-14 08:22:14


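I have two DataFrames with identical columns. Each has timestamp columns. One DataFrame holds text data from user A, the other holds text data from user B. When user A is speaking, user B is not, so the data never overlaps. I want to merge them into a single DataFrame ordered by timestamp.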

df_a
start  stop  words
0      2.1   i know honey but what happened we got a job
3.7    6.4   no know but thats a different kind of help but
8.2    11.5  because people that are supposed to be
12.9   15.4  yeah but where else can you go to get one

df_b
start  stop  words
2.2    3.6   but he never said
6.5    8.2   but what?
11.6   12.8  i dont think thats true
15.5   19.2  anywhere i dont know

desired_output
start  stop  words
0      2.1   i know honey but what happened we got a job
2.2    3.6   but he never said
3.7    6.4   no know but thats a different kind of help but
6.5    8.2   but what?
8.2    11.5  because people that are supposed to be
11.6   12.8  i dont think thats true
12.9   15.4  yeah but where else can you go to get one
15.5   19.2  anywhere i dont know

2 answers

This should do it:

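# Note: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0;
# on current versions, use pd.concat([df_a, df_b]) instead.
df = df_a.append(df_b).sort_values(by=['start'])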

I would use pd.concat, since the operation feels more like a concatenation than a join:

output = pd.concat([df_a,df_b]).sort_values(['start'])
print(output)
   start  stop                                           words
0    0.0   2.1     i know honey but what happened we got a job
0    2.2   3.6                               but he never said
1    3.7   6.4  no know but thats a different kind of help but
1    6.5   8.2                                       but what?
2    8.2  11.5          because people that are supposed to be
2   11.6  12.8                         i dont think thats true
3   12.9  15.4       yeah but where else can you go to get one
3   15.5  19.2                            anywhere i dont know
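
Note that the index in the output above still carries each frame's original labels (0, 0, 1, 1, ...). A minimal, self-contained sketch of the same concat-and-sort approach, rebuilt from the sample data in the question, is shown below. The `speaker` column and the `reset_index` call are additions for illustration and were not part of the original answers.

```python
import pandas as pd

# Sample data reconstructed from the question.
df_a = pd.DataFrame({
    "start": [0.0, 3.7, 8.2, 12.9],
    "stop": [2.1, 6.4, 11.5, 15.4],
    "words": [
        "i know honey but what happened we got a job",
        "no know but thats a different kind of help but",
        "because people that are supposed to be",
        "yeah but where else can you go to get one",
    ],
})
df_b = pd.DataFrame({
    "start": [2.2, 6.5, 11.6, 15.5],
    "stop": [3.6, 8.2, 12.8, 19.2],
    "words": [
        "but he never said",
        "but what?",
        "i dont think thats true",
        "anywhere i dont know",
    ],
})

# Assumption: tagging each row with its speaker is useful after merging.
# The "speaker" column is hypothetical and not in the original question.
merged = (
    pd.concat([df_a.assign(speaker="A"), df_b.assign(speaker="B")])
    .sort_values("start")       # order all utterances by start time
    .reset_index(drop=True)     # replace the duplicated labels with 0..n-1
)

print(merged)
```

Dropping the `assign` calls gives the exact three-column layout requested in the question; `reset_index(drop=True)` only changes the row labels, not the order.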
