I have a DataFrame that looks like this:
import pandas as pd

df1 = pd.DataFrame({
"Business_Process_Activity" : ["SendingReportToManager", "SendingReportToManager", "SendingReportToManager", "SendingReportToManager", "SendingReportToManager", "PreparingAndSendingAgenda", "PreparingAndSendingAgenda"],
"Case":[1,1,2,2,2,3,4],
"Application":["MicrosoftWord", "MicrosoftOutlook", "MicrosoftWord", "MicrosoftOutlook", "MicrosoftOutlook", "MicrosoftWord", "MicrosoftWord"],
"Activity_of_the_User":["SavingADocument", "SendingAnEmail", "SavingADocument", "SendingAnEmail", "SendingAnEmail", "SavingADocument", "SavingADocument"],
"Receiver_email_root":["None", "idatta91 adarandall larryjacob", "None", "idatta91 larryjacob"," vanessaHudgens prithakaur", "None", "None"],
"Receiever_email_domains":["None", "gmail yahoo", "None", "gmail", "gmail yahoo", "None", "None"],
"Receiver_email_count_Catg":["None", "Few", "None", "Double", "Double", "None", "None"],
"Subject":["None","Activity Report", "None", "Project Progress Report", "Project Progress Report 2", "None", "None"]
})
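I want to merge the rows of the DataFrame based on the Case column. If two or more rows have the same number in the Case column, the strings in the other columns of those rows should be concatenated into a single row.
Rows with the same Case number also always have the same value in the Business_Process_Activity column. For that column I do not want to concatenate the values; I want to keep just one of them, because the column needs to stay categorical. I would like the final DataFrame to look like this: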
df2 = pd.DataFrame({"Case":[1,2,3,4],
"Business_Process_Activity" : ["SendingReportToManager", "SendingReportToManager", "PreparingAndSendingAgenda", "PreparingAndSendingAgenda"],
"Application":["MicrosoftWord MicrosoftOutlook", "MicrosoftWord MicrosoftOutlook MicrosoftOutlook", "MicrosoftWord", "MicrosoftWord"],
"Activity_of_the_User":["SavingADocument SendingAnEmail","SavingADocument SendingAnEmail SendingAnEmail", "SavingADocument", "SavingADocument"],
"Receiver_email_root":["idatta91 adarandall larryjacob", "idatta91 larryjacob vanessaHudgens prithakaur", "None", "None"],
"Receiever_email_domains":["gmail yahoo","gmail gmail yahoo", "None", "None"],
"Receiver_email_count_Catg":["Few", "Double Double", "None", "None"],
"Subject":["Activity Report", "Project Progress Report Project Progress Report 2", "None", "None"]
})
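If a string is merged with a "None" value, the "None" string should be dropped, because the merged value is no longer empty. When rows are merged into one, the duplicate numbers in the Case column should be removed as well.
How can I do this? Thanks in advance.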
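The idea is to drop missing values and "None" strings within each group, join the remaining strings with a space, and finally replace any resulting empty string with "None".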
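The steps above can be sketched as follows. This is a minimal sketch rather than the answer's original code: it uses a reduced copy of df1 (only the Application and Subject text columns, for brevity) and assumes that grouping by both Case and Business_Process_Activity is acceptable, since the question states that the activity is constant within a case.

```python
import pandas as pd

# Reduced sample with the same layout as df1 (fewer columns, for brevity).
df1 = pd.DataFrame({
    "Business_Process_Activity": ["SendingReportToManager"] * 5
                                 + ["PreparingAndSendingAgenda"] * 2,
    "Case": [1, 1, 2, 2, 2, 3, 4],
    "Application": ["MicrosoftWord", "MicrosoftOutlook", "MicrosoftWord",
                    "MicrosoftOutlook", "MicrosoftOutlook",
                    "MicrosoftWord", "MicrosoftWord"],
    "Subject": ["None", "Activity Report", "None", "Project Progress Report",
                "Project Progress Report 2", "None", "None"],
})

df2 = (
    df1.groupby(["Case", "Business_Process_Activity"])
       # Per group and column: drop "None" strings and NaN, join the rest.
       .agg(lambda s: " ".join(s[s != "None"].dropna()))
       # Groups that contained nothing but "None" are now empty strings.
       .replace("", "None")
       .reset_index()
)

print(df2)
```

Because Business_Process_Activity is part of the grouping key, it is never concatenated and each case keeps a single value for it.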
Another solution with a custom function:
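A custom function can return "None" directly, which avoids the separate replace step. The version below is a sketch under assumptions: the name join_or_none is made up for illustration, the sample is a reduced copy of df1, and the function also collapses extra whitespace, because one Receiver_email_root entry in the question starts with a space.

```python
import pandas as pd

# Reduced sample with the same layout as df1 (fewer columns, for brevity).
df1 = pd.DataFrame({
    "Business_Process_Activity": ["SendingReportToManager"] * 5
                                 + ["PreparingAndSendingAgenda"] * 2,
    "Case": [1, 1, 2, 2, 2, 3, 4],
    "Receiver_email_root": ["None", "idatta91 adarandall larryjacob", "None",
                            "idatta91 larryjacob", " vanessaHudgens prithakaur",
                            "None", "None"],
    "Receiver_email_count_Catg": ["None", "Few", "None", "Double", "Double",
                                  "None", "None"],
})


def join_or_none(values):
    """Join the non-empty strings of one group, or return "None"."""
    kept = values[values != "None"].dropna()
    if kept.empty:
        return "None"
    # Split and re-join so stray leading or repeated spaces are collapsed.
    return " ".join(word for value in kept for word in value.split())


df2 = (
    df1.groupby(["Case", "Business_Process_Activity"])
       .agg(join_or_none)
       .reset_index()
)

print(df2)
```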
Use:
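The code that belonged here did not survive on this page, so what follows is only one possible variant, not necessarily what the answer contained. It groups by Case alone and passes a per-column dictionary to agg: "first" keeps a single Business_Process_Activity value, and a hypothetical helper join_non_empty handles the text columns. The sample frame is again a reduced copy of df1.

```python
import pandas as pd

# Reduced sample with the same layout as df1 (fewer columns, for brevity).
df1 = pd.DataFrame({
    "Business_Process_Activity": ["SendingReportToManager"] * 5
                                 + ["PreparingAndSendingAgenda"] * 2,
    "Case": [1, 1, 2, 2, 2, 3, 4],
    "Receiever_email_domains": ["None", "gmail yahoo", "None", "gmail",
                                "gmail yahoo", "None", "None"],
    "Subject": ["None", "Activity Report", "None", "Project Progress Report",
                "Project Progress Report 2", "None", "None"],
})


def join_non_empty(values):
    """Join real strings of one group; fall back to "None" if none remain."""
    kept = [v for v in values if isinstance(v, str) and v != "None"]
    return " ".join(kept) if kept else "None"


# Every column except the key and the categorical one is concatenated.
text_cols = [c for c in df1.columns
             if c not in ("Case", "Business_Process_Activity")]

agg_spec = {"Business_Process_Activity": "first"}
agg_spec.update({col: join_non_empty for col in text_cols})

df2 = df1.groupby("Case", as_index=False).agg(agg_spec)

print(df2)
```

Grouping by Case only makes the "keep one value" rule explicit, at the cost of silently hiding the problem if a case ever carried two different activities.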