I have a DataFrame like this:
Aman Aggarwal       Amar Jannela    Vipin Kumar        Roshan Pati
BlackBuck           DJ CHETAS       WOW Editions       MensXP
Transport/Freight   Musician/Band   Furniture          News/Media Website
Like                Like            Like               Like
NaN                 NaN             NaN                NaN
GiveMeSport         NaN             500 Startups       No Abuse KG
News/Media Website  Celina Jaitly   Internet/Software  Community
Like                Actor/Director  Like               Liked
NaN                 Like            NaN                NaN
NaN                 NaN             Jitendra Kumar     Monogatari Series
Anushka Sharma      Durjoy Datta    Actor/Director     TV Show
Actor/Director      Author          Liked              Like
Like                Like            NaN                NaN
NaN                 NaN             NaN                NaN
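Clearly the original CSV file contained blank rows. I need to extract two DataFrames from it. In the first, the column name is the first element of each row, and the page_name elements (e.g. BlackBuck) follow as the remaining elements of that row. Like this: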
Aman Aggarwal  BlackBuck     GiveMeSport    Anushka Sharma
Amar Jannela   DJ CHETAS     Celina Jaitly  Durjoy Datta
Vipin Kumar    WOW Editions  500 Startups   Jitendra Kumar
Roshan Pati    MensXP        No Abuse KGP   Monogatari Series
The second DataFrame is similar, but holds the categories:
Aman Aggarwal  Transport/Freight   News/Media Website  Actor/Director
Amar Jannela   Musician/Band       Actor/Director      Author
Vipin Kumar    Furniture           Internet/Software   Actor/Director
Roshan Pati    News/Media Website  Community           TV Show
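The real problem is that NaN values appear at arbitrary positions, and some cells may be blank as well. The only thing I can rely on is that a page name (BlackBuck) and its category (Transport/Freight) always appear together. My code cannot tell which value is a page name and which is a category. So I probably have to first drop the NaN values and the Like/Liked entries from each column separately, and then align and transpose accordingly. How can I do this efficiently in Python 2.7?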
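Clearly you have to work column by column, because the names and categories are not aligned across columns. I use apply to work column by column and filter out the null values and any values that appear in a list of strings to avoid. An alternative formulation would also work, but I am less convinced by it.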
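The column-by-column filtering described above can be sketched roughly as follows. This is not the answerer's original code, only a minimal illustration under stated assumptions: the sample frame is retyped by hand from the question, the names `to_avoid`, `keep_values`, `cleaned`, `names` and `categories` are hypothetical, and every column is assumed to contain strictly alternating page name / category pairs once the NaN and Like/Liked entries are dropped. It is written against a current pandas and has not been checked on Python 2.7.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample data retyped from the question; blank CSV rows show up as NaN.
rows = [
    ["BlackBuck", "DJ CHETAS", "WOW Editions", "MensXP"],
    ["Transport/Freight", "Musician/Band", "Furniture", "News/Media Website"],
    ["Like", "Like", "Like", "Like"],
    [nan, nan, nan, nan],
    ["GiveMeSport", nan, "500 Startups", "No Abuse KG"],
    ["News/Media Website", "Celina Jaitly", "Internet/Software", "Community"],
    ["Like", "Actor/Director", "Like", "Liked"],
    [nan, "Like", nan, nan],
    [nan, nan, "Jitendra Kumar", "Monogatari Series"],
    ["Anushka Sharma", "Durjoy Datta", "Actor/Director", "TV Show"],
    ["Actor/Director", "Author", "Liked", "Like"],
    ["Like", "Like", nan, nan],
    [nan, nan, nan, nan],
]
df = pd.DataFrame(
    rows,
    columns=["Aman Aggarwal", "Amar Jannela", "Vipin Kumar", "Roshan Pati"],
)

# Hypothetical list of marker strings that carry no information.
to_avoid = ["Like", "Liked"]


def keep_values(column):
    # Keep entries that are neither null nor in the list to avoid, then
    # renumber from 0 so the surviving values line up across columns.
    mask = column.notnull() & ~column.isin(to_avoid)
    return column[mask].reset_index(drop=True)


cleaned = df.apply(keep_values)

# Assumption: after filtering, even positions are page names and odd
# positions are categories. Transpose so each person becomes one row.
names = cleaned.iloc[0::2].T.reset_index()
categories = cleaned.iloc[1::2].T.reset_index()

print(names)
print(categories)
```

If a column ever held a page name without its category, the alternating-position assumption would break and the two frames would no longer line up, so it may be worth checking that every cleaned column has an even number of entries first.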
Note that it is important to test column.isnull() before column.str.contains('Like'), otherwise the latter will fail on null values.
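To illustrate the point about null values, here is a small sketch on a single made-up column. The variable names are hypothetical, and the two options shown are just possible ways to keep missing entries away from the substring test, not necessarily what the original answer used.

```python
import numpy as np
import pandas as pd

# A single object-dtype column with one missing entry.
column = pd.Series(
    ["BlackBuck", "Transport/Freight", "Like", np.nan, "Liked"],
    dtype=object,
)

# On an object-dtype column, str.contains keeps missing entries as NaN
# instead of False, so the raw result is not a clean boolean mask.
raw = column.str.contains("Like")

# Option 1: remove the nulls first, then match only on real strings.
non_null = column[column.notnull()]
kept_1 = non_null[~non_null.str.contains("Like")]

# Option 2: ask str.contains to treat missing entries as "no match",
# and still exclude the nulls explicitly.
mask = column.notnull() & ~column.str.contains("Like", na=False)
kept_2 = column[mask]

print(kept_1.tolist())
print(kept_2.tolist())
```

One thing to keep in mind with substring matching is that it would also drop any page name that happens to contain "Like", whereas an exact comparison against a list of strings would not.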