Remove duplicates only within groups


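I want to remove duplicates from a DataFrame, but only within specific subsets. Each row with "spec" in column "A" starts a new block, and I want to drop duplicate rows within each block, that is, from one "spec" row up to the next. Duplicates across blocks should be kept: some rows under the first "spec" may be identical to rows under the second "spec", and both copies should stay.

This is the DataFrame df:

<pre><code>A     B       C
spec  first   second
test  text1   text2
act   text12  text13
act   text14  text15
test  text32  text33
act   text34  text35
test  text85  text86
act   text87  text88
test  text1   text2
act   text12  text13
act   text14  text15
test  text85  text86
act   text87  text88
spec  third   fourth
test  text1   text2
act   text12  text13
act   text14  text15
test  text85  text86
act   text87  text88
test  text1   text2
act   text12  text13
act   text14  text15
test  text85  text86
act   text87  text88
</code></pre>

This is what I want, df:

<pre><code>A     B       C
spec  first   second
test  text1   text2
act   text12  text13
act   text14  text15
test  text32  text33
act   text34  text35
test  text85  text86
act   text87  text88
spec  third   fourth
test  text1   text2
act   text12  text13
act   text14  text15
test  text85  text86
act   text87  text88
</code></pre>

I could split the DataFrame into "small" DataFrames, drop the duplicates of each one in a for loop, and concatenate them at the end, but I would like to know whether there is another solution.

I also tried the following, and it worked:

<pre><code>import numpy as np

# Index labels of the rows that start each "spec" block
dfList = np.asarray(df.index[df["A"] == "spec"].tolist())
for dfL in dfList:
    idx = np.where(dfList == dfL)
    if idx[0][0] != (len(dfList) - 1):
        # Block runs up to the row just before the next "spec"
        df.loc[dfList[idx[0][0]]:dfList[idx[0][0]+1]-1] = df.loc[dfList[idx[0][0]]:dfList[idx[0][0]+1]-1].drop_duplicates()
    else:
        # Last block runs to the end of the DataFrame
        df.loc[dfList[idx[0][0]]:] = df.loc[dfList[idx[0][0]]:].drop_duplicates()
</code></pre>

Edit: I had to add this at the end, because the assignment leaves the dropped rows behind as all-NaN rows:

<blockquote> df.dropna(how='all', inplace=True) </blockquote>

But I am still wondering whether there is another way to do this.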
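One possible loop-free alternative, offered as a sketch rather than a definitive answer: since every "spec" row starts a new block, a cumulative sum over `df["A"] == "spec"` can serve as a block id, and `duplicated()` can then be evaluated on the rows together with that id. The helper column name `_block` is made up for illustration, and the sample frame below is rebuilt by hand from the data in the question.

```python
import pandas as pd

# Sample data rebuilt from the question (two "spec" blocks).
first_block = [
    ("spec", "first", "second"),
    ("test", "text1", "text2"), ("act", "text12", "text13"), ("act", "text14", "text15"),
    ("test", "text32", "text33"), ("act", "text34", "text35"),
    ("test", "text85", "text86"), ("act", "text87", "text88"),
    ("test", "text1", "text2"), ("act", "text12", "text13"), ("act", "text14", "text15"),
    ("test", "text85", "text86"), ("act", "text87", "text88"),
]
second_block = [
    ("spec", "third", "fourth"),
    ("test", "text1", "text2"), ("act", "text12", "text13"), ("act", "text14", "text15"),
    ("test", "text85", "text86"), ("act", "text87", "text88"),
    ("test", "text1", "text2"), ("act", "text12", "text13"), ("act", "text14", "text15"),
    ("test", "text85", "text86"), ("act", "text87", "text88"),
]
df = pd.DataFrame(first_block + second_block, columns=["A", "B", "C"])

# Each "spec" row increments the counter, so rows share an id with their block.
block = df["A"].eq("spec").cumsum()

# "_block" is a hypothetical helper column: a row counts as a duplicate only
# if an identical row was already seen inside the same block.
is_dup = df.assign(_block=block).duplicated()

result = df[~is_dup]
print(result)
```

Like the `drop_duplicates()` loop in the question, this compares whole rows, so an "act" row that repeats under a different "test" inside the same block is also dropped. If the original index is not needed, `result.reset_index(drop=True)` renumbers the rows.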
