Dropping duplicates in pandas

Posted 2024-10-01 11:30:35

Male | A programmer who enjoys writing Python code.

I want to remove duplicate entries based on one particular column. The column is labelled "sent_name".

print(new_df)

                                  sent_name  \
0            Abbey Road Station, London, UK   
1            Abbey Wood Station, London, UK   
2                 Acton Station, London, UK   
3         Acton Central Station, London, UK 


                                                Name        Lat       Lng  \
0                            Abbey Road, London E15, UK  51.531930  0.003760   
1                            Abbey Wood, London SE2, UK  51.491060  0.121420   
2     Station Parade, West Acton London Underground ...  51.518055 -0.281053   
3                          Acton Central, London W3, UK  51.508720 -0.262950   

                                                   type  
0     [u'transit_station', u'point_of_interest', u'e...  
1     [u'transit_station', u'point_of_interest', u'e...  
2     [u'train_station', u'transit_station', u'point...  
3     [u'transit_station', u'point_of_interest', u'e...

I have tried

new_df.drop_duplicates(["sent_name"])

and

new_df.drop_duplicates(subset="sent_name")

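On inspection, neither call removes the duplicates: new_df still contains them afterwards.

For example, these rows all share the same sent_name: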

1038           Woodford Station, London, UK   
1040           Woodford Station, London, UK   
1041           Woodford Station, London, UK   
1043           Woodford Station, London, UK   
1044           Woodford Station, London, UK

1038  South Woodford London Underground Station, Geo...  51.591789  0.027315   
1040  Woodford, Woodford, Woodford Green, Greater Lo...  51.606900  0.034000   
1041                     South Woodford, London E18, UK  51.591910  0.027360   
1043            South Woodford (Stop C), London E18, UK  51.591312  0.029013   
1044            South Woodford (Stop D), London E18, UK  51.592010  0.027658  

1038  [u'train_station', u'transit_station', u'point...  
1040  [u'transit_station', u'point_of_interest', u'e...  
1041  [u'transit_station', u'point_of_interest', u'e...  
1043  [u'transit_station', u'point_of_interest', u'e...  
1044  [u'transit_station', u'point_of_interest', u'e...


1 Answer

User

#1 · Posted 2024-10-01 11:30:35

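You need to assign the result of drop_duplicates. By default inplace=False, so the call returns a new DataFrame and leaves new_df untouched. Pretty much every pandas operation returns a copy in the same way.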

So either:

new_df = new_df.drop_duplicates(["sent_name"])

or

new_df.drop_duplicates(["sent_name"], inplace=True)

will work.
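To see the difference concretely, here is a minimal sketch on a tiny stand-in for new_df. The four rows and the reduced column set are assumptions made up for illustration (values abbreviated from the question), not the asker's real data. It calls drop_duplicates once without assigning the result and once with assignment, and compares the row counts.

```python
import pandas as pd

# Hypothetical miniature of new_df: only a few rows and columns,
# with values borrowed from the question for illustration.
new_df = pd.DataFrame({
    "sent_name": [
        "Abbey Road Station, London, UK",
        "Woodford Station, London, UK",
        "Woodford Station, London, UK",
        "Woodford Station, London, UK",
    ],
    "Lat": [51.531930, 51.591789, 51.606900, 51.591910],
    "Lng": [0.003760, 0.027315, 0.034000, 0.027360],
})

# Without assignment the returned copy is discarded; new_df is unchanged.
new_df.drop_duplicates(subset=["sent_name"])
rows_without_assignment = len(new_df)

# With assignment we keep the returned copy, which holds the first
# row for each distinct sent_name (the default keep="first").
deduped = new_df.drop_duplicates(subset=["sent_name"])
rows_after_assignment = len(deduped)

print(rows_without_assignment, rows_after_assignment)  # prints: 4 2
```

The surviving rows keep their original index labels (0 and 1 here). If a fresh 0..n-1 index is wanted, chaining reset_index(drop=True) onto the result is one option.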
