I want to remove duplicate entries by specifying a particular column. The column is labelled "sent_name".
print(new_df)
sent_name \
0 Abbey Road Station, London, UK
1 Abbey Wood Station, London, UK
2 Acton Station, London, UK
3 Acton Central Station, London, UK
Name Lat Lng \
0 Abbey Road, London E15, UK 51.531930 0.003760
1 Abbey Wood, London SE2, UK 51.491060 0.121420
2 Station Parade, West Acton London Underground ... 51.518055 -0.281053
3 Acton Central, London W3, UK 51.508720 -0.262950
type
0 [u'transit_station', u'point_of_interest', u'e...
1 [u'transit_station', u'point_of_interest', u'e...
2 [u'train_station', u'transit_station', u'point...
3 [u'transit_station', u'point_of_interest', u'e...
I have tried
new_df.drop_duplicates(["sent_name"])
and
new_df.drop_duplicates(subset="sent_name")
On inspection, neither of these removes the duplicates from new_df.
For example, these rows are still present:
1038 Woodford Station, London, UK
1040 Woodford Station, London, UK
1041 Woodford Station, London, UK
1043 Woodford Station, London, UK
1044 Woodford Station, London, UK
1038 South Woodford London Underground Station, Geo... 51.591789 0.027315
1040 Woodford, Woodford, Woodford Green, Greater Lo... 51.606900 0.034000
1041 South Woodford, London E18, UK 51.591910 0.027360
1043 South Woodford (Stop C), London E18, UK 51.591312 0.029013
1044 South Woodford (Stop D), London E18, UK 51.592010 0.027658
1038 [u'train_station', u'transit_station', u'point...
1040 [u'transit_station', u'point_of_interest', u'e...
1041 [u'transit_station', u'point_of_interest', u'e...
1043 [u'transit_station', u'point_of_interest', u'e...
1044 [u'transit_station', u'point_of_interest', u'e...
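You need to assign the result of drop_duplicates: the default is inplace=False, and nearly all pandas operations return a copy rather than modifying the original. So either:
new_df = new_df.drop_duplicates(subset="sent_name")
or
new_df.drop_duplicates(subset="sent_name", inplace=True)
will work.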
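To make the difference concrete, here is a minimal sketch on a small made-up frame. The three rows below are a hypothetical stand-in for new_df (only the "sent_name" and "Name" columns, with values borrowed from the question), and the variable names rows_before, deduped and in_place are mine, not from the question.

```python
import pandas as pd

# Hypothetical miniature of new_df; only two columns, illustrative rows.
new_df = pd.DataFrame({
    "sent_name": [
        "Woodford Station, London, UK",
        "Woodford Station, London, UK",
        "Abbey Road Station, London, UK",
    ],
    "Name": [
        "South Woodford, London E18, UK",
        "South Woodford (Stop C), London E18, UK",
        "Abbey Road, London E15, UK",
    ],
})

# Calling drop_duplicates without using the return value changes nothing:
# the de-duplicated copy is created and immediately discarded.
new_df.drop_duplicates(subset="sent_name")
rows_before = len(new_df)

# Option 1: keep the returned copy.
deduped = new_df.drop_duplicates(subset="sent_name")

# Option 2: modify a frame in place (done on a copy here so that
# new_df stays available for comparison).
in_place = new_df.copy()
in_place.drop_duplicates(subset="sent_name", inplace=True)

print(rows_before, len(deduped), len(in_place))
```

With the default keep="first", the first row of each group of equal "sent_name" values is retained, so the original index labels of the surviving rows are preserved. Call reset_index(drop=True) afterwards if you want a fresh 0..n-1 index.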