Pandas 0.22 DataFrame.drop drops too many rows

1 answer

User

#1 · Posted 2024-10-04 05:20:22

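By default, pd.concat does not reset the index. If the same index labels exist in both testdf and datadf, then sampling one of those labels and dropping it removes the matching rows from both frames at once.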

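drop removes every row whose index label matches, including duplicates, so more rows are lost than were sampled: each sampled label takes out both the testdf row and the datadf row that carry it.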
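A minimal sketch of this behaviour, using a small made-up frame (the names `left`, `right`, and `combined` are illustrative and do not come from the question). Dropping a single label removes every row that carries it:

```python
import pandas as pd

left = pd.DataFrame({"a": [1, 2, 3]})      # index 0, 1, 2
right = pd.DataFrame({"a": [10, 20, 30]})  # index 0, 1, 2 again

# concat keeps the original labels, so each label now appears twice
combined = pd.concat([left, right])

# Dropping label 0 removes both rows labelled 0, not just one of them
remaining = combined.drop([0])

print(combined.index.tolist())   # [0, 1, 2, 0, 1, 2]
print(remaining.index.tolist())  # [1, 2, 1, 2]
```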

A possible fix is to change df = pd.concat([testdf,datadf]) to

df = pd.concat([testdf,datadf]).reset_index(drop=True)  # drop=True discards the old index instead of keeping it as a column

or

df = pd.concat([testdf,datadf], ignore_index=True)
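As a rough check of the fix, the sketch below uses made-up stand-ins for `testdf` and `datadf` (their contents are assumptions, and `random_state` is added only for repeatability). With a unique index, drop removes exactly the sampled rows:

```python
import pandas as pd

# Stand-ins for the question's testdf and datadf (contents are made up)
testdf = pd.DataFrame({"a": [1.0, 2.0, 3.0]})
datadf = pd.DataFrame({"a": [4.0, 5.0, 6.0]})

# ignore_index=True relabels the rows 0..5, so every label is unique
df = pd.concat([testdf, datadf], ignore_index=True)

sample = df.sample(frac=0.5, random_state=0)  # 3 of the 6 rows
rest = df.drop(sample.index)

print(df.index.is_unique)      # True
print(len(sample), len(rest))  # 3 3
```

The sampled rows and the remaining rows now add up to the full frame, which is what a train/test style split needs.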

Reproducing the problem:

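import pandas as pd

df = pd.DataFrame({'a': {0: 0.6987303529918656,
  1: -1.4637804486869905,
  2: 0.4512092453413682,
  3: 0.03898323021771516,
  4: -0.143758037238284,
  5: -1.6277278110578157}})

# Without ignore_index, the labels 0..5 each appear twice
df_combined = pd.concat([df, df])
print(df_combined)
print(df_combined.shape)

# Sample half of the 12 rows; the sampled index may repeat labels
sample = df_combined.sample(frac=0.5)
print(sample.shape)

# drop removes both copies of every sampled label
df_combined.drop(sample.index).shape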

          a
0  0.698730
1 -1.463780
2  0.451209
3  0.038983
4 -0.143758
5 -1.627728
0  0.698730
1 -1.463780
2  0.451209
3  0.038983
4 -0.143758
5 -1.627728
(12, 1)  # print(df_combined.shape)
(6, 1)   # print(sample.shape)
(4, 1)   # df_combined.drop(sample.index).shape; 6 rows were expected, and the exact count varies with the random sample
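The final shape follows from the number of distinct labels in the sample, because each distinct label removes both of its rows. In the run above, 4 rows remained, which means the 6 sampled rows covered 4 distinct labels. A sketch to check that relationship, with rounded values and a `random_state` that are assumptions added here for brevity and repeatability:

```python
import pandas as pd

df = pd.DataFrame({"a": [0.70, -1.46, 0.45, 0.04, -0.14, -1.63]})
df_combined = pd.concat([df, df])  # every label 0..5 appears twice

sample = df_combined.sample(frac=0.5, random_state=1)
n_labels = sample.index.nunique()  # distinct labels hit by the sample

remaining = df_combined.drop(sample.index)

# Each distinct label removes its two rows: 12 - 2 * n_labels rows remain
print(n_labels, len(remaining))
```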
