np.where seems to break if a Pandas DataFrame has two columns with the same name... Is this expected behavior?

Posted 2024-10-01 19:20:24


Here is an example that works:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'])
df['D'] = np.nan

for i in range(df.shape[1]-1):
    df['D'] = np.where(df.iloc[:,i] == df.min(axis=1),
                       df.iloc[:,i].shift(-1),
                       df['D'])
df

Here is the example that breaks (column B renamed to A):

df = pd.DataFrame(np.random.randn(10,3),columns=['A','A','C'])
df['D'] = np.nan

for i in range(df.shape[1]-1):
    df['D'] = np.where(df.iloc[:,i] == df.min(axis=1),
                       df.iloc[:,i].shift(-1),
                       df['D'])
df

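My question: is this the expected behavior of np.where, even though neither column A nor column B is ever referenced by its label? Or am I making a mistake somewhere?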

Edit: here is the (long) error message. I am using Jupyter.

ValueError                                Traceback (most recent call last)
<ipython-input-184-2ac8fa635230> in <module>
      5     df['D'] = np.where(df.iloc[:,i] == df.min(axis=1),
      6                        df.iloc[:,i].shift(-1),
----> 7                        df['D'])
      8 df

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
   2936         else:
   2937             # set column
-> 2938             self._set_item(key, value)
   2939 
   2940     def _setitem_slice(self, key, value):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in _set_item(self, key, value)
   2999         self._ensure_valid_index(value)
   3000         value = self._sanitize_column(key, value)
-> 3001         NDFrame._set_item(self, key, value)
   3002 
   3003         # check if we are modifying a copy

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in _set_item(self, key, value)
   3622 
   3623     def _set_item(self, key, value) -> None:
-> 3624         self._data.set(key, value)
   3625         self._clear_item_cache()
   3626 

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals\managers.py in set(self, item, value)
   1084         unfit_val_locs = []
   1085         removed_blknos = []
-> 1086         for blkno, val_locs in libinternals.get_blkno_placements(blknos, group=True):
   1087             blk = self.blocks[blkno]
   1088             blk_locs = blklocs[val_locs.indexer]

pandas\_libs\internals.pyx in get_blkno_placements()

pandas\_libs\internals.pyx in pandas._libs.internals.get_blkno_indexers()

ValueError: Buffer has wrong number of dimensions (expected 1, got 0)

2 Answers

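You should not have two columns with the same name. pandas lets you create them, but in a scenario such as merging columns A and C, it is ambiguous which column A is meant, and that can give an error.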
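
One way to follow this advice is to make the labels unique before doing any further work. The helper below is a minimal sketch, not a pandas API: `dedupe_labels` is a hypothetical name, and the `.1`, `.2` suffix scheme is an assumption chosen for illustration. It does not guard against a suffixed name colliding with a label that already exists.

```python
from collections import Counter


def dedupe_labels(labels):
    """Return labels with repeats suffixed, e.g. ['A', 'A', 'C'] -> ['A', 'A.1', 'C'].

    Hypothetical helper for illustration; the suffix scheme is an assumption.
    """
    seen = Counter()
    result = []
    for label in labels:
        count = seen[label]
        # The first occurrence keeps its name; later ones get ".1", ".2", ...
        result.append(label if count == 0 else f"{label}.{count}")
        seen[label] += 1
    return result


print(dedupe_labels(["A", "A", "C"]))  # ['A', 'A.1', 'C']
```

With a DataFrame this would be applied as `df.columns = dedupe_labels(df.columns)` before the loop, so that every later assignment targets an unambiguous label.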

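This is strange and I don't understand it, but I noticed that np.where has nothing to do with the error. The code below raises the same error. Does it have to do with assigning the column inside the for loop? Outside the loop there is no problem.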

df = pd.DataFrame(np.random.randn(10,3),columns=['A','A','C'])

df['D'] = np.random.randn(10,1)

for i in range(df.shape[1]-1):
    df['D'] = np.random.randn(10,1) #Breaks
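
If the duplicate labels have to stay, one possible workaround is to avoid overwriting `df['D']` repeatedly: do the loop on a plain NumPy array and assign the column once at the end. This is a sketch under assumptions, not a confirmed fix for the error above. It also simplifies the question's logic in one respect: the row minimum is taken over the three data columns only, whereas the original loop also includes `D` in `df.min(axis=1)`. The names `values`, `row_min`, and `d` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 3), columns=["A", "A", "C"])

values = df.to_numpy()          # shape (10, 3); column labels no longer matter
row_min = values.min(axis=1)    # per-row minimum over the data columns only (assumption)
d = np.full(len(df), np.nan)    # accumulator that replaces the repeated df['D'] writes

for i in range(values.shape[1]):
    col = values[:, i]
    # Positional equivalent of Series.shift(-1): next row's value, NaN at the end
    shifted = np.append(col[1:], np.nan)
    d = np.where(col == row_min, shifted, d)

df["D"] = d   # a single assignment that creates the column
```

Because the column is created once rather than overwritten on every iteration, the only write to the DataFrame is the same kind of assignment that already works outside the loop.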
    
