Problem creating column names for a DataFrame in Python 3

1 answer

User

Answer #1 · posted 2024-09-28 20:51:04

Alternative construction calls

You can use the underlying NumPy array:

u = pd.DataFrame(z.values, columns=['hello','ajsajs'])

  hello ajsajs
0  abcb  asasa
1  sdsd   aeio

Alternatively, you can use:

u = z.rename(columns={0: 'hello',1: 'ajsajs'})

Finally, as @Dark suggested:

u = z.set_axis(['hello','ajsajs'], axis=1, inplace=False)

A small note on inplace in set_axis:

WARNING: inplace=None currently falls back to True, but in a future version, will default to False. Use inplace=True explicitly rather than relying on the default.

In pandas 0.20.3, the syntax is just:

u = z.set_axis(axis=1, labels=['hello','ajsajs'])

@Dark's solution appears to be the fastest here.
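The three alternatives can be compared side by side in a minimal sketch. The input frame z is an assumption, reconstructed from the output shown earlier (two string columns labelled 0 and 1). The sketch also assumes pandas 1.0 or later, where set_axis returns a new object by default, so no inplace argument is passed.

```python
import pandas as pd

# Assumed input: default integer column labels 0 and 1,
# reconstructed from the output shown in the answer.
z = pd.DataFrame([["abcb", "asasa"], ["sdsd", "aeio"]])
new_cols = ["hello", "ajsajs"]

# 1. Rebuild from the raw values, so no label alignment takes place.
u_values = pd.DataFrame(z.values, columns=new_cols)

# 2. Map each old label to a new one.
u_rename = z.rename(columns={0: "hello", 1: "ajsajs"})

# 3. Replace the whole column axis (assumes pandas >= 1.0,
#    where set_axis returns a new frame by default).
u_set_axis = z.set_axis(new_cols, axis=1)

for name, frame in [("values", u_values),
                    ("rename", u_rename),
                    ("set_axis", u_set_axis)]:
    print(name, list(frame.columns), frame.values.tolist())
```

All three leave z itself untouched and keep the original cell values; only the column labels change.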

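Why the current approach doesn't work

I believe the problem is that .reindex is called when a DataFrame is constructed this way. Here is some of the source code, with ellipses marking the irrelevant parts I've left out: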

from pandas.core.internals import BlockManager

# pandas.core.frame.DataFrame
class DataFrame(NDFrame):
    def __init__(self, data=None, index=None, columns=None, dtype=None,
                 copy=False):
        # ...
        if isinstance(data, DataFrame):
            data = data._data
        if isinstance(data, BlockManager):
            mgr = self._init_mgr(data, axes=dict(index=index, columns=columns),
                                 dtype=dtype, copy=copy)
        # ... a bunch of other if statements irrelevant to your case
        NDFrame.__init__(self, mgr, fastpath=True)
        # ...

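What happens here:

DataFrame inherits from a more general base class, which in turn has multiple inheritance. (Pandas is great, but tracing its source can feel like backtracking through a spider's web.)
In u = pd.DataFrame(z, columns=['hello','ajsajs']), z is a DataFrame. The first if statement in the snippet above is therefore True, and data = data._data. What is _data? It is a BlockManager.* (Continued below...)
Because what you passed in has just been converted to its BlockManager, the next if statement also evaluates to True. mgr is then assigned the result of the _init_mgr method, and the parent class's __init__ is called with mgr.

*Confirm with isinstance(z._data, BlockManager).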
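As a hedged aside for newer pandas versions: _data is a private attribute and, as far as I know, later releases expose the same manager under the name _mgr. The sketch below tries _mgr first and falls back to _data. Both names are internal details that may change, and z is an assumed stand-in for the asker's frame.

```python
import pandas as pd

# Assumed stand-in for the frame in the question.
z = pd.DataFrame([["abcb", "asasa"], ["sdsd", "aeio"]])

# Private API: newer pandas uses _mgr, older versions use _data.
mgr = getattr(z, "_mgr", None)
if mgr is None:
    mgr = z._data

# Print the class name instead of importing BlockManager,
# whose import path is itself an internal detail.
print(type(mgr).__name__)
```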

Now for the second part...

# pandas.core.generic.NDFrame
class NDFrame(PandasObject, SelectionMixin):
    def __init__(self, data, axes=None, copy=False, dtype=None,
             fastpath=False):
    # ...

    def _init_mgr(self, mgr, axes=None, dtype=None, copy=False):
        """ passed a manager and a axes dict """
        for a, axe in axes.items():
            if axe is not None:
                mgr = mgr.reindex_axis(axe,
                                       axis=self._get_block_manager_axis(a),
                                       copy=False)
    # ...
        return mgr

This is where _init_mgr, which is called above, is defined. Essentially, in your case, you have:

columns=['hello','ajsajs']
axes=dict(index=None, columns=columns)
# ...

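When you reindex an axis and specify a new axis in which none of the new labels exist in the old object, you get all NaNs. This appears to be a deliberate design decision. Consider this related example, where one of the new columns exists and the other does not: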

pd.DataFrame(z, columns=[0, 'ajsajs'])

      0  ajsajs
0  abcb     NaN
1  sdsd     NaN
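The same behaviour can be reproduced through the public reindex method, which is a reasonable stand-in for what the constructor does internally. This is a minimal sketch; z is again an assumed reconstruction of the asker's frame.

```python
import pandas as pd

# Assumed stand-in for the frame in the question (column labels 0 and 1).
z = pd.DataFrame([["abcb", "asasa"], ["sdsd", "aeio"]])

# Passing a DataFrame plus columns= selects columns by label,
# so labels that do not exist yet produce empty (NaN) columns.
via_constructor = pd.DataFrame(z, columns=["hello", "ajsajs"])

# Explicit reindexing on the column axis gives the same result.
via_reindex = z.reindex(columns=["hello", "ajsajs"])

print(via_constructor)
print(via_reindex)

# Mixed case: label 0 exists and keeps its data, 'ajsajs' is new.
partial = z.reindex(columns=[0, "ajsajs"])
print(partial)
```

This is why renaming has to go through rename, set_axis, or the raw values rather than the columns argument of the constructor.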
