Python exits silently when super() is not called first while subclassing pandas.DataFrame

Posted 2024-10-03 19:32:31


I want to extend the pandas DataFrame with additional attributes, so I am writing a class (simplified):

Using Python 3.8

import pandas as pd

class ExtendedDF(pd.DataFrame):
    def __init__(self, df: pd.DataFrame):
        super().__init__(df)
        self.title = 'some dataframe'

This works.

The problem appears when I don't want the super() call to come first, as in the following:

class ExtendedDF(pd.DataFrame):
    def __init__(self, df: pd.DataFrame):
        self.title = 'some dataframe'
        super().__init__(df)
        

Then, in the IPython REPL:

(.venv) C:\project\ ipython -i script.py

[1]: somedf = pd.DataFrame(data)

[2]: extdf = ExtendedDF(df=somedf)

(.venv) C:\project\

Nothing happens. IPython simply exits without reporting any error.

I have tried logging it as follows:

import logging
logging.basicConfig(filename='D:/MyLog.log',level=logging.DEBUG)

class ExtendedDF(pd.DataFrame):
    def __init__(self, df: pd.DataFrame):
        try:
            self.title = 'some dataframe'
            super().__init__(df)
        except Exception as e:
            logging.info(e)

But the log is empty.

Edit:

The regular Python REPL gives the following stack trace:

Traceback (most recent call last):
  File "m.py", line 30, in <module>
    extdf = ExtendedDF(df=somedf)
  File "m.py", line 26, in __init__
    self.title = 'some dataframe'
  File "C:\project\.venvw\lib\site-packages\pandas\core\generic.py", line 5166, in __setattr__
    existing = getattr(self, name)
  File "C:\project\.venvw\lib\site-packages\pandas\core\generic.py", line 5139, in __getattr__
    if self._info_axis._can_hold_identifiers_and_holds_name(name):
  File "C:\project\.venvw\lib\site-packages\pandas\core\generic.py", line 5139, in __getattr__
    if self._info_axis._can_hold_identifiers_and_holds_name(name):
  File "C:\project\.venvw\lib\site-packages\pandas\core\generic.py", line 5139, in __getattr__
    if self._info_axis._can_hold_identifiers_and_holds_name(name):
  [Previous line repeated 987 more times]
  File "C:\project\.venvw\lib\site-packages\pandas\core\generic.py", line 449, in _info_axis
    return getattr(self, self._info_axis_name)
  File "C:\project\.venvw\lib\site-packages\pandas\core\generic.py", line 5137, in __getattr__
    return object.__getattribute__(self, name)
  File "pandas\_libs\properties.pyx", line 62, in pandas._libs.properties.AxisProperty.__get__
  File "C:\project\.venvw\lib\site-packages\pandas\core\generic.py", line 5137, in __getattr__
    return object.__getattribute__(self, name)
RecursionError: maximum recursion depth exceeded while calling a Python object

Why am I not allowed to put the super() call second?


1 answer
User
Reply #1 · Posted 2024-10-03 19:32:31

Pandas is trying to determine whether self.title is a column of the DataFrame, because the assignment has to be handled differently if it is.

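Because you have not yet called the superclass constructor, the object is in an unsafe state that Pandas' attribute-handling logic was never designed to deal with. Pandas goes into infinite recursion trying to work out which columns this uninitialized object has. In the traceback this shows up as __getattr__ (generic.py, line 5139) reading self._info_axis and re-entering itself until the recursion limit is reached.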
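
The mechanism can be reproduced without pandas. The following is a minimal sketch, not pandas' actual code: the names Base, Good, Bad and _columns are hypothetical, and the hooks only imitate the shape seen in the traceback (a __setattr__ that first calls getattr, and a __getattr__ that depends on state created in __init__).

```python
class Base:
    """Toy stand-in (an assumption, not pandas code) whose attribute
    hooks rely on state that only __init__ creates."""

    def __init__(self, columns):
        # Bypass our own __setattr__ so that initialization itself is safe.
        object.__setattr__(self, "_columns", list(columns))

    def __getattr__(self, name):
        # Runs only when normal attribute lookup fails. On an uninitialized
        # object, self._columns is missing too, so reading it re-enters
        # __getattr__ again and again until the recursion limit is hit.
        if name in self._columns:
            return f"<column {name}>"
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Like the __setattr__ in the traceback, check for an existing
        # attribute before storing the new value.
        try:
            getattr(self, name)
        except AttributeError:
            pass
        object.__setattr__(self, name, value)


class Good(Base):
    def __init__(self, columns):
        super().__init__(columns)      # base state exists first
        self.title = "some dataframe"  # the hooks now work normally


class Bad(Base):
    def __init__(self, columns):
        self.title = "some dataframe"  # Base.__init__ has not run yet
        super().__init__(columns)


if __name__ == "__main__":
    print(Good(["a"]).title)
    try:
        Bad(["a"])
    except RecursionError as exc:
        print("RecursionError:", exc)
```

Good succeeds because _columns already exists when the hooks run. Bad fails on its very first assignment, before the base constructor is ever reached.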

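This is why many languages force the superclass constructor call to be the first thing that happens in a subclass constructor. Until the superclass constructor has finished, the superclass part of the object is not ready, so the object is in an unsafe state and subclass initialization cannot safely begin.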
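
Applied to the class from the question, that means calling super().__init__() before setting any attribute. The sketch below is one possible arrangement rather than the only fix. It assumes the subclassing pattern described in the pandas documentation (a _metadata list for custom attributes and a _constructor property). The title keyword argument and its default value are additions for illustration and are not part of the original code.

```python
import pandas as pd


class ExtendedDF(pd.DataFrame):
    # Register "title" as a custom attribute rather than a column, following
    # the subclassing pattern in the pandas documentation.
    _metadata = ["title"]

    @property
    def _constructor(self):
        # Lets pandas operations build ExtendedDF results instead of
        # plain DataFrame objects.
        return ExtendedDF

    def __init__(self, *args, title="some dataframe", **kwargs):
        # Initialize the DataFrame part first, so that pandas' attribute
        # hooks have the internal state they depend on.
        super().__init__(*args, **kwargs)
        # Only now is it safe to assign custom attributes.
        self.title = title


if __name__ == "__main__":
    somedf = pd.DataFrame({"a": [1, 2, 3]})
    extdf = ExtendedDF(somedf)
    print(extdf.title)
```

Accepting *args and **kwargs keeps the constructor compatible with the ways pandas itself may call _constructor internally.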
