How to parse an XBRL string (not an XML file) with XBRLParser

Posted 2024-10-03 17:28:49


I have a string containing XBRL content, returned by a GET request. How can I parse it with XBRLParser? My code is below:

from xbrl import XBRLParser
import base64

decoded = base64.decodebytes(data[0].text.encode())  # decoded now holds the XBRL content (as bytes)
# data = decoded.find('xbrl')
# dom = xml.dom.minidom.parseString(decoded)  # or xml.dom.minidom.parseString(xml_string)
# pretty_xml_as_string = dom.toprettyxml()

# print(pretty_xml_as_string)
xbrl_parser = XBRLParser()
xbrl = xbrl_parser.parse(decoded)
# The call above fails at:
#   File "/Users/~/Downloads/venv/lib/python3.7/site-packages/xbrl/xbrl.py", line 64, in parse
#     file_handler = open(file_handle)

I included the DOM lines to show that xml.dom.minidom has a parseString function. That is what I need, but for XBRLParser.


1 answer
User
Answer 1 · Posted 2024-10-03 17:28:49

After experimenting with xbrl.py (the source code of the xbrl library), I came up with the following solution:

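In the parse function in xbrl.py, I commented out the lines that open the XML file before it is handed to XBRLPreprocessedFile. parse now passes its argument straight to XBRLPreprocessedFile without opening anything. In XBRLPreprocessedFile, I changed the line xbrl_string = self.fh.read() to xbrl_string = self.fh, because I am passing a string rather than a file.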

In my own code, I recreated the Custom class that is defined in xbrl.py, and passed decoded.decode('utf-8') to parse.
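The decoding step mentioned above can be sketched in isolation. base64.decodebytes returns bytes, so the result has to be decoded to str before it is given to code that expects text. The sample XBRL snippet and the variable names here are made up for illustration; the real payload comes from the GET response.

```python
import base64

# Hypothetical sample payload standing in for data[0].text
sample_xbrl = '<xbrli:xbrl xmlns:xbrli="http://www.xbrl.org/2003/instance"></xbrli:xbrl>'
encoded_text = base64.encodebytes(sample_xbrl.encode("utf-8")).decode("ascii")

# Same two steps as in the answer: base64 text -> bytes -> str
decoded_bytes = base64.decodebytes(encoded_text.encode())
decoded_text = decoded_bytes.decode("utf-8")

print(type(decoded_bytes).__name__)
print(decoded_text == sample_xbrl)
```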

My code:

import base64
import re

from xbrl import XBRLParser


class Custom(object):

    def __init__(self):
        return None

    def __call__(self):
        return self.__dict__.items()


# data comes from the GET response, as in the question
decoded = base64.decodebytes(data[0].text.encode())
xbrl_parser = XBRLParser()
# the patched parse() receives the XBRL text itself, not a path or file handle
xbrl = xbrl_parser.parse(decoded.decode("utf-8"))
# *** here I find all the tags
custom_obj = Custom()
custom_data = xbrl.find_all(re.compile(r'^((?!(us-gaap|dei|xbrll|xbrldi)).)*:\s*',
                                       re.IGNORECASE | re.MULTILINE))
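
To see what the pattern passed to find_all selects, it can be tried on plain tag names with the re module alone. This is only a sketch: the tag names below are hypothetical, and it assumes the parsed object applies the compiled pattern to each tag name with search, which is how BeautifulSoup behaves as far as I know.

```python
import re

# Same pattern as in the code above
custom_tag = re.compile(r'^((?!(us-gaap|dei|xbrll|xbrldi)).)*:\s*',
                        re.IGNORECASE | re.MULTILINE)

# Hypothetical tag names for illustration
names = ["us-gaap:assets", "dei:documenttype", "aapl:customelement", "context"]

# Keep names that have a prefix, where the prefix is not one of the listed ones
matches = [name for name in names if custom_tag.search(name)]
print(matches)
```

The negative lookahead rejects any name in which a listed prefix appears before the colon, and names without a colon never match, so only prefixed names from other namespaces are kept.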

The parse function in xbrl.py:

def parse(self, file_handle):
    """
    parse is the main entry point for an XBRLParser. It takes a file
    handle. "*** which now takes a string ***"
    """

    xbrl_obj = XBRL()

    # if no file handle was given create our own
    """if not hasattr(file_handle, 'read'):
        file_handler = open(file_handle)
    else:
        file_handler = file_handle"""

    # Store the headers
    xbrl_file = XBRLPreprocessedFile(file_handle)

    xbrl = soup_maker(xbrl_file.fh)
    # file_handler.close()
    xbrl_base = xbrl.find(name=re.compile("xbrl*:*"))

    if xbrl.find('xbrl') is None and xbrl_base is None:
        raise XBRLParserException('The xbrl file is empty!')

    # lookahead to see if we need a custom leading element
    lookahead = xbrl.find(name=re.compile("context",
                          re.IGNORECASE | re.MULTILINE)).name
    if ":" in lookahead:
        self.xbrl_base = lookahead.split(":")[0] + ":"
    else:
        self.xbrl_base = ""

    return xbrl
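
The prefix detection at the end of parse can be looked at on its own. The sketch below mirrors that if/else on a plain tag name; leading_prefix is a hypothetical name, not a function in the library.

```python
def leading_prefix(tag_name):
    """Return the namespace prefix of a tag name, including the colon, or ''."""
    # Mirrors the if/else at the end of parse()
    if ":" in tag_name:
        return tag_name.split(":")[0] + ":"
    return ""


print(repr(leading_prefix("xbrli:context")))
print(repr(leading_prefix("context")))
```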

The change in the XBRLPreprocessedFile function in xbrl.py:

xbrl_string = self.fh.read()   # before
xbrl_string = self.fh          # after
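
A possible alternative that leaves the installed library untouched: the check that is commented out in parse only calls open() when its argument has no read attribute, so a file-like object wrapping the string should be accepted as it is. The sketch below assumes the unmodified library needs nothing beyond read() on that object. as_file_handle and parse_xbrl_text are hypothetical helpers, and I have not checked this against every version of the package.

```python
import io


def as_file_handle(xbrl_text):
    """Wrap XBRL text in an in-memory file object that has a read() method."""
    return io.StringIO(xbrl_text)


def parse_xbrl_text(xbrl_text):
    """Hypothetical helper: parse XBRL text with an unmodified XBRLParser."""
    # Imported here so the sketch can be loaded without the xbrl package installed
    from xbrl import XBRLParser
    return XBRLParser().parse(as_file_handle(xbrl_text))


handle = as_file_handle("<xbrl></xbrl>")
print(hasattr(handle, "read"))
```

With this, the call in my code would become parse_xbrl_text(decoded.decode("utf-8")), and neither parse nor XBRLPreprocessedFile would need to be edited.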
