How to parse an XBRL string (not an XML file) with XBRLParser

from xbrl import XBRLParser
import base64

decoded = base64.decodebytes(data[0].text.encode())  # ---> decoded has the XBRL content
# data = decoded.find('xbrl')
# dom = xml.dom.minidom.parseString(decoded)  # or xml.dom.minidom.parseString(xml_string)
# pretty_xml_as_string = dom.toprettyxml()
# print(pretty_xml_as_string)
xbrl_parser = XBRLParser()
xbrl = xbrl_parser.parse(decoded)
# ---> File "/Users/~/Downloads/venv/lib/python3.7/site-packages/xbrl/xbrl.py", line 64, in parse
#     file_handler = open(file_handle)

1 answer

User

#1 · Posted on 2024-10-03 17:28:49

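After experimenting with xbrl.py (the source code of the xbrl library), I came up with the following solution:

In the parse function of xbrl.py, I commented out the lines that open the XML file before it is passed to XBRLPreprocessedFile. parse now passes its argument straight to XBRLPreprocessedFile without opening it. In XBRLPreprocessedFile, I changed the line xbrl_string = self.fh.read() to xbrl_string = self.fh, because I am passing in a string rather than a file.

In my own code, I recreated the Custom class that is defined in xbrl.py and passed decoded.decode('utf-8') to parse.

My code: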

class Custom(object):

    def __init__(self):
        return None

    def __call__(self):
        return self.__dict__.items()

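import base64
import re

from xbrl import XBRLParser

# `data` comes from the surrounding program (see the question);
# data[0].text holds the base64-encoded XBRL document
decoded = base64.decodebytes(data[0].text.encode())
xbrl_parser = XBRLParser()
# passing a str works only with the patched xbrl.py described above
xbrl = xbrl_parser.parse(decoded.decode("utf-8"))
# *** here I find all the tags
custom_obj = Custom()
custom_data = xbrl.find_all(re.compile(r'^((?!(us-gaap|dei|xbrll|xbrldi)).)*:\s*',
                                       re.IGNORECASE | re.MULTILINE))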
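To see which tag names the pattern passed to find_all selects, it can be tried on its own with plain `re`. This is a minimal sketch: the tag names below are invented for illustration, and it assumes the pattern is applied to each tag name with search semantics.

```python
import re

# Same pattern as in the answer, written as a raw string.
CUSTOM_TAG = re.compile(r'^((?!(us-gaap|dei|xbrll|xbrldi)).)*:\s*',
                        re.IGNORECASE | re.MULTILINE)

# Hypothetical tag names, for illustration only.
names = [
    "us-gaap:assets",
    "dei:entityregistrantname",
    "xbrldi:explicitmember",
    "abc:customrevenue",
    "xbrli:context",
    "assets",
]

matches = [name for name in names if CUSTOM_TAG.search(name)]
print(matches)  # ['abc:customrevenue', 'xbrli:context']
```

Names without a prefix and names prefixed with us-gaap, dei, or xbrldi are skipped. Because the pattern lists xbrll, a name such as xbrli:context is not excluded and still matches.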

The parse function in xbrl.py:

def parse(self, file_handle):
    """
    parse is the main entry point for an XBRLParser. It takes a file
    handle. "*** which now takes a string ***"
    """

    xbrl_obj = XBRL()

    # if no file handle was given create our own
    """if not hasattr(file_handle, 'read'):
        file_handler = open(file_handle)
    else:
        file_handler = file_handle"""

    # Store the headers
    xbrl_file = XBRLPreprocessedFile(file_handle)

    xbrl = soup_maker(xbrl_file.fh)
    # file_handler.close()
    xbrl_base = xbrl.find(name=re.compile("xbrl*:*"))

    if xbrl.find('xbrl') is None and xbrl_base is None:
        raise XBRLParserException('The xbrl file is empty!')

    # lookahead to see if we need a custom leading element
    lookahead = xbrl.find(name=re.compile("context",
                          re.IGNORECASE | re.MULTILINE)).name
    if ":" in lookahead:
        self.xbrl_base = lookahead.split(":")[0] + ":"
    else:
        self.xbrl_base = ""

    return xbrl
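The commented-out check in parse suggests an alternative that may avoid editing the library at all: the original code only calls open() when its argument has no read attribute, so a string wrapped in io.StringIO should be treated as an already-open file. The sketch below shows only the wrapping step; as_file_like is a hypothetical helper name, and the sample payload is a stand-in, not real filing data.

```python
import base64
import io


def as_file_like(b64_text, encoding="utf-8"):
    """Decode base64 text and wrap the result in an in-memory text file."""
    raw = base64.decodebytes(b64_text.encode("ascii"))
    return io.StringIO(raw.decode(encoding))


# Stand-in payload for illustration only.
sample_xml = "<xbrl><context id='c1'/></xbrl>"
sample_b64 = base64.encodebytes(sample_xml.encode("utf-8")).decode("ascii")

handle = as_file_like(sample_b64)
print(hasattr(handle, "read"))  # the attribute the original parse() checks for
print(handle.read())
```

Under that assumption, XBRLParser().parse(as_file_like(data[0].text)) would take the place of the patched call. This has not been checked against every version of the library.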

The change to XBRLPreprocessedFile in xbrl.py:

# before
xbrl_string = self.fh.read()
# after
xbrl_string = self.fh
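With the one-line change above, the patched code accepts only strings and no longer works with file handles. A slightly more defensive variant could accept both. The following is a sketch using a hypothetical helper, read_source, which is not part of the xbrl library.

```python
import io


def read_source(source, encoding="utf-8"):
    """Return XBRL text from a file-like object, bytes, or a plain string."""
    if hasattr(source, "read"):
        source = source.read()          # file-like: read its contents
    if isinstance(source, bytes):
        source = source.decode(encoding)  # bytes: decode to text
    return source


print(read_source("<xbrl/>"))
print(read_source(b"<xbrl/>"))
print(read_source(io.StringIO("<xbrl/>")))
```

In the patched file, xbrl_string = read_source(self.fh) would then stand in for the changed line, assuming the rest of the code expects a str at that point.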
