How to add commas between JSON objects in a .txt file and then convert them to a JSON array in Python

Posted 2024-10-01 00:15:27


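I am reading a .txt file that contains JSON objects with no commas separating them. I want to add commas between the JSON objects and put them all into a JSON list or array.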

I tried json.loads, but I got a JSON decode error. So I realized I need to put commas between the different objects in the .txt file.
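For illustration, the error can be reproduced with a tiny input. This is only a sketch: the two-object string below is a hypothetical stand-in for the real file contents, and the variable names are made up for the example.

```python
import json

# Hypothetical stand-in for the file: two objects with nothing between them.
concatenated = '{"a": 1}{"b": 2}'

try:
    json.loads(concatenated)
except json.JSONDecodeError as exc:
    # json.loads parses the first object, then complains about what follows it.
    error_message = exc.msg    # "Extra data"
    error_position = exc.pos   # index where the second object starts

print(error_message, error_position)
```

The reported position marks where the first object ends, which is exactly where a separator (or a second parse) is needed.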

Here is a sample of the contents of the .txt file:

{
    "@mdate": "2011-01-11",
    "@key": "journals/acta/Saxena96",
    "author": {
        "ftail": "\n",
        "ftext": "Sanjeev Saxena"
    },
    "title": {
        "ftail": "\n",
        "ftext": "Parallel Integer Sorting and Simulation Amongst CRCW Models."
    },
    "pages": {
        "ftail": "\n",
        "ftext": "607-619"
    },
    "year": {
        "ftail": "\n",
        "ftext": "1996"
    },
    "volume": {
        "ftail": "\n",
        "ftext": "33"
    },
    "journal": {
        "ftail": "\n",
        "ftext": "Acta Inf."
    },
    "number": {
        "ftail": "\n",
        "ftext": "7"
    },
    "url": {
        "ftail": "\n",
        "ftext": "db/journals/acta/acta33.htmlfSaxena96"
    },
    "ee": {
        "ftail": "\n",
        "ftext": "http://dx.doi.org/10.1007/BF03036466"
    },
    "ftail": "\n",
    "ftext": "\n"
}{
    "@mdate": "2011-01-11",
    "@key": "journals/acta/Simon83",
    "author": {
        "ftail": "\n",
        "ftext": "Hans-Ulrich Simon"
    },
    "title": {
        "ftail": "\n",
        "ftext": "Pattern Matching in Trees and Nets."
    },
    "pages": {
        "ftail": "\n",
        "ftext": "227-248"
    },
    "year": {
        "ftail": "\n",
        "ftext": "1983"
    },
    "volume": {
        "ftail": "\n",
        "ftext": "20"
    },
    "journal": {
        "ftail": "\n",
        "ftext": "Acta Inf."
    },
    "url": {
        "ftail": "\n",
        "ftext": "db/journals/acta/acta20.htmlfSimon83"
    },
    "ee": {
        "ftail": "\n",
        "ftext": "http://dx.doi.org/10.1007/BF01257084"
    },
    "ftail": "\n",
    "ftext": "\n"
}

Expected result: the objects above, separated by commas and wrapped in a single JSON array.


2 answers

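If you can always guarantee that your JSON is formatted as in your example, i.e. a new JSON object starts on the same line where the previous one ends, and that closing brace is not indented, you can simply read the file into a buffer until you encounter such a line, send the buffer off for JSON parsing, then rinse and repeat: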

import json

parsed = []  # a list to hold individually parsed JSON objects
with open('path/to/your.json') as f:
    buffer = ''
    for line in f:
        if line[0] == '}':  # end of the current JSON object
            parsed.append(json.loads(buffer + '}'))
            buffer = line[1:]
        else:
            buffer += line

print(json.dumps(parsed, indent=2))  # just to make sure it all went well

This prints the parsed objects as a single, indented JSON array.

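If your case is not that clear-cut (e.g. you cannot predict the formatting), you can try an iterative/event-based JSON parser, which can tell you when a "root" object closes so that you can split the parsed JSON objects into a sequence.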

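Update: on second thought, you don't need anything beyond the built-in json module. Even if the concatenated JSON is not neatly formatted or indented, you can use json.JSONDecoder.raw_decode() (and its undocumented second argument) to walk through the data and iteratively look for valid JSON structures until you have traversed the whole file (or hit an error). For example: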

import json

parser = json.JSONDecoder()
parsed = []  # a list to hold individually parsed JSON structures
with open('test.json') as f:
    data = f.read()
head = 0  # current position in the data as we parse
while True:
    # start of the next JSON structure: whichever of '{' or '[' comes first
    starts = [i for i in (data.find('{', head), data.find('[', head)) if i != -1]
    if not starts:  # no more JSON structures
        break
    try:
        struct, head = parser.raw_decode(data, min(starts))
        parsed.append(struct)
    except json.JSONDecodeError:  # malformed JSON: stop parsing
        break

print(json.dumps(parsed, indent=2))  # make sure it all went well

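This should give the same result as above, but this time without relying on } being the first character of a new line whenever a JSON object "closes". It should also work for JSON arrays stacked one after another.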
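The same raw_decode idea can be packaged as a small generator. This is a minimal sketch, assuming the whole text fits in memory; the name iter_concatenated_json and the sample string are hypothetical and not part of the original answer. It skips only whitespace between values and lets json.JSONDecodeError propagate, so malformed input is reported instead of silently ending the loop.

```python
import json
from typing import Any, Iterator


def iter_concatenated_json(text: str) -> Iterator[Any]:
    """Yield each top-level JSON value found back to back in `text`."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while True:
        # Skip any whitespace separating one value from the next.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            return
        # raw_decode returns the parsed value and the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Hypothetical input: objects and an array, with and without whitespace between.
sample = '{"a": 1}{"b": [2, 3]}\n[4, 5] {"c": null}'
values = list(iter_concatenated_json(sample))
as_array = json.dumps(values)  # one JSON array containing every value
print(as_array)
```

To get the array into a file rather than a string, json.dump(values, f) on an open file handle does the same serialization.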

You can use a regexp to add commas between the objects:

import re

with open('name.txt', 'r') as src, open('out.txt', 'w') as dst:
    dst.write("[\n")
    for line in src:
        # insert a comma wherever one object ends and the next begins on the same line
        line = re.sub(r'\}\{', '},{', line)
        dst.write('    ' + line)
    dst.write("]\n")
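A quick way to check the regexp approach is to apply it to a whole string and load the result. The sketch below is a variation, not the answer's exact code: the helper join_with_commas, the \s* that also allows whitespace or newlines between objects, and both sample strings are assumptions made for illustration. It also shows a caveat: a textual substitution cannot tell a real object boundary from "}{" inside a string value.

```python
import json
import re


def join_with_commas(text: str) -> str:
    """Insert commas between adjacent objects and wrap everything in [ ]."""
    body = re.sub(r'\}\s*\{', '},{', text.strip())
    return '[' + body + ']'


# Hypothetical input: three objects, separated by nothing or by a newline.
sample = '{"a": 1}{"b": 2}\n{"c": 3}'
objects = json.loads(join_with_commas(sample))
print(objects)

# Caveat: the pattern also matches inside string values and changes them.
tricky = '{"s": "}{"}'
altered = json.loads(join_with_commas(tricky))
print(altered[0]["s"])  # no longer the original "}{"
```

If string values may contain braces, a parser-based approach such as raw_decode is the safer choice.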
