How do I extract one key/value from a JSON file into a .txt file?

Posted 2024-09-30 07:22:40


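I have about 400,000 JSON files. For each one, I want to extract a single key/value into a .txt file. (One text file per JSON file; all the JSON files have the same format.) I have looked into various options using Python's pandas and json.loads, but I haven't found a working solution. (A short sample JSON file is included below in case it helps. I want to extract "plain_text".)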

{
  "resource_uri": "https://www.courtlistener.com/api/rest/v3/opinions/9/",
  "id": 9,
  "absolute_url": "/opinion/9/pleasures-of-san-patricio-inc-v-mendez-torres/",
  "cluster": "https://www.courtlistener.com/api/rest/v3/clusters/9/",
  "author": "https://www.courtlistener.com/api/rest/v3/people/3246/",
  "joined_by": [],
  "author_str": "",
  "per_curiam": false,
  "joined_by_str": "",
  "date_created": "2010-03-13T23:42:04Z",
  "date_modified": "2020-03-03T09:02:12.634972Z",
  "type": "010combined",
  "sha1": "ce7f1af3436b00021ea3ee2f13107456b383cfa2",
  "page_count": 17,
  "download_url": "http://www.ca1.uscourts.gov/pdf.opinions/08-2388P-01A.pdf",
  "local_path": "pdf/2010/02/22/Rocafort_v._Mendez-Torres.pdf",
  "plain_text": "  United States Court of Appeals\n           For the First Circuit\n\nNo. 08-2388\n\n",
  "opinions_cited": [
    "https://www.courtlistener.com/api/rest/v3/opinions/103838/",
    "https://www.courtlistener.com/api/rest/v3/opinions/110435/",
    "https://www.courtlistener.com/api/rest/v3/opinions/110749/"
  ]
}

The end result I want is one text file per JSON file, like this:

United States Court of Appeals
For the First Circuit

No. 08-2388


2 answers

If your JSON is well formed, you can use the json library to load it as a dictionary; each key in the JSON then becomes a key in the dictionary:

import json

# Load the JSON file into a dict, then read the value by its key.
with open('data.json', encoding='utf-8') as json_file:
    data = json.load(json_file)

print(data['plain_text'])

Output:

  United States Court of Appeals
           For the First Circuit

No. 08-2388

The text also contains some extra whitespace. To remove it, you can use:

import json

with open('data.json', encoding='utf-8') as json_file:
    data = json.load(json_file)

# Strip leading and trailing whitespace from each line before printing.
for line in data['plain_text'].split('\n'):
    print(line.strip())

Output:

United States Court of Appeals
For the First Circuit

No. 08-2388
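
To write the stripped text to a file rather than print it, the same per-line stripping can return a single string. This is a minimal sketch; the helper name `clean_text` is hypothetical and not part of the original answer.

```python
def clean_text(text):
    """Strip leading/trailing whitespace from every line, keeping line breaks."""
    return "\n".join(line.strip() for line in text.split("\n"))


sample = "  United States Court of Appeals\n           For the First Circuit\n\nNo. 08-2388\n\n"
cleaned = clean_text(sample)
print(cleaned)
```

The returned string can then be passed to `f.write(...)` in place of the raw `plain_text` value.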

Assuming all your JSON files are in the current directory and are all valid JSON, you can convert them all in one go:

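import glob
import json
import os.path

for full_name in glob.glob("*.json"):
    print(full_name)  # progress output only
    with open(full_name, encoding="utf-8") as json_file:
        dic = json.load(json_file)
    text = dic["plain_text"]
    name, _ = os.path.splitext(full_name)
    # Write with an explicit encoding so non-ASCII text does not fail on
    # platforms whose default encoding is not UTF-8.
    with open(name + ".txt", "w", encoding="utf-8") as f:
        f.write(text)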

Each output text file has the same name as the corresponding input JSON file, but with a .txt extension.

The print(full_name) line only reports which file is being processed; the program does not need it to run successfully.
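
With roughly 400,000 files, a single malformed file or a missing key would stop the loop above partway through. The following is a minimal sketch of a more tolerant variant, assuming the same one-directory layout; the function name `convert_directory` and its `key` parameter are hypothetical, introduced only for illustration. Files that cannot be converted are recorded and skipped instead of raising.

```python
import json
from pathlib import Path


def convert_directory(src_dir, key="plain_text"):
    """Write one .txt per .json file in src_dir; return (converted, skipped)."""
    converted = 0
    skipped = []  # (file name, reason) for every file that was not converted
    for json_path in sorted(Path(src_dir).glob("*.json")):
        try:
            with open(json_path, encoding="utf-8") as json_file:
                text = json.load(json_file)[key]
        except (ValueError, KeyError, TypeError) as exc:
            # ValueError covers invalid JSON and undecodable bytes;
            # KeyError/TypeError cover a missing key or a non-dict document.
            skipped.append((json_path.name, repr(exc)))
            continue
        if not isinstance(text, str):
            skipped.append((json_path.name, "value is not a string"))
            continue
        # Same base name as the input, with a .txt extension.
        json_path.with_suffix(".txt").write_text(text, encoding="utf-8")
        converted += 1
    return converted, skipped
```

Calling `convert_directory(".")` processes the current directory; the returned `skipped` list can be inspected afterwards to see which files need attention.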
