How to recover malformed JSON in Python

Posted 2024-09-30 18:21:13


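I have to work with a database whose JSON is malformed. When I try to read it back one line at a time in Python, it fails, because there is always something wrong with the data.

[Image: a few lines from the database]

I don't know how to recover it so that I can process this database.

I tried this: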

my_json = {
    "name": "Kha\u0304lid Muh\u0323ammad \u02bbAli\u0304 al-H\u0323a\u0304jj",
    "personal_name": "Kha\u0304lid Muh\u0323ammad \u02bbAli\u0304 al-H\u0323a\u0304jj",
    "last_modified": {
        "type": "/type/datetime",
        "value": "2008-08-20T17:57:09.66187"
    },
    "key": "/authors/OL1000057A",
    "type": {
        "key": "/type/author"
    },
    "revision": 2
}

name = my_json.get('name', "")
print(name)

But it does not work when there is more than one name.

Thanks for your attention.


3 Answers

Here is an example that works on a string:

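import json

my_json = '{"name": "Halo","action": "what?","name": "haha","action": "what?","name": "zzzz","action": "what?"}'

def handle(lst):
    # lst is the list of (key, value) pairs of one JSON object, duplicates included.
    # Values of a repeated key are collected into a list instead of being overwritten.
    result = {}
    count = {}
    for key, val in lst:
        count[key] = count.get(key, 0) + 1
        if key in result:
            if count[key] > 2:
                # Third or later occurrence: the list already exists
                result[key].append(val)
            else:
                # Second occurrence: turn the single value into a list
                result[key] = [result[key], val]
        else:
            result[key] = val
    return result

my_json = json.loads(my_json, object_pairs_hook=handle)
print(my_json['name'])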

The result will be ['Halo', 'haha', 'zzzz'].

If you have a JSON file instead:

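import json

def handle(lst):
    # lst is the list of (key, value) pairs of one JSON object, duplicates included.
    # Values of a repeated key are collected into a list instead of being overwritten.
    result = {}
    count = {}
    for key, val in lst:
        count[key] = count.get(key, 0) + 1
        if key in result:
            if count[key] > 2:
                # Third or later occurrence: the list already exists
                result[key].append(val)
            else:
                # Second occurrence: turn the single value into a list
                result[key] = [result[key], val]
        else:
            result[key] = val
    return result

with open("./a.json", "r") as f:
    my_json = json.load(f, object_pairs_hook=handle)
print(my_json['name'])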

More about object_pairs_hook
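As a rough illustration of what the hook receives: `json.loads` hands each decoded object to `object_pairs_hook` as a list of `(key, value)` pairs in document order, duplicates included, and uses whatever the hook returns in place of the usual `dict`. Without a hook, a repeated key simply keeps its last value. The function name `show_pairs` and the sample text are made up for this sketch.

```python
import json

text = '{"name": "Halo", "name": "haha", "action": "what?"}'

# Default behaviour: a repeated key keeps only the last value seen.
plain = json.loads(text)

def show_pairs(pairs):
    # `pairs` is a list of (key, value) tuples, duplicates included.
    return list(pairs)

# With the hook, nothing is dropped: the raw pairs come back as they were read.
pairs = json.loads(text, object_pairs_hook=show_pairs)

print(plain)
print(pairs)
```

Seeing the raw pairs first can help when deciding how duplicates should be merged, for example into a list as `handle` does above.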

OK, here is an example that splits a string containing many JSON objects:

my_json = '{"name": "Kha\u0304lid Muh\u0323ammad \u02bbAli\u0304 al-H\u0323a\u0304jj",\
"personal_name": "Jacques", "last_modified": {"type": "/type/datetime", "value":\
"2008-08-20T17:57:09.66187"}, "key": "/authors/OL1000057A", "type": {"key":\
"/type/author"}, "revision": 2},{"name": "sdsdsdzzdzdfdfe", "personal_name":\
"Kha\u0304lid Muh\u0323ammad \u02bbAli\u0304 al-H\u0323a\u0304jj", "last_modified": {"type":\
"/type/datetime", "value": "2008-08-20T17:57:09.66187"}, "key": "/authors/OL1000057A",\
"type": {"key": "/type/author"}, "revision": 2}'
# Note: this assumes exactly one separator character between objects
# and no "{" or "}" inside string values.
JsonList = []
Stack = []              # one entry per currently open "{"
LastJsonEndIndex = 0    # index at which the next object starts
PassADot = False        # True right after an object closes: skip the separator
for i, ch in enumerate(my_json):
    if PassADot:
        PassADot = False
        continue
    if ch == "{":
        Stack.append(ch)
    elif ch == "}":
        Stack.pop()
        if not Stack:
            # Back at nesting depth 0: one complete object ends here
            JsonList.append(my_json[LastJsonEndIndex:i+1])
            LastJsonEndIndex = i + 2
            PassADot = True
print(JsonList)

Each element of JsonList is a complete JSON string. You can save each element to its own JSON file and then run the code from my first answer on it.
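An alternative sketch, not the method used in this answer: the standard library's `json.JSONDecoder.raw_decode` parses one value starting at a given index and returns it together with the index where parsing stopped, so braces inside string values do not confuse it. The helper name `split_json_stream` and the sample data are made up here, and skipping commas between objects is an assumption based on the separators seen in the example above.

```python
import json

def split_json_stream(text):
    """Yield each top-level JSON value found in `text`, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip whitespace
        # (and, by assumption, stray commas) between objects ourselves.
        while pos < end and (text[pos].isspace() or text[pos] == ","):
            pos += 1
        if pos >= end:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Hypothetical sample: three objects, one with braces inside a string value.
sample = '{"name": "a {b}", "revision": 2},{"name": "c"} {"name": "d"}'
records = list(split_json_stream(sample))
print(records)
```

If duplicate keys also need to be kept, `json.JSONDecoder` accepts the same `object_pairs_hook` argument as `json.loads`.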

To save them:

my_json = r'''{"name": "Nazami\u0304 Ra\u0304maji\u0304", "personal_name": "Nazami\u0304 Ra\u0304maji\u0304", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:00:41.270799"}, "key": "/authors/OL1001461A", "type": {"key": "/type/author"}, "revision": 2} {"name": "Harald A. Enge", "personal_name": "Harald A. Enge", "created": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "alternate_names": ["Harald A Enge"], "last_modified": {"type": "/type/datetime", "value": "2013-02-25T09:47:06.574533"}, "latest_revision": 3, "key": "/authors/OL1001542A", "type": {"key": "/type/author"}, "revision": 3} {"name": "Umu Hilmy", "personal_name": "Umu Hilmy", "last_modified": {"type": "/type/datetime", "value": "2008-09-08T16:20:28.105165"}, "key": "/authors/OL100223A", "type": {"key": "/type/author"}, "revision": 2} {"name": "Ismail Ibrahim Dr.", "title": "Dr.", "personal_name": "Ismail Ibrahim", "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "key": "/authors/OL100304A", "type": {"key": "/type/author"}, "revision": 1} {"bio": {"type": "/type/text", "value": "> \"Eversley, William Pinder, B.C.L. Queen's Coll., Oxon, M.A., a member of the South-eastern circuit, reporter for Law Times in Queen's Bench division, a student of the Inner Temple 14 April, 1874 (then aged 23), called to the bar 25 April, 1877 (eldest son of William Eversley, Esq., of London); born \u2060, 1851. \r\n> \r\n> 7, King's Bench Walk, Temple, E.C.\" \r\n> ...[in Foster's Men at the Bar][1]\r\n\r\n\r\n \r\n\r\n[1]: https://en.wikisource.org/wiki/Men-at-the-Bar/Eversley,_William_Pinder \"Men at the Bar\""}, "name": "William Pinder Eversley", "created": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "death_date": "1918", "photos": [6897255, 6897254], "last_modified": {"type": "/type/datetime", "value": "2018-07-31T15:39:07.982159"}, "latest_revision": 6, "key": "/authors/OL1003081A", "birth_date": "1851", "personal_name": "William Pinder Eversley", "type": {"key": "/type/author"}, "revision": 6} {"name": "Valerie Meyer", "personal_name": "Valerie Meyer", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:22:33.63997"}, "key": "/authors/OL1004062A", "type": {"key": "/type/author"}, "revision": 2} {"name": "Ticonius", "created": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "personal_name": "Ticonius", "birth_date": "4th cent.", "last_modified": {"type": "/type/datetime", "value": "2013-02-25T09:53:08.232734"}, "latest_revision": 2, "key": "/authors/OL1004101A", "date": "4th cent", "type": {"key": "/type/author"}, "revision": 2} {"name": "Abdul Kahar Muzakar", "created": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "death_date": "1965", "last_modified": {"type": "/type/datetime", "value": "2017-03-31T12:48:41.973551"}, "latest_revision": 4, "key": "/authors/OL100450A", "birth_date": "1921", "personal_name": "Abdul Kahar Muzakar", "remote_ids": {"viaf": "11565164", "wikidata": "Q4665459"}, "type": {"key": "/type/author"}, "revision": 4} {"name": "Sheshadri Narayanan", "personal_name": "Sheshadri Narayanan", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:36:13.030909"}, "key": "/authors/OL1005861A", "birth_date": "1936", "type": {"key": "/type/author"}, "revision": 2} {"name": "M. W. Nuttall", "personal_name": "M. W. Nuttall", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:36:46.701507"}, "key": "/authors/OL1005942A", "birth_date": "1947", "type": {"key": "/type/author"}, "revision": 2} {"name": "R.-R Renella", "personal_name": "R.-R Renella", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:40:50.827135"}, "key": "/authors/OL1006482A", "birth_date": "1949", "type": {"key": "/type/author"}, "revision": 2} {"name": "Caesar A. Casanova", "personal_name": "Caesar A. Casanova", "last_modified": {"type": "/type/datetime", "value": "2008-09-08T16:24:07.101641"}, "key": "/authors/OL100656A", "birth_date": "1948", "type": {"key": "/type/author"}, "revision": 2} {"name": "Rodney Fitch", "personal_name": "Rodney Fitch", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:43:01.916355"}, "key": "/authors/OL1006767A", "type": {"key": "/type/author"}, "revision": 2} {"name": "Catherine Ingram", "links": [{"url": "http://catherineingram.com/biography.html", "type": {"key": "/type/link"}, "title": "Biography"}, {"url": "http://www.youtube.com/watch?v=4lJK9cfXP3c", "type": {"key": "/type/link"}, "title": "Interview on Consciousness TV"}, {"url": "http://www.huffingtonpost.com/catherine-ingram/", "type": {"key": "/type/link"}, "title": "Blog on Huffington Post"}], "created": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "personal_name": "Catherine Ingram", "last_modified": {"type": "/type/datetime", "value": "2013-04-05T06:41:53.345668"}, "latest_revision": 4, "key": "/authors/OL1006815A", "birth_date": "1952", "type": {"key": "/type/author"}, "revision": 4}'''
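# Note: this assumes exactly one separator character between objects
# and no "{" or "}" inside string values.
Stack = []              # one entry per currently open "{"
LastJsonEndIndex = 0    # index at which the next object starts
PassADot = False        # True right after an object closes: skip the separator
Count = 0               # number of files written so far
for i, ch in enumerate(my_json):
    if PassADot:
        PassADot = False
        continue
    if ch == "{":
        Stack.append(ch)
    elif ch == "}":
        Stack.pop()
        if not Stack:
            # Back at nesting depth 0: write this object to json<Count>.json
            with open("./json{}.json".format(Count), "w+") as f:
                f.write(my_json[LastJsonEndIndex:i+1])
            Count += 1
            LastJsonEndIndex = i + 2
            PassADot = True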

This generates several JSON files. You can then combine it with the code from my first answer.

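I think you may have misunderstood me. Your openlibrary5.json contains many JSON objects in a single file.

So if you load this file directly as one JSON document, like this,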

import json

with open("openlibrary5.json", "r") as f:
    yourJson = json.load(f)

an error will occur.
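A small sketch of that failure, using a made-up two-object string in place of the real file: the parser reads the first object successfully and then raises `json.JSONDecodeError` when it finds more text after it.

```python
import json

# Hypothetical stand-in for the file contents: two objects back to back.
text = '{"name": "a"} {"name": "b"}'

error = None
try:
    json.loads(text)
except json.JSONDecodeError as exc:
    # exc.pos is the index where the unexpected extra text starts.
    error = exc
    print("failed at character", exc.pos, "-", exc.msg)
```

The position reported in `exc.pos` marks where the second object begins, which is a quick way to confirm that a file holds several concatenated objects rather than one broken one.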

So I suggest slicing the contents into separate objects first:

with open("openlibrary5.json","r") as f:
    stringJson = f.read()

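Then follow my earlier answer to split the string into many JSON files (saving each slice separately). After slicing, each file holds one complete JSON object, so you can load it normally with json.load().

If you really want to gather them together again after slicing, you can open each sliced JSON file and put the loaded objects into one list (using list.append).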
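A minimal sketch of that gathering step, assuming the slices were saved as `json0.json`, `json1.json`, and so on, as in the splitting code above. The helper name `load_slices` is made up, and the demo writes two throwaway files into a temporary directory so that it runs on its own.

```python
import glob
import json
import os
import tempfile

def load_slices(pattern):
    """Load every file matching `pattern` and return the parsed objects as a list."""
    records = []
    # sorted() orders names as text, so json10.json would come before json2.json.
    for path in sorted(glob.glob(pattern)):
        with open(path, "r", encoding="utf-8") as f:
            records.append(json.load(f))
    return records

# Demo with two hypothetical slice files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for n, obj in enumerate([{"name": "a"}, {"name": "b"}]):
        slice_path = os.path.join(tmp, "json{}.json".format(n))
        with open(slice_path, "w", encoding="utf-8") as f:
            json.dump(obj, f)
    records = load_slices(os.path.join(tmp, "json*.json"))

print(records)
```

If the original order of the records matters and there are more than ten slices, zero-padding the file numbers when writing them keeps the text sort consistent with the numeric order.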

Does that make sense?
