Parsing the same JSON string multiple times in Python

Posted 2024-07-03 05:33:35


I need to parse the JSON output below so that I can extract the Title entries.

[{"Title":"000webhost","Name":"000webhost","Domain":"000webhost.com","BreachDate":"2015-03-01","AddedDate":"2015-10-26T23:35:45Z","ModifiedDate":"2015-10-26T23:35:45Z","PwnCount":13545468,"Description":"In approximately March 2015, the free web hosting provider <a href=\"http://www.troyhunt.com/2015/10/breaches-traders-plain-text-passwords.html\" target=\"_blank\" rel=\"noopener\">000webhost suffered a major data breach</a> that exposed over 13 million customer records. The data was sold and traded before 000webhost was alerted in October. The breach included names, email addresses and plain text passwords.","DataClasses":["Email addresses","IP addresses","Names","Passwords"],"IsVerified":true,"IsFabricated":false,"IsSensitive":false,"IsActive":true,"IsRetired":false,"IsSpamList":false,"LogoType":"png"},{"Title":"Lifeboat","Name":"Lifeboat","Domain":"lbsg.net","BreachDate":"2016-01-01","AddedDate":"2016-04-25T21:51:50Z","ModifiedDate":"2016-04-25T21:51:50Z","PwnCount":7089395,"Description":"In January 2016, the Minecraft community known as Lifeboat <a href=\"https://motherboard.vice.com/read/another-day-another-hack-7-million-emails-and-hashed-passwords-for-minecraft\" target=\"_blank\" rel=\"noopener\">was hacked and more than 7 million accounts leaked</a>. Lifeboat knew of the incident for three months before the breach was made public but elected not to advise customers. The leaked data included usernames, email addresses and passwords stored as straight MD5 hashes.","DataClasses":["Email addresses","Passwords","Usernames"],"IsVerified":true,"IsFabricated":false,"IsSensitive":false,"IsActive":true,"IsRetired":false,"IsSpamList":false,"LogoType":"svg"}]

To parse it, I am using the following code:

cat $myfile | python -c "import sys, json; print json.load(sys.stdin)[0]['Title']"

But this produces the output:

000webhost

whereas the output I need is:

000webhost

Lifeboat


2 Answers

If you want to show all the titles, you need to loop over the items in the array. Currently you are requesting only the first item, [0].

You can use a list comprehension to extract the titles in one line:

[item['Title'] for item in json.load(sys.stdin)]
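As a small self-contained illustration of the difference between indexing with [0] and the comprehension, here is a sketch that runs on a shortened copy of the question's data. The shortening (only the Title and Name fields are kept) and the variable names are assumptions made here for readability, not part of the original question.

```python
import json

# Abbreviated copy of the question's data: only two fields kept per entry
# (an assumption for illustration; the real records have many more fields).
raw = '[{"Title": "000webhost", "Name": "000webhost"}, {"Title": "Lifeboat", "Name": "Lifeboat"}]'

breaches = json.loads(raw)

# Indexing with [0] reads only the first object in the array.
first_title = breaches[0]['Title']

# The comprehension visits every object in the array.
all_titles = [item['Title'] for item in breaches]

print(first_title)
print(all_titles)
```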

Then loop over them to print each title on its own line:

for title in [item['Title'] for item in json.load(sys.stdin)]: print title

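So the complete command-line script follows. A for statement cannot come after ; on the same line, so the titles are joined with newlines and printed in a single call:

cat $myfile | python -c "import sys, json; print('\n'.join(item['Title'] for item in json.load(sys.stdin)))"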

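You really should do this with a proper script. Also, this is a useless use of cat, and you should put Bash parameter expansions inside double quotes to prevent word splitting. If you are sure the path contains no spaces you can omit the quotes, but that is not a good habit.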

Anyway, this code works in both Python 2 and Python 3.

python -c "import sys,json;print('\n'.join([u['Title']for u in json.load(open(sys.argv[1]))]))" "$myfile"

Output

000webhost
Lifeboat

Here is how to write it as a proper script.

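import sys
import json

# Load the JSON array from the file named on the command line.
with open(sys.argv[1]) as f:
    data = json.load(f)
# Print each entry's Title on its own line.
print('\n'.join([u['Title'] for u in data]))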
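If some entries might lack a Title key, the lookup above would raise KeyError. The sample data does not show such entries, so this is only an assumption. A slightly more defensive sketch could skip them. The function name extract_titles and the inline sample string are hypothetical, introduced here for illustration only.

```python
import json


def extract_titles(text):
    """Return the 'Title' of every object in a JSON array string."""
    entries = json.loads(text)
    # Skip entries without a 'Title' key instead of raising KeyError.
    return [entry['Title'] for entry in entries if 'Title' in entry]


# Hypothetical sample: the last entry deliberately has no 'Title' key.
sample = '[{"Title": "000webhost"}, {"Title": "Lifeboat"}, {"Name": "untitled"}]'

titles = extract_titles(sample)
print('\n'.join(titles))
```

Whether silently skipping such entries is the right behavior depends on the data source. Raising an error may be preferable when a missing Title indicates a malformed response.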
