How to get specific data from inside double curly braces in Python

Posted 2024-09-28 05:17:50


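I'm trying to get a specific piece of data from a string inside double curly braces {} on a website. How can I extract it? Here is the snippet from the site: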

<div class="swatch-data">
{"thumbnailImageUrl":"https://www.jbl.com.ph/dw/image/v2/AAUJ_PRD/on/demandware.static/-/Sites-masterCatalog_Harman/default/dw367304ef/JBL_Endurance-SPRINT_Product-Image_Black_Front-1605x1605px.jpg?sw=270&amp;sh=330&amp;sm=fit&amp;sfrm=png","productUrl":"https://www.jbl.com.ph/JBL+Endurance+SPRINT.html?cgid=in-ear-headphones&amp;dwvar_JBL%20Endurance%20SPRINT_color=Black-GLOBAL-","productSupportUrl":"","productID":"JBLENDURSPRINTBLK","orderable":false,"availability":{"message":"","status":"NOT_AVAILABLE"},"price":{"unitLabel":"each","priceType":"standard","salesPrice":"N/A"},"realprice":{"salesPrice":"N/A"},"badges":["new"],"buttonText":"Sold Out","showProdLimit":{"status":""},"CTAEnable":true,"commerceSiteFlag":false,"showPromoTimerFlag":false,"isProProd":false}
</div>

Thanks.

Edit: I did use BeautifulSoup4; I'm just a beginner and hadn't come across JSON yet.


2 Answers

An example using bs4

import bs4
import json

html = """
<div class="swatch-data">
{"thumbnailImageUrl":"https://www.jbl.com.ph/dw/image/v2/AAUJ_PRD/on/demandware.static/-/Sites-masterCatalog_Harman/default/dw367304ef/JBL_Endurance-SPRINT_Product-Image_Black_Front-1605x1605px.jpg?sw=270&amp;sh=330&amp;sm=fit&amp;sfrm=png","productUrl":"https://www.jbl.com.ph/JBL+Endurance+SPRINT.html?cgid=in-ear-headphones&amp;dwvar_JBL%20Endurance%20SPRINT_color=Black-GLOBAL-","productSupportUrl":"","productID":"JBLENDURSPRINTBLK","orderable":false,"availability":{"message":"","status":"NOT_AVAILABLE"},"price":{"unitLabel":"each","priceType":"standard","salesPrice":"N/A"},"realprice":{"salesPrice":"N/A"},"badges":["new"],"buttonText":"Sold Out","showProdLimit":{"status":""},"CTAEnable":true,"commerceSiteFlag":false,"showPromoTimerFlag":false,"isProProd":false}
</div>
"""

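# Parse the HTML (the 'lxml' parser requires the lxml package to be installed)
soup = bs4.BeautifulSoup(html, 'lxml')
# The div's text is a JSON string; json.loads turns it into a Python dict
js_data = json.loads(soup.find('div', class_='swatch-data').text)

# If you want productID, just look it up by key
print(js_data['productID'])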

Output

JBLENDURSPRINTBLK
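
Other values can be read the same way once the JSON is loaded. The following is a minimal sketch, assuming an abbreviated copy of the JSON text from the question; the `warranty` key is hypothetical and only illustrates how a missing key can be handled with `dict.get`.

```python
import json

# Abbreviated copy of the JSON text found inside the div in the question
raw = (
    '{"productID":"JBLENDURSPRINTBLK","orderable":false,'
    '"availability":{"message":"","status":"NOT_AVAILABLE"},'
    '"price":{"unitLabel":"each","priceType":"standard","salesPrice":"N/A"},'
    '"badges":["new"],"buttonText":"Sold Out"}'
)

js_data = json.loads(raw)

# Nested JSON objects become nested dicts, so chain the keys
status = js_data["availability"]["status"]
sales_price = js_data["price"]["salesPrice"]

# JSON arrays become lists; JSON true/false become Python True/False
first_badge = js_data["badges"][0]
orderable = js_data["orderable"]

# "warranty" is a hypothetical key that is not in the data;
# .get() returns the fallback instead of raising KeyError
warranty = js_data.get("warranty", "unknown")

print(status, sales_price, first_badge, orderable, warranty)
```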

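What you are looking at is actually JSON.

First you need to strip away the surrounding div. Using BeautifulSoup is one of the recommended ways to do that.

Then you can load the resulting string with json.loads(str).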
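
If installing BeautifulSoup is not an option, the same two steps can be sketched with the standard library alone. This swaps in `html.parser.HTMLParser` for BeautifulSoup, so it is an alternative rather than the method either answer uses. The names `SwatchDataParser` and `extract_swatch_data` are made up for this illustration, and `sample` is an abbreviated copy of the snippet from the question.

```python
import json
from html.parser import HTMLParser


class SwatchDataParser(HTMLParser):
    """Collects the text inside <div class="swatch-data"> elements."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into &
        super().__init__(convert_charrefs=True)
        self._depth = 0     # > 0 while inside a swatch-data div
        self._chunks = []   # text pieces of the div currently being read
        self.payloads = []  # one raw JSON string per matching div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        # Count nested divs too, so the right closing tag ends the capture
        if self._depth or "swatch-data" in classes:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.payloads.append("".join(self._chunks))
                self._chunks = []

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_swatch_data(html_text):
    """Step 1: strip the div. Step 2: json.loads each payload."""
    parser = SwatchDataParser()
    parser.feed(html_text)
    parser.close()
    return [json.loads(payload) for payload in parser.payloads]


# Abbreviated copy of the snippet from the question
sample = """
<div class="swatch-data">
{"productUrl":"https://www.jbl.com.ph/JBL+Endurance+SPRINT.html?cgid=in-ear-headphones&amp;dwvar_JBL%20Endurance%20SPRINT_color=Black-GLOBAL-","productID":"JBLENDURSPRINTBLK","availability":{"message":"","status":"NOT_AVAILABLE"}}
</div>
"""

records = extract_swatch_data(sample)
print(records[0]["productID"])
print(records[0]["availability"]["status"])
```

Because the page may contain one such div per product, the helper returns a list of dicts instead of a single dict.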
