I'm using bs4 (BeautifulSoup) for Python and I want to get JSON out of a web page, but it is embedded in the page like this:
<script>
vtex.events.addData({"pageCategory":"Product","pageDepartment":"Calzado","pageUrl":"http://www.taf.com.mx/air-force-1-07-lv8-cu8070-100/p","pageTitle":"AIR FORCE 1 07 LV8 | MASCULINO - tafmx","skuStockOutFromShelf":[],"skuStockOutFromProductDetail":["23312","23313","23314","23316","23325","23326","23327","23328"],"shelfProductIds":["140","141","142","3775","3777","3782","3785","545","17","314","318","530","645","801","822","940"],"accountName":"tafmx","pageFacets":[],"productId":"3829","productReferenceId":"CU8070-100","productEans":["194502172393","194502172409","194502172416","194502172423","194502172430","194502172447","194502172454","194502172461","194502172478","194502172485","194502172492","194502172508","194502172515","194502172522","194502172539","194502172546","194502172553"],"skuStocks":{"23312":0,"23313":0,"23314":0,"23315":11,"23316":0,"23317":19,"23318":29,"23319":22,"23320":12,"23321":7,"23322":9,"23323":15,"23324":14,"23325":0,"23326":0,"23327":0,"23328":0},"productName":"AIR FORCE 1 07 LV8","productBrandId":2000004,"productBrandName":"Nike","productDepartmentId":7,"productDepartmentName":"Calzado","productCategoryId":8,"productCategoryName":"Sneakers","productListPriceFrom":"2199","productListPriceTo":"2199","productPriceFrom":"2199","productPriceTo":"2199","sellerId":"1","sellerIds":"1"});
</script>
I'm using BeautifulSoup for Python, but the tag has no class I can select on.
Thanks, everyone.
You can simply look the element up by its 'script' tag and keep its text as js_code.
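The answer's original snippet for this step did not survive, so here is a minimal sketch of what the lookup might look like. The html string is a shortened stand-in for the real page, and filtering on the text "vtex.events.addData" is an assumption about how to pick the right tag when a page has several scripts.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the page source; a real page would be downloaded first.
html = """
<html><body>
<script>var unrelated = 1;</script>
<script>
vtex.events.addData({"pageCategory":"Product","productId":"3829","productName":"AIR FORCE 1 07 LV8"});
</script>
</body></html>
"""

# "html.parser" ships with Python, so no extra parser package is needed.
soup = BeautifulSoup(html, "html.parser")

# Pages usually contain many <script> tags, so keep the first one
# whose text contains the addData call.
js_code = next(
    tag.string
    for tag in soup.find_all("script")
    if tag.string and "vtex.events.addData" in tag.string
)

print(js_code.strip())
```

If the page only had a single script tag, soup.find("script").string would be enough.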
The tricky part is then getting the JSON out of it. I rarely use regex for this kind of task, but this is one of the rare cases where it fits.
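The regex used in the original answer is not shown, so the pattern below is only one plausible sketch. It assumes the script text holds a single addData(...) call whose argument is one JSON object, and the js_code value is a shortened version of the data from the question.

```python
import re

# Shortened version of the script text from the question.
js_code = (
    'vtex.events.addData({"pageCategory":"Product",'
    '"skuStocks":{"23312":0,"23315":11},"productId":"3829"});'
)

# Greedy match: capture from the "{" right after "addData(" up to the
# last "}" that is followed by ")". re.DOTALL lets "." span newlines.
match = re.search(r"addData\((\{.*\})\)", js_code, re.DOTALL)
if match is None:
    raise ValueError("addData call not found in script text")

json_text = match.group(1)
print(json_text)
```

The greedy .* is deliberate here: a non-greedy match would stop at the first closing brace, which belongs to the nested skuStocks object rather than to the outer one.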
The string extracted this way can then be converted to a Python dict with json.loads.
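As an illustration of that last step, the sketch below parses a shortened copy of the extracted text and reads a few fields. The in_stock list is a hypothetical follow-up use, not part of the original answer.

```python
import json

# Shortened copy of the JSON text extracted from the script tag.
json_text = (
    '{"productId":"3829","productName":"AIR FORCE 1 07 LV8",'
    '"skuStocks":{"23312":0,"23315":11,"23317":19},"productBrandId":2000004}'
)

data = json.loads(json_text)

# JSON objects become dicts, strings stay strings, numbers become int/float.
print(data["productName"])
print(type(data["productBrandId"]))

# Hypothetical use: list the SKUs that still have stock.
in_stock = [sku for sku, qty in data["skuStocks"].items() if qty > 0]
print(in_stock)
```

Note that the keys of skuStocks remain strings after parsing, because JSON object keys are always strings.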
Note that if the script tag contains other content not shown in the question, you may need a more complex regular expression.
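If the script does hold extra code, one alternative to a more elaborate regex is to let the JSON parser find the end of the object itself. This is a different technique from the one in the answer: it uses json.JSONDecoder.raw_decode from the standard library, and the surrounding JavaScript in js_code is invented for the example.

```python
import json

# Invented script text with unrelated code before and after the call.
js_code = """
var unrelated = {"a": 1};
vtex.events.addData({"productId":"3829","skuStocks":{"23315":11}});
console.log("more code with } braces");
"""

marker = "vtex.events.addData("
start = js_code.index(marker) + len(marker)

# raw_decode parses one JSON value starting at index `start` and returns
# the value together with the index where parsing stopped, so trailing
# JavaScript after the object is simply ignored.
data, end = json.JSONDecoder().raw_decode(js_code, start)

print(data)
print(js_code[end:end + 2])
```

This only works when the argument is strict JSON. A JavaScript object literal with unquoted keys or single-quoted strings would still raise json.JSONDecodeError.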