<p>I am learning some techniques on a sample website that uses JSON. Take, for example, this site: <a href="http://www.charitystars.com/product/juve-chelsea-3-0-champions-league-jersey-autographed-by-giorgio-chiellini" rel="nofollow noreferrer">http://www.charitystars.com/product/juve-chelsea-3-0-champions-league-jersey-autographed-by-giorgio-chiellini</a>. The page source is at <code>view-source:https://www.charitystars.com/product/juve-chelsea-3-0-champions-league-jersey-autographed-by-giorgio-chiellini</code>. I want to get the information on lines 388-396:</p>
<pre><code><script>
var js_data = {"first_time_bid":true,"yourbid":0,"product":{"id":55,"item_number":"P55","type":"PRODUCT","fixed":0,"price":1000,"tot_price":1000,"min_bid_value":1010,"currency":"EUR","raise_bid":10,"stamp_end":"2013-06-14 12:00:00","bids_number":12,"estimated_value":200,"extended_time":0,"url":"https:\/\/www.charitystars.com\/product\/juve-chelsea-3-0-champions-league-jersey-autographed-by-giorgio-chiellini","conversion_value":1,"eid":0,"user_has_bidded":false},"bid":{"id":323,"uid":126,"first_name":"Fabio","last_name":"Gastaldi","company_name":"","is_company":0,"title":"fab1","nationality":"IT","amount":1000,"max_amount":0,"table":"","stamp":1371166006,"real_stamp":"2013-06-14 01:26:46"}};
var p_currency = '€';
var conversion_value = '1';
var merch_items = [];
var gallery_items = [];
var inside_gala = false;
</script>
</code></pre>
<p>and save each quoted field (i.e. "id", "item_number", "type", …) in a variable of the same name.</p>
<p>So far I have managed to run the following:</p>
<pre><code>import json  # will be needed later to parse js_data
import urllib2  # Python 2; on Python 3 use urllib.request instead
from bs4 import BeautifulSoup

hdr = {"User-Agent": "My Agent"}
url = "http://www.charitystars.com/product/juve-chelsea-3-0-champions-league-jersey-autographed-by-giorgio-chiellini"
req = urllib2.Request(url, headers=hdr)  # the URL must be a quoted string
response = urllib2.urlopen(req)
htmlSource = response.read()
soup = BeautifulSoup(htmlSource, "html.parser")  # name the parser explicitly
title = soup.find_all("span", {"itemprop": "name"})  # get the title
script_soup = soup.find_all("script")
</code></pre>
<p>For some reason, script_soup contains a lot of information I don't need. I believe the part I need is in <code>script_soup[9]</code>, but I don't know how to access it (in an efficient way). I would really appreciate your help.</p>
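<p>Rather than hard-coding <code>script_soup[9]</code> (which breaks as soon as the page layout changes), one option is to run a regular expression over the raw HTML and hand the object literal to <code>json.loads</code>, since the value assigned to <code>js_data</code> is already valid JSON. A minimal sketch against a shortened copy of the snippet above; the regex pattern and variable names here are my own choices, not anything the site documents:</p>
<pre><code>import json
import re

# Shortened copy of the &lt;script&gt; block from the page source (lines 388-396).
html_source = '''&lt;script&gt;
var js_data = {"first_time_bid":true,"yourbid":0,"product":{"id":55,"item_number":"P55","type":"PRODUCT","price":1000,"currency":"EUR"}};
var p_currency = '€';
&lt;/script&gt;'''

# Grab the object literal assigned to js_data; the lazy match stops at the
# first "};", which here closes the whole object.
match = re.search(r'var js_data = (\{.*?\});', html_source, re.DOTALL)
js_data = json.loads(match.group(1))

# Every quoted key is now addressable by name.
product = js_data["product"]
print(product["id"], product["item_number"], product["type"])
# prints: 55 P55 PRODUCT
</code></pre>
<p>In your script you would pass <code>htmlSource</code> (decoded to text) to <code>re.search</code> instead of the inline string, skipping the <code>find_all("script")</code> indexing entirely.</p>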