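<p>Not all of the URLs you are looking for are stored in the HTML. A further request is needed, which returns the information as JSON, and that request requires a session ID. For example:</p>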
<pre><code>from bs4 import BeautifulSoup
import requests
import json

url = 'https://www.americanexpress.com/in/credit-cards/all-cards/?sourcecode=A0000FCRAA&cpid=100370494&dsparms=dc_pcrid_408453063287_kword_american%20express%20credit%20card_match_e&gclid=Cj0KCQiApY6BBhCsARIsAOI_GjaRsrXTdkvQeJWvKzFy_9BhDeBe2L2N668733FSHTHm96wrPGxkv7YaAl6qEALw_wcB&gclsrc=aw.ds'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')

# Find the inline script that holds the page data and read the session ID from it
for script in soup.find_all('script'):
    if script.contents and "intlUserSessionId" in script.contents[0]:
        # The JSON object starts at the first '{' in the script text
        json_raw = script.contents[0][script.contents[0].find('{'):]
        json_data = json.loads(json_raw)
        session_id = json_data["pageData"]["pageValues"]["intlUserSessionId"]
        break

# Request the card data as JSON, passing the session ID as a query parameter
url2 = 'https://acquisition-1.americanexpress.com/api/acquisition/digital/v1/shop/us/cardshop-api/api/v1/intl/content/compare-cards/in/default'
r2 = requests.get(url2, params={'sessionId': session_id})
json_data = r2.json()

for entry in json_data:
    # First call-to-action link for the card
    cta_group = entry["ctaGroup"][0]
    click_url = cta_group['clickUrl']
    print(f"{cta_group['text']} - {click_url}")

    # "Learn more" link for the card
    learn_more = entry['learnMore']['ctaGroup'][0]
    print(f"{learn_more['text']} - {learn_more['clickUrl']}")
</code></pre>
<p>This gives you the following links:</p>
<pre class="lang-none prettyprint-override"><code>Apply Now - https://global.americanexpress.com/acq/intl/dpa/japa/ind/pers/begin.do?perform=IntlEapp:IND:membershiprewards_credit&feePay=P1
Learn more - credit-cards/membership-rewards-card/
Apply Now - https://global.americanexpress.com/acq/intl/dpa/japa/ind/pers/begin.do?perform=IntlEapp:IND:travel_platinum&feePay=T1
Learn more - credit-cards/platinum-travel-credit-card/
Apply Now - https://global.americanexpress.com/acq/intl/dpa/japa/ind/pers/begin.do?perform=IntlEapp:IND:gold_charge&feePay=G4&intlink=mainapplynow
Learn more - charge-cards/gold-card/
Apply Now - https://global.americanexpress.com/acq/intl/dpa/japa/ind/pers/begin.do?perform=IntlEapp:IND:platinum_reserve&feePay=LV&intlink=mainapplynow
Learn more - credit-cards/platinum-reserve-credit-card/
Learn more - credit-cards/jet-airways-platinum-credit-card/
Learn more - credit-cards/jet-airways-platinum-credit-card/
Apply Now - https://global.americanexpress.com/acq/intl/dpa/japa/ind/pers/begin.do?perform=IntlEapp:IND:platinum_charge
Learn more - charge-cards/platinum-card/
Learn more - credit-cards/payback-card/
Learn more - credit-cards/payback-card/
Apply Now - https://global.americanexpress.com/acq/intl/dpa/japa/ind/pers/begin.do?perform=IntlEapp:IND:smart_earn&feepay=ES1
Learn more - credit-cards/smart-earn-credit-card/
</code></pre>
<p>The <code>Learn more</code> URLs are relative, so the site's base URL needs to be prepended to them.</p>
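<p>As a minimal sketch of that last step, the relative paths could be resolved with <code>urljoin</code> from the standard library. The base URL used here (<code>https://www.americanexpress.com/in/</code>) is an assumption based on the page URL above, and <code>absolute_url</code> is a hypothetical helper name, so check the result against the links on the actual site:</p>
<pre><code>from urllib.parse import urljoin

# Assumption: relative "Learn more" paths resolve against the India site root.
BASE_URL = 'https://www.americanexpress.com/in/'


def absolute_url(click_url, base=BASE_URL):
    """Join a relative path onto base; absolute URLs are returned as they are."""
    return urljoin(base, click_url)


print(absolute_url('credit-cards/membership-rewards-card/'))
print(absolute_url('https://global.americanexpress.com/acq/intl/dpa/japa/ind/pers/begin.do?perform=IntlEapp:IND:platinum_charge'))
</code></pre>
<p>Because <code>urljoin</code> leaves absolute URLs alone, the same helper can be applied to both the <code>Apply Now</code> and the <code>Learn more</code> values without checking which kind each one is.</p>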