<p>This is not a perfect answer, but it should work.
First install these two modules,
<a href="http://docs.python-requests.org/en/master/" rel="nofollow noreferrer">requests</a> and <a href="https://www.crummy.com/software/BeautifulSoup/bs4/doc/" rel="nofollow noreferrer">BS4</a>:</p>
<blockquote>
<p>pip install requests</p>
<p>pip install beautifulsoup4</p>
</blockquote>
<pre><code>import requests
from bs4 import BeautifulSoup
# Browser-like request headers
headers={
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'Referer': 'https://www.ebay.com/',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.8',
'Host': 'www.ebay.com',
'Connection': 'keep-alive',
'Cache-Control': 'max-age=0',
}
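# My local proxy; if you are not using one, remove proxies=proxy
# (and verify=False, which disables TLS certificate checks) from the calls below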
proxy={
'https':'127.0.0.1:8888'
}
# Search term
search_term='armani'
# Start a session so cookies persist between requests
ses=requests.Session()
# Fetch the home page first so the session picks up its cookies
resp=ses.get('https://www.ebay.com/',headers=headers,proxies=proxy,verify=False)
# Then fetch the search results page, which is the one we parse
resp=ses.get('https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2374313.m570.l1313.TR12.TRC2.A0.H0.X'+search_term+'.TRS0&_nkw='+search_term+'&_sacat=0',
    headers=headers,proxies=proxy,verify=False)
soup = BeautifulSoup(resp.text, 'html.parser')
items=soup.find_all('a', { "class" : "vip" })
price_items=soup.find_all('span', { "class" : "amt" })
final_list=list()
for item,price in zip(items,price_items):
    try:
        title=item.getText()
        price_val=price.find('span',{"class":"bold"}).getText()
        final_list.append((title,price_val))
    except AttributeError:
        # price.find() returned None (no bold span inside), so skip this listing
        continue
print(final_list)
</code></pre>
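<p>The script above concatenates <code>search_term</code> straight into the URL, which is fine for a single word like <code>armani</code> but not for terms containing spaces or special characters. Below is a minimal sketch of one way to percent-encode the term using only the standard library. The helper name <code>build_search_url</code> is hypothetical, and it assumes the tracking parameters (<code>_from</code>, <code>_trksid</code>) can be left out, which I have not verified against the site.</p>

```python
from urllib.parse import urlencode

# Search endpoint taken from the URL used in the script above.
SEARCH_ENDPOINT = 'https://www.ebay.com/sch/i.html'


def build_search_url(search_term, category=0):
    """Return a search URL with the term safely encoded (hypothetical helper)."""
    # _nkw and _sacat are the query parameters that appear in the original URL.
    # Assumption: the _from and _trksid tracking parameters are optional.
    query = urlencode({'_nkw': search_term, '_sacat': category})
    return SEARCH_ENDPOINT + '?' + query


print(build_search_url('armani watch'))
```

<p>The returned string could then be passed to <code>ses.get(...)</code> in place of the hand-built URL.</p>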
<p>This is the result I got:</p>
<p><a href="https://i.stack.imgur.com/YbFom.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/YbFom.png" alt="enter image description here"/></a></p>