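I want to scrape: https://www.loft.com/loft-plus-floral-maxi-shirtdress/514793
I have already scraped the description successfully. However, I cannot scrape the product images or the recommendations. The code below has worked on several other fashion sites, but it does not seem to work on this one.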
# Imports used by the snippets below
import time
import urllib.request

from bs4 import BeautifulSoup
from selenium import webdriver

# main method (runs inside a method of the scraper class)
d = webdriver.Chrome('/Users/fatima.arshad/Downloads/chromedriver')
d.get(url)
start = BeautifulSoup(d.page_source, 'html.parser')
Image_URL = self.saveImage("./products/", product_name, start)
recommendations = self.getRecommendations(start, d)

def getRecommendations(self, start, d):
    # code to scroll to the bottom of page
    recommended = []
    css_class = 'swiper-container swiper-container-horizontal'
    s = start.find_all('div', class_=css_class)
    attempts = 0
    # Re-parse the live page on each pass (the first snapshot never changes)
    # and give up after a few tries instead of looping forever.
    while not s and attempts < 10:
        time.sleep(1)
        start = BeautifulSoup(d.page_source, 'html.parser')
        s = start.find_all('div', class_=css_class)
        attempts += 1
    for data in s:
        for a in data.find_all('a'):
            print(a.get('href'))  # the link
            print(a.text)  # the text inside the link
            recommended.append("https://loft.com" + str(a.get('href')))
    return recommended

def saveImage(self, foldername, product_name, start):
    ## some other code
    c = 0  # file counter; set here in case the omitted code does not define it
    for i in start.find_all('div', class_='swiper-wrapper'):
        for img in i.select('img'):
            print(img['src'])
            urllib.request.urlretrieve("http://" + img['src'], foldername + "/" + product_name + str(c) + ".jpg")
            c = c + 1
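The problem is that neither method produces any results. I put a loop in getRecommendations() so that it would eventually pick something up, but it still finds nothing.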
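The links are built dynamically. In the browser's network tab you can see a GET request that retrieves, in JSON format, the information used to build the new image URLs.
You can mimic these steps yourself by issuing the same request and assembling the URLs from the response.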
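A minimal sketch of that approach follows. It is not the site's real API: `IMAGE_INFO_URL`, `IMAGE_BASE_URL`, the `{"images": [{"name": ...}]}` payload layout, and the idea that the last path segment of the product URL is the product id are all assumptions made for illustration. Copy the actual request URL and response fields from the network tab and adapt the two constants and `build_image_urls` accordingly.

```python
import json
import urllib.request

# HYPOTHETICAL placeholders: replace with the request seen in the network tab.
IMAGE_INFO_URL = "https://example.com/image-set/{product_id}.json"
IMAGE_BASE_URL = "https://example.com/images/"


def product_id_from_url(product_url):
    """Return the last path segment of the product URL (assumed to be the id)."""
    return product_url.rstrip("/").rsplit("/", 1)[-1]


def fetch_image_info(product_id, opener=urllib.request.urlopen):
    """GET the JSON document that describes the product's images."""
    url = IMAGE_INFO_URL.format(product_id=product_id)
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))


def build_image_urls(info, base_url=IMAGE_BASE_URL):
    """Build full image URLs from the assumed payload layout.

    Assumed layout: {"images": [{"name": "<image name>"}, ...]}
    """
    return [base_url + item["name"] + ".jpg" for item in info.get("images", [])]


# Offline demonstration with a made-up payload (no network access needed).
sample_info = {"images": [{"name": "514793_01"}, {"name": "514793_02"}]}
for image_url in build_image_urls(sample_info):
    print(image_url)
```

With the real endpoint filled in, `build_image_urls(fetch_image_info(product_id_from_url(url)))` would give the list of image URLs, which can then be passed to `urllib.request.urlretrieve` as in saveImage(). Requesting the JSON directly avoids waiting for the browser to render the image carousel.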