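On the Myntra website I want to extract the specifications and "Complete the look", which only become visible after clicking "show more". I wrote the following code for this: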
import pandas as pd
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

url = 'https://www.myntra.com/kurtas/jompers/jompers-men-yellow-printed-straight-kurta/11226756/buy'
df = pd.DataFrame(columns=['name','title','price','description','Size & fit','Material & care', 'Complete the look'])
metadata = dict.fromkeys(['name','title','price','description','Size & fit','Material & care', 'Complete the look'])
driver = webdriver.Chrome('chromedriver')
specs = dict()
for i in range(1): #len(links)
    driver.get(url)
    try:
        metadata['title'] = driver.find_element_by_class_name('pdp-title').get_attribute("innerHTML")
        metadata['name'] = driver.find_element_by_class_name('pdp-name').get_attribute("innerHTML")
        metadata['price'] = driver.find_element_by_class_name('pdp-price').find_element_by_xpath('./strong').get_attribute("innerHTML")
        metadata['description'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[8]/div/div[1]/p').text
        #metadata['Specifications'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[1]/div[1]').text
        # Note: find_element_* raises NoSuchElementException when nothing matches; it never returns a falsy value
        if driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[2]'):
            print('yes')
            element = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[2]')
            element.click()
        # Collect key/value rows until a row index no longer exists
        for i in range(1,20):
            try:
                specs[driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[{}]/div[1]'.format(i)).text] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[{}]/div[2]'.format(i)).text
            except:
                break
        metadata['Complete the look'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[8]/div/div[4]/div[2]/div/p/p').text
    except NoSuchElementException:
        pass
    df = df.append(metadata, ignore_index=True)
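I get 'yes' in the output, which I take to mean that the "show more" option was clicked, but the 'Complete the look' column of my dataframe contains None. How can I get the details hidden behind "show more", which has the following markup: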
<div class="index-sizeFitDesc">
<h4 class="index-sizeFitDescTitle index-product-description-title" style="padding-bottom: 12px;">Specifications</h4>
<div class="index-tableContainer">
<div class="index-row">
<div class="index-rowKey">Sleeve Length</div>
<div class="index-rowValue">Long Sleeves</div>
</div><div class="index-row">
<div class="index-rowKey">Shape</div>
<div class="index-rowValue">Straight</div>
</div><div class="index-row">
<div class="index-rowKey">Neck</div>
<div class="index-rowValue">Mandarin Collar</div>
</div><div class="index-row">
<div class="index-rowKey">Print or Pattern Type</div>
<div class="index-rowValue">Geometric</div>
</div><div class="index-row">
<div class="index-rowKey">Design Styling</div>
<div class="index-rowValue">Regular</div></div>
<div class="index-row">
<div class="index-rowKey">Slit Detail</div>
<div class="index-rowValue">Side Slits</div>
</div><div class="index-row">
<div class="index-rowKey">Length</div>
<div class="index-rowValue">Above Knee</div>
</div><div class="index-row">
<div class="index-rowKey">Hemline</div>
<div class="index-rowValue">Curved</div></div></div>
<div class="index-showMoreText">See More</div></div>
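I did not read through all of the code you wrote, but I did try clicking "See More" myself, and you can probably slot the same approach into your existing code.
We have to scroll to that particular element first, so that Selenium knows exactly where the element is. I then used a JavaScript .click() to click "See More".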
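A minimal sketch of those two steps, not necessarily the code originally posted with this answer. It assumes the toggle can be found by the class index-showMoreText shown in the question's markup and that it is the first element with that class on the page; the helper name click_see_more is made up for illustration. The locator strategy is passed as the plain string "css selector", which is the value of Selenium's By.CSS_SELECTOR, so the helper depends only on the driver object it is given.

```python
# Value of selenium.webdriver.common.by.By.CSS_SELECTOR, written as a literal
# so this helper needs nothing but the driver object passed in.
CSS = "css selector"


def click_see_more(driver, selector=".index-showMoreText"):
    """Scroll the 'See More' toggle into view, then click it via JavaScript.

    `selector` is an assumption taken from the markup in the question; it
    matches the first element with that class on the page.
    """
    toggle = driver.find_element(CSS, selector)
    # Step 1: bring the element into the viewport.
    driver.execute_script("arguments[0].scrollIntoView({block: 'center'});", toggle)
    # Step 2: click through JavaScript, which still works when another
    # element overlaps the toggle and a native click would be intercepted.
    driver.execute_script("arguments[0].click();", toggle)
    return toggle


# Usage with a real browser (needs selenium and a Chrome driver installed):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get(url)          # `url` as defined in the question
#   click_see_more(driver)
```

After the click, the extra rows are part of the DOM and can be read with the usual find_element calls.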
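Note that the content you are after is a combination of several sections, so it is best to extract the details from those sections one at a time.
It is also better to locate the elements with relative XPaths.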
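To illustrate the point about relative XPaths: the specification rows can be located by class name, and each key and value can then be found relative to its own row, with no dependence on how deeply the block is nested in the page. The sketch below shows this offline on a trimmed copy of the markup from the question, using the limited XPath subset in Python's xml.etree.ElementTree; SAMPLE is a made-up name, and the approach only works here because this fragment happens to be well-formed XML. With Selenium, the same expressions would be passed to find_elements and find_element using the XPath locator strategy.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question (two rows kept for brevity).
SAMPLE = """<div class="index-sizeFitDesc">
  <h4 class="index-sizeFitDescTitle index-product-description-title">Specifications</h4>
  <div class="index-tableContainer">
    <div class="index-row">
      <div class="index-rowKey">Sleeve Length</div>
      <div class="index-rowValue">Long Sleeves</div>
    </div>
    <div class="index-row">
      <div class="index-rowKey">Neck</div>
      <div class="index-rowValue">Mandarin Collar</div>
    </div>
  </div>
  <div class="index-showMoreText">See More</div>
</div>"""

root = ET.fromstring(SAMPLE)

specs = {}
# Find every row anywhere below the root, whatever its nesting depth.
for row in root.findall(".//div[@class='index-row']"):
    # Locate key and value relative to the current row only.
    key = row.find("./div[@class='index-rowKey']")
    value = row.find("./div[@class='index-rowValue']")
    if key is not None and value is not None:
        specs[key.text.strip()] = value.text.strip()

print(specs)  # {'Sleeve Length': 'Long Sleeves', 'Neck': 'Mandarin Collar'}
```

Compared with absolute paths such as //*[@id="mountRoot"]/div/div/..., these expressions keep working when the page adds or removes a wrapper div.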