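<p>The {<cd1>} in {<cd2>} is a combination of several parts.</p>
<p>It is better to extract the details from each of these parts one at a time.</p>
<p>It is also better to locate the elements with <code>relative xpaths</code> rather than absolute ones.</p>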
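<p>To illustrate why relative paths tend to hold up better, here is a minimal offline sketch. It does not touch the real site: <code>build_page</code> and <code>read_specs</code> are hypothetical helpers, the tiny page layout is invented for the demonstration (only the class names are borrowed from the code in this answer), and it uses the standard library's <code>xml.etree.ElementTree</code>, which supports only a subset of XPath.</p>

```python
import xml.etree.ElementTree as ET


def build_page(wrap_in_extra_div):
    """Build a tiny stand-in for a product page; this layout is hypothetical."""
    root = ET.Element("main")
    parent = ET.SubElement(root, "div") if wrap_in_extra_div else root
    block = ET.SubElement(parent, "div", {"class": "index-sizeFitDesc"})
    row = ET.SubElement(block, "div", {"class": "index-row"})
    ET.SubElement(row, "div", {"class": "index-rowKey"}).text = "Sleeve Length"
    ET.SubElement(row, "div", {"class": "index-rowValue"}).text = "Long Sleeves"
    return root


def read_specs(root):
    """Find the block by class, then read each row relative to its own node."""
    specs = {}
    for row in root.findall(".//div[@class='index-sizeFitDesc']/div"):
        key = row.find("./div[@class='index-rowKey']").text
        value = row.find("./div[@class='index-rowValue']").text
        specs[key] = value
    return specs


old_layout = build_page(wrap_in_extra_div=False)
new_layout = build_page(wrap_in_extra_div=True)

# A position-based path written against the old layout stops matching
# as soon as one wrapper element is added.
positional = "./div/div/div[@class='index-rowKey']"
print(old_layout.find(positional) is not None)
print(new_layout.find(positional) is not None)

# The class-anchored, relative lookups return the same data for both layouts.
print(read_specs(old_layout))
print(read_specs(new_layout))
```

<p>The same idea carries over to Selenium: anchor on a stable attribute such as a class name, then use <code>./...</code> paths from the element you already hold.</p>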
<pre><code>from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

# Note: the find_element_by_* helpers below belong to the Selenium 3 API and were
# removed in Selenium 4.3; newer versions use driver.find_element(By.XPATH, ...).

url = 'https://www.myntra.com/kurtas/jompers/jompers-men-yellow-printed-straight-kurta/11226756/buy'
# df = pd.DataFrame(columns=['name','title','price','description','Size & fit','Material & care', 'Complete the look'])
metadata = dict.fromkeys(['name','title','price','description','Size & fit','Material & care','Specifications', 'Complete the look'])
driver = webdriver.Chrome('chromedriver')
specification = []  # list of [key, value] pairs from the specifications table
for i in range(1):  # len(links)
    driver.get(url)
    try:
        metadata['title'] = driver.find_element_by_class_name('pdp-title').get_attribute("innerHTML")
        metadata['name'] = driver.find_element_by_class_name('pdp-name').get_attribute("innerHTML")
        metadata['price'] = driver.find_element_by_class_name('pdp-price').find_element_by_xpath('./strong').get_attribute("innerHTML")
        # Details were extracted even without scrolling, but it would be better to scroll down.
        driver.execute_script("arguments[0].scrollIntoView(true);", driver.find_element_by_xpath("//div[@class='pdp-productDescriptorsContainer']"))
        metadata['description'] = driver.find_element_by_xpath("//p[@class='pdp-product-description-content']").text
        # metadata['Specifications'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[1]/div[1]').text
        # find_element_by_xpath raises NoSuchElementException when nothing matches,
        # in which case the outer except block reports it.
        if driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[2]'):
            print('yes')
            element = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[2]')
            element.click()
        metadata['Size & fit'] = driver.find_element_by_xpath("//h4[contains(text(),'Size')]/following-sibling::p").text
        metadata['Material & care'] = driver.find_element_by_xpath("//h4[contains(text(),'Material')]/following-sibling::p").text
        # from Sleeve Length to Hemline
        specn1 = driver.find_elements_by_xpath("//div[@class='index-sizeFitDesc']/div[1]/div")
        for spec in specn1:
            key = spec.find_element_by_xpath("./div[@class='index-rowKey']").text
            value = spec.find_element_by_xpath("./div[@class='index-rowValue']").text
            specification.append([key, value])
        # from Colour Family to Occasion
        specn2 = driver.find_elements_by_xpath("//div[@class='index-sizeFitDesc']/div[2]/div[1]/div")
        for spec in specn2:
            key = spec.find_element_by_xpath("./div[@class='index-rowKey']").text
            value = spec.find_element_by_xpath("./div[@class='index-rowValue']").text
            specification.append([key, value])
        metadata['Specifications'] = specification
        metadata['Complete the look'] = driver.find_element_by_xpath("//h4[contains(text(),'Complete')]/following-sibling::p").text
        # metadata['Complete the look'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[8]/div/div[4]/div[2]/div/p/p').text
    except Exception as e:
        print(e)
    for key, value in metadata.items():
        print(f"{key} : {value}")
    # df = df.append(metadata, ignore_index=True)
</code></pre>
<pre><code>yes
name : Men Yellow Printed Straight Kurta
title : Jompers
price : Rs. 892
description : Yellow printed straight kurta, has a mandarin collar, long sleeves, straight hem, and side slits
Size & fit : The model (height 6') is wearing a size M
Material & care : Material: Cotton
Hand Wash
Specifications : [['Sleeve Length', 'Long Sleeves'], ['Shape', 'Straight'], ['Neck', 'Mandarin Collar'], ['Print or Pattern Type', 'Solid'], ['Design Styling', 'Regular'], ['Slit Detail', 'Side Slits'], ['Length', 'Knee Length'], ['Hemline', 'Straight'], ['Colour Family', 'Bright'], ['Weave Pattern', 'Regular'], ['Weave Type', 'Machine Weave'], ['Occasion', 'Daily']]
Complete the look : Sport this classic kurta from Jompers this season. Achieve a comfortably chic look for your next dinner party or family outing when you team this yellow piece with slim trousers and minimal flair.
</code></pre>
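<p>If the result is meant to end up in a table, as the commented-out <code>df</code> lines hint, the nested <code>Specifications</code> list may be awkward to store in a single cell. One possible way to flatten it is sketched below. The helper <code>flatten</code> and the <code>spec: </code> column prefix are assumptions made for illustration, and the sample record reuses only a few of the values printed above.</p>

```python
# Sample record with values copied from the output above (subset only).
metadata = {
    "name": "Men Yellow Printed Straight Kurta",
    "title": "Jompers",
    "price": "Rs. 892",
    "Specifications": [
        ["Sleeve Length", "Long Sleeves"],
        ["Shape", "Straight"],
        ["Neck", "Mandarin Collar"],
    ],
}


def flatten(record):
    """Return a copy in which each specification pair becomes its own column."""
    flat = {k: v for k, v in record.items() if k != "Specifications"}
    # "or []" covers records where the specifications were never filled in (None).
    for key, value in record.get("Specifications") or []:
        flat[f"spec: {key}"] = value
    return flat


row = flatten(metadata)
for key, value in row.items():
    print(f"{key} : {value}")
```

<p>Rows built this way can be collected in a list, one per product, and passed to <code>pandas.DataFrame</code> at the end.</p>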