BeautifulSoup does not return child elements

2 answers

User

Answer 1 · edited 2024-09-26 22:13:40

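The data is loaded dynamically via JavaScript, so it is not present in the initial HTML that BeautifulSoup receives. You can use the requests module to simulate the request the page makes.

For example: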

import json
import requests


search_parameters = {
'shapes':  "Round",
'cuts':    "Fair,Good,Very Good,Ideal,Super Ideal",
'colors':  "J,I,H,G,F,E,D",
'clarities':   "SI2,SI1,VS2,VS1,VVS2,VVS1,IF,FL",
'polishes':    "Good,Very Good,Excellent",
'symmetries':  "Good,Very Good,Excellent",
'fluorescences':   "Very Strong,Strong,Medium,Faint,None",
'min_carat':   "0.25",
'max_carat':  "11.58",
'min_table':   "50.00",
'max_table':   "86.00",
'min_depth':   "46.20",
'max_depth':   "629.00",
'min_price':   "420",
'max_price':   "1258930",
'stock_number':    "",
'row': "0",
'page':    "1",
'requestedDataSize':   "200",
'order_by':    "price",
'order_method':    "asc",
'currency':    "$",
'has_v360_video':  "",
'dedicated':   "",
'sid': "",
'min_ratio':   "1.00",
'max_ratio':   "2.75",
'shipping_day':    "",
'MIN_PRICE':   "420",
'MAX_PRICE':   "1258930",
'MIN_CARAT':   "0.25",
'MAX_CARAT':  "11.58",
'MIN_TABLE':   "45",
'MAX_TABLE':   "86",
'MIN_DEPTH':   "46.2",
'MAX_DEPTH':   "629"
}

data = requests.get('https://www.brilliantearth.com/loose-diamonds/list/', params=search_parameters).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

for d in data['diamonds']:
    print('{:<30} {:<15} {}'.format(d['title'], d['cut'], d['price']))

Prints:

0.30 Carat Round Diamond       Very Good       420
0.30 Carat Round Diamond       Very Good       420
0.30 Carat Round Diamond       Ideal           430
0.30 Carat Round Diamond       Ideal           430
0.30 Carat Round Diamond       Good            430
0.30 Carat Round Diamond       Ideal           430
0.30 Carat Round Diamond       Very Good       430
0.25 Carat Round Diamond       Super Ideal     430
0.30 Carat Round Diamond       Very Good       430
0.32 Carat Round Diamond       Ideal           430

... and so on.
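
The response handling above can be made a little more defensive. The following is a minimal sketch, assuming the endpoint still returns a JSON object with a `diamonds` list whose items carry `title`, `cut`, and `price` keys as in the output above. The helper name `format_diamonds` and the `sample` payload are hypothetical and only illustrate the idea; they are not part of requests or of the site's API.

```python
def format_diamonds(payload):
    """Return one formatted line per diamond in a decoded JSON payload.

    Assumes the structure seen in the output above:
    {"diamonds": [{"title": ..., "cut": ..., "price": ...}, ...]}.
    Missing keys are rendered as empty strings instead of raising KeyError.
    """
    lines = []
    for d in payload.get("diamonds", []):
        lines.append("{:<30} {:<15} {}".format(
            d.get("title", ""), d.get("cut", ""), d.get("price", "")))
    return lines


# Illustrative payload mirroring two of the rows printed above (not live data).
sample = {
    "diamonds": [
        {"title": "0.30 Carat Round Diamond", "cut": "Very Good", "price": 420},
        {"title": "0.25 Carat Round Diamond", "cut": "Super Ideal", "price": 430},
    ]
}

for line in format_diamonds(sample):
    print(line)
```

With a live response, the same helper would be fed the decoded JSON, for example `format_diamonds(data)` using the `data` variable from the answer's code.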

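User

Answer 2 · edited 2024-09-26 22:13:40

You can use selenium to render the page in a real browser and then parse the resulting HTML with BeautifulSoup. You can try: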

from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Firefox()
driver.get('https://www.brilliantearth.com/design-your-own-engagement-ring/?sid=3755106&dc=')

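html = driver.page_source
driver.quit()  # close the browser once the rendered HTML has been captured
soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly to avoid a warning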


rows = soup.find_all("div", {"id": "diamonds_search_table"})
print(rows)

You will get all the rows, as shown below:

[<div class="search-table" id="diamonds_search_table" style="position: relative; height: 34000px;">
<div class="inner item" data-have="true" data-position="0" style="position: absolute; width: 100%; height: 34px;top:0px;"><a class="td-n2" href="/rings/cyorings/view_diamond/9361809/?sid=3755106&amp;first=diamond&amp;show_diamond_tab=true"></a><table border="0" cellpadding="0" cellspacing="0" class="table-striped table-hover search-result-table" width="100%"><tbody><tr class="search-item"><td data-id="9361809" onclick="dtl.stop_jump();" scope="col" width="7%"><div class="checkbox checkbox-ty4"><label><input class="hidden"/><span class="sr-only">checkbox</span><i class="icons-checkbox"></i></label></div></td><td scope="col" width="9%">Round</td><td scope="col" width="9%">0.30</td><td scope="col" width="8%">H</td><td scope="col" width="8%">SI2</td><td scope="col" width="12%">Very Good</td><td scope="col" width="8%">GIA</td><td scope="col" width="12%">Botswana Sort</td><td class="width_ratio_hide" scope="col" width="8%">1</td><td scope="col" width="10%">$420</td><td scope="col" width="7%"><span class="view">View</span></td></tr></tbody></table></div><div class="inner item" data-have="true" data-position="34" style="position: absolute; width: 100%; height: 34px;top:34px;"><a class="td-n2" href="/rings/cyorings/view_diamond/9391074/?sid=3755106&amp;first=diamond&amp;show_diamond_tab=true"></a><table border="0" cellpadding="0" cellspacing="0" class="table-striped table-hover search-result-table" width="100%"><tbody><tr class="search-item"><td data-id="9391074"


and so on...........]
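
Since the question is about reaching the child elements, here is a sketch of pulling the cell text out of each row once the rendered HTML is available. It assumes the markup matches the output above (a `div` with id `diamonds_search_table` containing `tr` elements with class `search-item`). The `sample_html` string is a shortened, hand-trimmed copy of that output, and `extract_rows` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Shortened copy of the first row from the output above (attributes trimmed).
sample_html = """
<div class="search-table" id="diamonds_search_table">
  <div class="inner item" data-position="0">
    <table><tbody><tr class="search-item">
      <td data-id="9361809"><span class="sr-only">checkbox</span></td>
      <td>Round</td><td>0.30</td><td>H</td><td>SI2</td><td>Very Good</td>
      <td>GIA</td><td>Botswana Sort</td><td>1</td><td>$420</td>
      <td><span class="view">View</span></td>
    </tr></tbody></table>
  </div>
</div>
"""


def extract_rows(html):
    """Return the text of the data cells for every search-result row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("div", id="diamonds_search_table")
    if table is None:
        # The container is missing, e.g. the page was not rendered by a browser.
        return []
    rows = []
    for tr in table.find_all("tr", class_="search-item"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        # Drop the leading checkbox cell and the trailing "View" cell.
        rows.append(cells[1:-1])
    return rows


for row in extract_rows(sample_html):
    print(row)
```

In the selenium answer, the same helper could be called on `driver.page_source` instead of `sample_html`; the column order is taken from the output above and may differ if the site changes its layout.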
