我想迭代一个beautifulsoup对象,该对象根据找到的与HTML标记匹配的元素的数量更改长度
driver.get('https://www.inspection.gc.ca/food-recall-warnings-and-allergy-alerts/2021-02-10/eng/1613010591343/1613010596418')
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'html.parser')
recall_details = soup.find('table', class_ = 'table table-bordered table-condensed')
recalled_products = recall_details.find_all('td')
recalled_products
输出:
[<td>One Ocean</td>,
<td>Sliced Smoked Wild Sockeye Salmon</td>,
<td>300 g</td>,
<td>6 25984 00005 3</td>,
<td>11253</td>]
我想迭代每个td元素并将其附加到如下列表中:
brands = []
products = []
sizes = []
upcs = []
codes = []
brand = recalled_products[0].text
product = recalled_products[1].text
size = recalled_products[2].text
upc = recalled_products[3].text
code = recalled_products[4].text
brands.append(brand)
products.append(product)
sizes.append(size)
upcs.append(upc)
codes.append(code)
print(brands)
print(products)
print(sizes)
print(upcs)
print(codes)
输出:
['One Ocean']
['Sliced Smoked Wild Sockeye Salmon']
['300\xa0g']
['6\xa025984\xa000005\xa03']
['11253']
我尝试了以下代码,但没有得到预期的结果。我想我需要某种柜台
for i in range(len(recalled_products)):
brand = recalled_products[i].text
product = recalled_products[i].text
size = recalled_products[i].text
upc = recalled_products[i].text
code = recalled_products[i].text
brands.append(brand)
products.append(product)
sizes.append(size)
upcs.append(upc)
codes.append(code)
print(brands)
print(products)
print(sizes)
print(upcs)
print(codes)
```
Output:
```
['One Ocean', 'Sliced Smoked Wild Sockeye Salmon', '300\xa0g', '6\xa025984\xa000005\xa03', '11253']
['One Ocean', 'Sliced Smoked Wild Sockeye Salmon', '300\xa0g', '6\xa025984\xa000005\xa03', '11253']
['One Ocean', 'Sliced Smoked Wild Sockeye Salmon', '300\xa0g', '6\xa025984\xa000005\xa03', '11253']
['One Ocean', 'Sliced Smoked Wild Sockeye Salmon', '300\xa0g', '6\xa025984\xa000005\xa03', '11253']
['One Ocean', 'Sliced Smoked Wild Sockeye Salmon', '300\xa0g', '6\xa025984\xa000005\xa03', '11253']
提前感谢您提供的任何帮助
这就是我获取标记的方式
印刷品
我确实认为dict是比多个列表更好的数据结构,但当然这取决于您的用例
如果您想这样做,可以如下更改代码:
印刷品
关于数据的问题是从
或
对于二维阵列,您希望使用索引二维阵列
对于B,您希望在迭代中使用一个步骤
在我看来,这似乎需要构建一个电子表格来保存需要存储的数据。您可以使用名为openpyxl的库来执行此操作,然后为品牌、产品、尺寸、UPC和代码创建列。然后将来自beautifulsoup对象的结果存储到电子表格中
相关问题 更多 >
编程相关推荐