Only the first item is returned

import requests
from bs4 import BeautifulSoup

# Webpage connection
html = "https://www.wegochem.com/chemicals/organic-intermediates/supplier-distributor"
r=requests.get(html)
c=r.content
soup=BeautifulSoup(c,"html.parser")

# Grab title-artist classes and store in recordList
wegoList = soup.find_all("tbody")

try:
    for items in wegoList:
        material = items.find("td", {"class": "click_whole_cell",}).get_text().strip()
        cas = items.find("td", {"class": "text-center",}).get_text().strip()
        category = items.find("div", {"class": "text-content short-text",}).get_text().strip()
        print(material,cas,category)
except:
    pass

2 Answers

User

#1 · edited 2024-09-30 08:15:35

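The loop for items in wegoList: iterates over the list of tbody elements and then calls find() on the whole table body, which returns only the first matching cell. Iterate over the tr rows instead and extract the attributes from each row: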

wegoList = soup.find_all("tbody")

for tbody in wegoList:
    # find_all("tr") returns every row of this table body
    for tr in tbody.find_all("tr"):
        material = tr.find("td", {"class": "click_whole_cell"})
        cas = tr.find("td", {"class": "text-center"})
        category = tr.find("div", {"class": "text-content short-text"})

        # Skip rows that lack any of the three cells
        if material is None or cas is None or category is None:
            continue

        # Print inside the loop so that every row is shown, not just the last
        print(material.get_text().strip(), cas.get_text().strip(), category.get_text().strip())
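
To see the difference between find() and find_all() without hitting the live site, here is a minimal sketch on a small inline table. The markup in SAMPLE is a made-up stand-in (an assumption, not the real page); only the class name click_whole_cell is taken from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; only the class name comes from the question
SAMPLE = """
<table><tbody>
  <tr><td class="click_whole_cell">1,6-Hexanediol</td></tr>
  <tr><td class="click_whole_cell">2,6-Dichloroaniline</td></tr>
</tbody></table>
"""

tbody = BeautifulSoup(SAMPLE, "html.parser").find("tbody")

# find() stops at the first match inside the whole <tbody>
first_only = tbody.find("td", {"class": "click_whole_cell"}).get_text().strip()

# find_all("tr") yields every row, so find() can then be applied per row
per_row = [
    tr.find("td", {"class": "click_whole_cell"}).get_text().strip()
    for tr in tbody.find_all("tr")
]

print(first_only)
print(per_row)
```

Calling find() on the tbody gives a single cell no matter how many rows there are, which matches the symptom in the question.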

User

#2 · edited 2024-09-30 08:15:35

Try the following code:

import requests
from bs4 import BeautifulSoup

# Webpage connection
html = "https://www.wegochem.com/chemicals/organic-intermediates/supplier-distributor"
r = requests.get(html)
c = r.content
soup = BeautifulSoup(c, "html.parser")

# Collect every <tbody> element on the page
wegoList = soup.find_all("tbody")

for items in wegoList:
    # find_all returns every matching cell, not just the first one.
    # Note that this prints column by column: all materials, then all CAS
    # numbers, then all categories.
    material = items.find_all("td", {"class": "click_whole_cell"})
    for i in material:
        print(i.get_text().strip())

    cas = items.find_all("td", {"class": "text-center"})
    for i in cas:
        print(i.get_text().strip())

    category = items.find_all("div", {"class": "text-content short-text"})
    for i in category:
        print(i.get_text().strip())

Updated code:

^{pr2}$
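
The updated listing is not reproduced at this point. As a rough sketch of one way to get one line per row (not necessarily what this answer used), the function below collects a (material, cas, category) tuple for each complete row. The name extract_rows and the SAMPLE markup are hypothetical; only the class names come from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that mirrors the class names from the question
SAMPLE = """
<table><tbody>
  <tr>
    <td class="click_whole_cell">1,6-Hexanediol</td>
    <td class="text-center">629-11-8</td>
    <td><div class="text-content short-text">Coatings</div></td>
  </tr>
  <tr><td colspan="3">row without the expected cells</td></tr>
  <tr>
    <td class="click_whole_cell">2,6-Dichloroaniline</td>
    <td class="text-center">608-31-1</td>
    <td><div class="text-content short-text">Crop Protection</div></td>
  </tr>
</tbody></table>
"""


def extract_rows(markup):
    """Return one (material, cas, category) tuple per complete table row."""
    soup = BeautifulSoup(markup, "html.parser")
    rows = []
    for tbody in soup.find_all("tbody"):
        for tr in tbody.find_all("tr"):
            cells = (
                tr.find("td", class_="click_whole_cell"),
                tr.find("td", class_="text-center"),
                tr.find("div", class_="short-text"),
            )
            # Skip rows that lack any of the three cells
            if any(cell is None for cell in cells):
                continue
            rows.append(tuple(cell.get_text(strip=True) for cell in cells))
    return rows


for material, cas, category in extract_rows(SAMPLE):
    print(material, cas, category)
```

Passing r.content from the request instead of SAMPLE should give one material, CAS number and category per line, assuming the page still uses these class names.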

Output:

1,2-Dimethylimidazole 1739-84-0 Organic Intermediates, Plastic, Resin & Rubber, Coatings
1,6-Hexanediol 629-11-8 Adhesives & Sealants, Industrial Chemicals, Inks & Digital Inks, Organic Intermediates, Plastic, Resin & Rubber, Coatings
2,2,4-Trimethyl-1,3-Pentanediol Monoisobutyrate 25265-77-4 Inks & Digital Inks, Oil Field Services, Organic Intermediates, Solvents & Degreasers, Coatings
2,6-Dichloroaniline 608-31-1 Agricultural Chemicals, Crop Protection, Organic Intermediates
