BeautifulSoup gives different results on each run

def get_soup(url):
    soup = BeautifulSoup(requests.get(url).content, 'html.parser')
    return soup

def from_soup(soup, myCellsList):
    cellsList = soup.find_all('li', {'class' : 'product clearfix'})
    for i in range (len(cellsList)):
        ottdDict = {}
        ottdDict['Name'] = cellsList[i].h3.text.strip()

1 answer

User

#1 · Posted 2024-10-02 08:22:52

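Browsing through the site's pages, there are subtle differences in the HTML of the different elements. The most reliable way to get the name is to select the outer div and extract the text from its anchor.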

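The code below collects all the information for each product and puts it into dicts, where the keys are labels such as "Tissue", "Cell" and so on, and the values are the corresponding descriptions: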

import requests
from bs4 import BeautifulSoup
from time import sleep


def parse_page(soup):
    # Yield one dict per product on the current page.
    for li in soup.select("ul.product-list li.product.clearfix"):
        # The name is the anchor text inside the outer header div.
        name = li.select_one("div.product-header.clearfix a").text.strip()
        d = {"name": name}
        for div in li.select("div.search-item"):
            k = div.strong.text
            # Key: the label without its trailing colon.
            # Value: the rest of the div's text with whitespace collapsed.
            d[k.rstrip(":")] = " ".join(div.text.replace(k, "", 1).split())
        yield d


def from_soup(url):
    # id of the next-page list item.
    id_ = "#layoutcontent_2_middlecontent_0_threecolumncontent_0_content_ctl00_rptCenterColumn_dcpCenterColumn_0_ctl00_0_productRecords_0_bottomPaging_0_liNextPage_0"

    # All requests happen inside the with block so the session stays open.
    with requests.Session() as s:
        s.headers.update({
            "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36"})

        soup = BeautifulSoup(s.get(url).content, "html.parser")
        yield from parse_page(soup)

        # Get the next-page anchor and loop until it is no longer there.
        nxt = soup.select_one(id_)
        while nxt:
            # Sleep between requests.
            sleep(.5)
            resp = s.get(nxt.a["href"])
            soup = BeautifulSoup(resp.content, "html.parser")
            yield from parse_page(soup)
            # Look up the next-page anchor on the page just fetched.
            nxt = soup.select_one(id_)
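
The key/value cleanup in the inner loop of the code above can be tried without touching the network. The sketch below applies the same string operations to a made-up block of text. The helper name split_label and the sample label and description are assumptions for illustration, not values taken from the site.

```python
def split_label(label, block_text):
    """Return (key, value) built the same way as each dict entry above.

    label      -- text of the <strong> tag, e.g. "Tissue:"
    block_text -- full text of the enclosing div, label included
    """
    # Drop the first occurrence of the label, then collapse whitespace runs.
    value = " ".join(block_text.replace(label, "", 1).split())
    # Trailing colons are stripped from the key only.
    return label.rstrip(":"), value


# Hypothetical sample resembling the text of one "search-item" div.
key, value = split_label("Tissue:", "Tissue:\n    connective   tissue\n")
print(key, "->", value)
```

Passing 1 as the count to replace matters when the label text happens to reappear inside the description: only the leading label is removed.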

After running from_soup over all the pages, you will see 1211 dicts containing all the data.
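
The pagination pattern used here (process a page, look up the next-page link, repeat until the link is missing) can be sketched without HTTP. In this sketch PAGES and crawl are hypothetical stand-ins: a dict plays the role of the site, and each entry's "next" key plays the role of the next-page anchor.

```python
# Hypothetical in-memory "site": each page lists items and names the next page.
PAGES = {
    "/p1": {"items": ["a", "b"], "next": "/p2"},
    "/p2": {"items": ["c"], "next": "/p3"},
    "/p3": {"items": ["d", "e"], "next": None},
}


def crawl(start, fetch=PAGES.get):
    """Yield every item, following 'next' links until none is left."""
    url = start
    while url:
        page = fetch(url)
        yield from page["items"]
        # Re-read the link from the page just fetched. If this value were
        # never updated, the loop would keep requesting the same page.
        url = page["next"]


items = list(crawl("/p1"))
print(len(items), items)
```

Because crawl is a generator, nothing is fetched until it is iterated, and wrapping it in list() is what drives it through every page.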
