Error when parsing a web page with BeautifulSoup4

import requests
from bs4 import BeautifulSoup

link = "https://www.amazon.in/Power-Banks/b/ref=nav_shopall_sbc_mobcomp_powerbank?ie=UTF8&node=6612025031"

def amazon(url):
    sourcecode = requests.get(url)
    sourcecode_text = sourcecode.text
    soup = BeautifulSoup(sourcecode_text)

    for link in soup.findALL('a', {'class': 'a-link-normal aok-block a-text-normal'}):
        href = link.get('href')
        print(href)

amazon(link)

2 Answers

User

#1 · edited 2024-10-01 00:31:18

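You can add request headers, such as a browser-like User-Agent. Then call find_all('a', href=True) to get every anchor that has an href attribute, and read the href from each one: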

import requests
from bs4 import BeautifulSoup

link = "https://www.amazon.in/Power-Banks/b/ref=nav_shopall_sbc_mobcomp_powerbank?ie=UTF8&node=6612025031"

def amazon(url):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}

    sourcecode = requests.get(url, headers=headers)
    sourcecode_text = sourcecode.text
    soup = BeautifulSoup(sourcecode_text, 'html.parser')

    for link in soup.find_all('a', href=True):
        href = link.get('href')
        print(href)

amazon(link)
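
To try the href=True filter without hitting the network, here is a minimal sketch that runs on an inline HTML string. SAMPLE_HTML and extract_hrefs are hypothetical names made up for this illustration; they are not part of the original answer.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page.
SAMPLE_HTML = """
<a href="/item/1">First</a>
<a name="anchor-only">No href here</a>
<a href="/item/2">Second</a>
"""


def extract_hrefs(html):
    """Return the href of every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    # href=True keeps only tags where the attribute is present.
    return [a["href"] for a in soup.find_all("a", href=True)]


hrefs = extract_hrefs(SAMPLE_HTML)
print(hrefs)
```

The anchor without an href is skipped, so the loop never has to deal with link.get('href') returning None.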

User

#2 · edited 2024-10-01 00:31:18

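The problem in your code is the method name findALL. The soup object has no findALL method, so the lookup does not give you a method, and calling the result fails. In new code use find_all; findAll (lowercase double l) should also work. Hope this clears it up.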

import requests
from bs4 import BeautifulSoup

link = "https://www.amazon.in/Power-Banks/b/ref=nav_shopall_sbc_mobcomp_powerbank?ie=UTF8&node=6612025031"


def amazon(url):
    sourcecode = requests.get(url)
    sourcecode_text = sourcecode.text
    soup = BeautifulSoup(sourcecode_text, "html.parser")
    # Pass "html.parser" as the second argument to avoid the parser warning.
    # Use soup.find_all in new code; soup.findAll should also work.
    for link in soup.find_all('a', {'class': 'a-link-normal aok-block a-text-normal'}):
        href = link.get('href')
        print(href)

amazon(link)
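
For anyone wondering why the typo fails the way it does, here is a small offline sketch. It relies on bs4 treating an unknown attribute on a soup object as a search for a tag with that name, so soup.findALL behaves like soup.find("findALL") and comes back as None. The HTML string and the variable names are made up for the demonstration.

```python
from bs4 import BeautifulSoup

# Hypothetical one-link document, just enough to show the behaviour.
HTML = '<a class="a-link-normal" href="/p1">Item</a>'
soup = BeautifulSoup(HTML, "html.parser")

# Unknown attribute: bs4 looks for a <findALL> tag and finds nothing.
misspelled = soup.findALL

# Calling that None value is what raises the error.
try:
    soup.findALL("a")
    call_error = None
except TypeError as exc:
    call_error = exc

# The correctly spelled method returns the matching tags.
links = soup.find_all("a", {"class": "a-link-normal"})

print(misspelled)
print(type(call_error).__name__)
print([a["href"] for a in links])
```

So the traceback complains about calling a NoneType object instead of saying that findALL does not exist, which makes the typo easy to miss.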
