Parsing HTML tags by class and href attribute with Beautiful Soup

maxx = soup.findAll("href", {"class: "yil-biz-ttl"})
------------------------------------------------------------
   File "<ipython console>", line 1
     maxx = soup.findAll("href", {"class: "yil-biz-ttl"})
     ^
SyntaxError: invalid syntax

3 answers

User

#1 · edited 2024-09-27 07:29:25

soup.findAll('a', {'class': 'yil-biz-ttl'})[0]['href']

To find all such links:

for link in soup.findAll('a', {'class': 'yil-biz-ttl'}):
    try:
        print(link['href'])
    except KeyError:
        pass
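
As an alternative to the try/except above, Tag.get() returns None when an attribute is missing, so the KeyError never has to be caught. The following is only a minimal sketch: the sample markup and the variable names are made up for illustration, and it uses find_all, the bs4 spelling of findAll.

```python
from bs4 import BeautifulSoup  # $ pip install beautifulsoup4

# Hypothetical sample markup; the second anchor deliberately has no href.
html = """
<a class="yil-biz-ttl" href="/biz/one">One</a>
<a class="yil-biz-ttl">No link</a>
<a class="other" href="/biz/two">Two</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Tag.get() returns None instead of raising KeyError for a missing attribute.
hrefs = [link.get("href") for link in soup.find_all("a", {"class": "yil-biz-ttl"})]

# Keep only the anchors that actually carry an href.
found = [h for h in hrefs if h is not None]
print(found)
```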

User

#2 · edited 2024-09-27 07:29:25

The closing quote after "class is missing:

 maxx = soup.findAll("href", {"class: "yil-biz-ttl"})

It should be

 maxx = soup.findAll("href", {"class": "yil-biz-ttl"})

Also, I don't think you can search for an attribute such as href this way; I think you need to search for the tag:

 maxx = [link['href'] for link in soup.findAll("a", {"class": "yil-biz-ttl"})]
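
To illustrate the point about searching for the tag rather than the attribute: the first positional argument of findAll / find_all is a tag name, so passing "href" looks for <href> elements and matches nothing. A small sketch, assuming a made-up one-line document:

```python
from bs4 import BeautifulSoup  # $ pip install beautifulsoup4

# Hypothetical sample markup for illustration only.
html = '<a class="yil-biz-ttl" href="/biz/one">One</a>'
soup = BeautifulSoup(html, "html.parser")

# "href" is treated as a tag name, so this searches for <href> elements.
by_attr_name = soup.find_all("href", {"class": "yil-biz-ttl"})

# Searching for the <a> tag and then reading its href attribute works.
by_tag_name = soup.find_all("a", {"class": "yil-biz-ttl"})
urls = [link["href"] for link in by_tag_name]

print(len(by_attr_name), urls)
```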

User

#3 · edited 2024-09-27 07:29:25

To find all <a/> elements with the CSS class "yil-biz-ttl" that have an href attribute with any value:

from bs4 import BeautifulSoup  # $ pip install beautifulsoup4

soup = BeautifulSoup(HTML, "html.parser")  # HTML holds the page markup
for link in soup("a", "yil-biz-ttl", href=True):
    print(link['href'])

At the moment, none of the other answers meets this requirement.
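
The href=True filter can be checked on a document in which one matching anchor has no href at all. The sketch below uses made-up markup; it also shows an equivalent CSS selector through soup.select(), which is offered here as an alternative and is not part of the answer above.

```python
from bs4 import BeautifulSoup  # $ pip install beautifulsoup4

# Hypothetical sample markup; the second anchor has the class but no href.
html = """
<a class="yil-biz-ttl" href="/biz/one">One</a>
<a class="yil-biz-ttl">No link</a>
<a class="other" href="/biz/two">Two</a>
"""

soup = BeautifulSoup(html, "html.parser")

# href=True keeps only tags on which the href attribute is present.
with_href = [a["href"] for a in soup.find_all("a", class_="yil-biz-ttl", href=True)]

# The same filter written as a CSS selector: tag, class, attribute presence.
via_css = [a["href"] for a in soup.select("a.yil-biz-ttl[href]")]

print(with_href, via_css)
```

Because the filter guarantees that href exists, indexing with a["href"] needs no KeyError handling.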
