Cutting a specific part out of a string and removing the HTML tags [web scraping]

Posted 2024-10-16 17:15:53

Male | A programmer who enjoys writing Python code.

This is the result I get:

<a class="ellipsis" href="https://www.link.com" title="Name of the hyperlink ">Name of the hyperlink </a>

I only want to extract the link into one variable (e.g. link) and the name into another (e.g. name). Here is my code so far:

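# Assumption: get is requests.get; ua, base_url and search are defined elsewhere (not shown).
from requests import get
from bs4 import BeautifulSoup


def supa(linko):
    # Fetch the page and parse it with Python's built-in HTML parser.
    r = get(linko, headers=ua)
    return BeautifulSoup(r.content, 'html.parser')


soup = supa(base_url + search)
the_icons = soup.find_all('div', class_='caption')

for icon in the_icons:
    # find() returns the whole <a> Tag, so print() shows the full markup.
    name = icon.find('a', class_='ellipsis')

    print(name)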


1 answer

User

#1 · Posted 2024-10-16 17:15:53

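You only need to index the tag returned by find with ['href'] to get the link; the tag's text gives you the name:

for icon in the_icons:
    anchor = icon.find('a', class_='ellipsis')
    link = anchor['href']               # URL from the href attribute
    name = anchor.get_text(strip=True)  # visible text, without the HTML tags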
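For anyone who wants to try this without hitting a live site, here is a minimal self-contained sketch. It assumes Beautiful Soup 4 is installed and runs against the markup quoted in the question. The wrapping `div.caption` elements, the second div with no anchor, and the helper name `extract_links` are hypothetical and only there for illustration. It also guards against the case where `find` matches nothing, since indexing `None` would raise a `TypeError`.

```python
from bs4 import BeautifulSoup

# Sample markup: the <a> tag is copied from the question; the surrounding
# div.caption wrappers (and the empty second one) are assumed for illustration.
html = '''
<div class="caption">
  <a class="ellipsis" href="https://www.link.com" title="Name of the hyperlink ">Name of the hyperlink </a>
</div>
<div class="caption">no anchor here</div>
'''


def extract_links(markup):
    """Return (link, name) pairs for each a.ellipsis found inside a div.caption."""
    soup = BeautifulSoup(markup, 'html.parser')
    results = []
    for icon in soup.find_all('div', class_='caption'):
        anchor = icon.find('a', class_='ellipsis')
        if anchor is None:
            # find() returns None when nothing matches, so skip this div.
            continue
        link = anchor.get('href')           # None instead of KeyError if href is missing
        name = anchor.get_text(strip=True)  # text content with outer whitespace stripped
        results.append((link, name))
    return results


pairs = extract_links(html)
for link, name in pairs:
    print(link, name)
```

Using `anchor.get('href')` rather than `anchor['href']` is a design choice for scraped pages, where an attribute may occasionally be absent; the bracket form is fine when every anchor is known to carry an `href`.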
