I am trying to extract the href with the code shown after the markup below,
while the soup looks like this:
</div>
</div>
</article>
</div>
<div class="listing">
<article class="listing-item image-left" itemscope="" itemtype="https://schema.org/NewsArticle">
<div class="listing-image image-container">
<a class="image page-link" href="/mundo/venezuela/entrevista-con-el-representante-para-los-migrantes-venezolanos-eduardo-stein-425664">
<img alt="" src="/files/image_184_123/uploads/2019/10/22/5daf22f15ed09.jpeg"/>
</a>
</div>
import requests
from bs4 import BeautifulSoup
url = "https://www.eltiempo.com/buscar?q=migrantes+venezolanos"
# Getting the webpage, creating a Response object.
response = requests.get(url)
# Extracting the source code of the page.
data = response.text
# Passing the source code to BeautifulSoup to create a BeautifulSoup object for it.
soup = BeautifulSoup(data, 'lxml')
# Extracting all the <a> tags into a list.
tags = soup.find_all('div')
# Extracting URLs from the attribute href in the <a> tags.
for tag in tags:
    print(tag.get('href'))
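Can anyone help me? All the examples I have found on the internet have the href sitting right next to the <a>, which makes it easier to extract.
Thanks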
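You probably want soup.find_all('a') instead of soup.find_all('div'): the href attribute lives on the <a> tag, so calling tag.get('href') on a <div> returns None. Keep data = response.text, since a requests Response object has no .html attribute. If you only need <a> tags that actually have an href, you can also use soup.find_all('a', href=True) (see "BeautifulSoup getting href").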
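As a rough sketch of the above, here is how it might look against the markup from the question. The HTML is inlined so nothing is fetched over the network. The names BASE_URL, HTML and extract_links are made up for illustration, and the built-in "html.parser" is used in place of 'lxml' only to avoid an extra dependency. It also assumes you want absolute URLs, which is why the relative hrefs are passed through urljoin.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumed base URL of the site, used to resolve relative hrefs.
BASE_URL = "https://www.eltiempo.com"

# Excerpt of the markup from the question, closed off so it is well-formed.
HTML = """
<div class="listing">
<article class="listing-item image-left" itemscope="" itemtype="https://schema.org/NewsArticle">
<div class="listing-image image-container">
<a class="image page-link" href="/mundo/venezuela/entrevista-con-el-representante-para-los-migrantes-venezolanos-eduardo-stein-425664">
<img alt="" src="/files/image_184_123/uploads/2019/10/22/5daf22f15ed09.jpeg"/>
</a>
</div>
</article>
</div>
"""


def extract_links(html, base_url=BASE_URL):
    """Return absolute URLs for every <a> tag that carries an href."""
    soup = BeautifulSoup(html, "html.parser")
    # href=True skips <a> tags without an href attribute.
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]


links = extract_links(HTML)
for link in links:
    print(link)
```

With the live page you would pass response.text to extract_links instead of the inlined HTML string.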