BeautifulSoup: how do I get the full URL from an img src when it contains ../..?

Posted 2024-10-06 11:18:17

Male | A programmer who enjoys writing Python code.

Let's say I'm trying to get the link to a particular image, like this:

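import os
from urllib.parse import urlparse  # Python 2: from urlparse import urlparse
from urllib.request import urlopen

from bs4 import BeautifulSoup

# BeautifulSoup parses markup, not URLs, so fetch the page first
html = urlopen("http://examplesite.com").read()
soup = BeautifulSoup(html, "html.parser")
for image in soup.find_all("img"):
    src = image.get("src", "")   # the raw src attribute
    path = urlparse(src).path    # gets the path
    fn = os.path.basename(path)  # gets the filename

# Let's say the webpage I was scraping had its images like this:
# <img src="../..someimage.jpg" />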

Is there a simple way to get the full URL from this, or do I have to use a regular expression?


1 Answer

User

#1 · Posted 2024-10-06 11:18:17

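Use urljoin, which lives in urllib.parse in Python 3 (urlparse.urljoin in Python 2). It resolves a relative reference such as ../bar against a base URL, so no regular expression is needed:

>>> from urllib.parse import urljoin  # Python 2: from urlparse import urljoin
>>> base_url = "http://example.com/foo/"
>>> urljoin(base_url, "../bar")
'http://example.com/bar'
>>> urljoin(base_url, "/baz")
'http://example.com/baz'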
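
To tie this back to the question, here is a minimal sketch of applying urljoin to every img tag that BeautifulSoup finds. The page URL, the markup, and the helper name absolute_image_urls are made up for illustration; they are not from the original question. It assumes Python 3 and the bs4 package.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def absolute_image_urls(html, page_url):
    """Return an absolute URL for every <img> in html that has a src."""
    soup = BeautifulSoup(html, "html.parser")
    # src=True skips <img> tags that have no src attribute at all.
    return [urljoin(page_url, img["src"]) for img in soup.find_all("img", src=True)]


# Hypothetical page URL and markup, used only for illustration.
PAGE_URL = "http://example.com/a/b/page.html"
HTML = """
<img src="../img/logo.png">
<img src="/static/banner.jpg">
<img src="http://cdn.example.org/pic.gif">
<img alt="no source">
"""

for url in absolute_image_urls(HTML, PAGE_URL):
    print(url)
# http://example.com/a/img/logo.png
# http://example.com/static/banner.jpg
# http://cdn.example.org/pic.gif
```

The base passed to urljoin should be the URL the page was actually served from (after any redirects), since relative paths are resolved against it. Already-absolute src values pass through unchanged.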
