Python: confusion with urljoin

>>> from urllib.parse import urljoin
>>> urljoin('some', 'thing')
'thing'
>>> urljoin('http://some', 'thing')
'http://some/thing'
>>> urljoin('http://some/more', 'thing')
'http://some/thing'
>>> urljoin('http://some/more/', 'thing')  # just a tad / after 'more'
'http://some/more/thing'
>>> urljoin('http://some/more/', '/thing')
'http://some/thing'

2 Answers

User

Answer 1 · edited 2024-09-19 20:56:40

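The best way (for me) to think about this is that the first argument, base, is like the page you are currently on in your browser. The second argument, url, is the href of an anchor on that page. The result is the final URL you would be taken to if you clicked that link.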

>>> urljoin('some', 'thing')
'thing'

This one makes sense given my description, although one would hope that base includes a scheme and a domain.

>>> urljoin('http://some', 'thing')
'http://some/thing'

If you are on a vhost some, and there is an anchor like <a href='thing'>Foo</a>, then the link will take you to http://some/thing

>>> urljoin('http://some/more', 'thing')
'http://some/thing'

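Here we are on some/more. Because there is no trailing slash, more is treated as the current page rather than a directory, so a relative link to thing replaces it and takes us to some/thing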

>>> urljoin('http://some/more/', 'thing') # just a tad / after 'more'
'http://some/more/thing'

Here we are not on some/more, we are on some/more/, which is different. Now our relative link will take us to some/more/thing

>>> urljoin('http://some/more/', '/thing')
'http://some/thing'

And finally, if we are on some/more/ and the href is /thing, the link starts from the root of the host, so you will be taken to some/thing.
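
As a quick check of the browser analogy, the five cases above can be run side by side. This is only a small sketch: the names cases and results are made up for illustration, and the inputs are the same ones used in the question.

```python
from urllib.parse import urljoin

# (base, href) pairs taken from the examples in the question.
# "base" plays the role of the current page, "href" the link on it.
cases = [
    ("some", "thing"),
    ("http://some", "thing"),
    ("http://some/more", "thing"),
    ("http://some/more/", "thing"),
    ("http://some/more/", "/thing"),
]

# Resolve each link against its page.
results = {pair: urljoin(*pair) for pair in cases}

for (base, href), resolved in results.items():
    print(f"{base!r} + {href!r} -> {resolved!r}")
```

The only difference between the third and fourth case is the trailing slash on base, which decides whether more is replaced or kept.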

User

Answer 2 · edited 2024-09-19 20:56:40

urllib.parse.urljoin(base, url)
If url is an absolute URL (starting with http://, https://, ...) or a scheme-relative one (starting with //), the url's host name and/or scheme will be present in the result. For example:

>>> urljoin('https://www.google.com', '//www.microsoft.com')
'https://www.microsoft.com'
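
To make the "host name and/or scheme" part concrete, here is a small sketch comparing a reference that starts with // against one that carries its own scheme. The base URL and the /en-us path are assumptions chosen only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical base page, served over https.
base = "https://www.google.com/search"

# Starts with "//": the host comes from url, the scheme is inherited from base.
scheme_relative = urljoin(base, "//www.microsoft.com/en-us")

# Carries its own scheme: base contributes nothing to the result.
absolute = urljoin(base, "http://www.microsoft.com/en-us")

print(scheme_relative)  # https://www.microsoft.com/en-us
print(absolute)         # http://www.microsoft.com/en-us
```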

Otherwise, urllib.parse.urljoin(base, url) will

Construct a full (“absolute”) URL by combining a “base URL” (base) with another URL (url). Informally, this uses components of the base URL, in particular the addressing scheme, the network location and (part of) the path, to provide missing components in the relative URL.

>>> from urllib.parse import urljoin, urlparse
>>> urlparse('http://a/b/c/d/e')
ParseResult(scheme='http', netloc='a', path='/b/c/d/e', params='', query='', fragment='')
>>> urljoin('http://a/b/c/d/e', 'f')
'http://a/b/c/d/f'
>>> urlparse('http://a/b/c/d/e/')
ParseResult(scheme='http', netloc='a', path='/b/c/d/e/', params='', query='', fragment='')
>>> urljoin('http://a/b/c/d/e/', 'f')
'http://a/b/c/d/e/f'

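It takes the path of the first argument (base), strips everything after the last /, and joins the remainder with the second argument (url).

If url starts with /, it joins the scheme and netloc of base with url and discards the path of base.

>>> urljoin('http://a/b/c/d/e', '/f')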
'http://a/f'
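
The two rules above can be sketched as a few lines of Python. Note that naive_join is a hypothetical helper written for this answer, not how the standard library implements urljoin: it deliberately ignores '.' and '..' segments, queries, fragments, and url values that are absolute or start with //, all of which the real function handles.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def naive_join(base: str, url: str) -> str:
    """Simplified model of urljoin for plain relative paths only (illustrative)."""
    scheme, netloc, path, _query, _fragment = urlsplit(base)
    if url.startswith("/"):
        # Rooted path: keep only the scheme and netloc of base.
        new_path = url
    else:
        # Drop everything after the last "/" in the base path, then append url.
        new_path = path[: path.rfind("/") + 1] + url
    return urlunsplit((scheme, netloc, new_path, "", ""))


# Compare the sketch with the real function on the examples from this answer.
examples = [
    ("http://a/b/c/d/e", "f"),
    ("http://a/b/c/d/e/", "f"),
    ("http://a/b/c/d/e", "/f"),
]
for base, url in examples:
    print(naive_join(base, url), urljoin(base, url))
```

For these simple inputs the sketch and urljoin agree. For anything beyond them, rely on urljoin itself.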
