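<p>Is a regular expression a hard requirement, for example because you need to combine it with an existing regex? If not, the standard library has a simple tool for this:</p>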
<pre><code>from urllib.parse import urlparse
urls = [
'https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429',
'http://www.interactivedynamicvideo.com/',
'http://www.nytimes.com/2007/11/07/movies/07stein.html?_r=0',
'http://evonomics.com/advertising-cannot-maintain-internet-heres-solution/',
'HTTPS://github.com/keppel/pinn',
'Http://phys.org/news/2015-09-scale-solar-youve.html',
'https://iot.seeed.cc',
'http://www.bfilipek.com/2016/04/custom-deleters-for-c-smart-pointers.html',
'http://beta.crowdfireapp.com/?beta=agnipath',
'https://www.valid.ly?param',
'http://css-cursor.techstream.org',
]
domains = [urlparse(url).netloc for url in urls]
print(domains)
</code></pre>
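<p>One thing to be aware of: <code>netloc</code> is the whole authority part of the URL, so it keeps the original letter case and includes any port or credentials. If you only want the bare host name, the <code>hostname</code> attribute of the same parse result may be closer to what you need. A small sketch, using made-up sample URLs that are not from the list above:</p>

```python
from urllib.parse import urlparse

# Hypothetical sample URLs, chosen only to show the difference
samples = [
    'HTTPS://GitHub.com/keppel/pinn',
    'http://user:secret@example.com:8080/path',
    'https://iot.seeed.cc',
]

# netloc: authority as written (case preserved, userinfo and port included)
netlocs = [urlparse(u).netloc for u in samples]

# hostname: lowercased host only, without userinfo or port
hostnames = [urlparse(u).hostname for u in samples]

print(netlocs)
print(hostnames)
```

<p>For plain URLs like the ones in the question the two attributes differ only in letter case, so either should work there.</p>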
<p>That said, a regular expression is faster on this list:</p>
<pre><code>>>> import re
>>> netloc = re.compile(r'^https?://([^/?^]+)', flags=re.I)
>>> %timeit [netloc.match(url).group(1) for url in urls]
5.66 µs ± 97.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit [urlparse(url).netloc for url in urls]
23.3 µs ± 3.68 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
</code></pre>
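<p>Note that the list comprehension in the timing calls <code>.group(1)</code> directly on the match, which raises <code>AttributeError</code> for any string the pattern does not match. If your input might contain such strings, a small wrapper can return <code>None</code> instead. This is only a sketch: <code>extract_netloc</code> is a hypothetical helper name, and the pattern is a slight variation of the one in the timing (it stops at <code>#</code> as well as <code>/</code> and <code>?</code>).</p>

```python
import re

# Variation of the pattern above: the host ends at '/', '?' or '#'
NETLOC_RE = re.compile(r'^https?://([^/?#]+)', flags=re.I)

def extract_netloc(url):
    """Return the authority part of an http(s) URL, or None if it does not match."""
    m = NETLOC_RE.match(url)
    return m.group(1) if m else None

print(extract_netloc('HTTPS://github.com/keppel/pinn'))
print(extract_netloc('https://www.valid.ly?param'))
print(extract_netloc('ftp://example.com/file'))  # not http(s), so no match
```

<p>The extra check costs a little time per URL, so it trades some of the speed advantage for robustness.</p>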