Getting a value from a dynamic URL

Posted 2024-09-25 00:29:56

My URL looks like this:

http://www.example.com/blah/prod/4/x/blah.html

If this page has a subpage, the URL looks like this:

http://www.example.com/blah/prod/4_2343/x/blah.html

That is, /prod/4 is followed by an underscore and then another number.

Likewise, if that subpage has a subpage of its own, the URL will be:

http://www.example.com/blah/prod/4_2343_234/x/blah.html

I need to extract whatever text sits where I have put the ??? below:

/prod/???????/x/blah.html

How would I do this?

2 answers

User

#1 · edited 2024-09-25 00:29:56

Something like this. Match the pattern prod/???/x/blah, where ??? is any string made up of digits and underscores:

import re
# Capture the run of digits and underscores between prod/ and /x/blah
pattern = re.compile(r'prod/([\d_]+)/x/blah')
query   = "http://www.example.com/blah/prod/4_2343_234/x/blah.html"
result  = pattern.search(query).group(1)
print(result)  # 4_2343_234
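
Calling .group(1) directly raises AttributeError when the URL does not match, so one possible variation wraps the search in a function that returns None instead. This is only a sketch: the name extract_prod_id is a hypothetical helper, not part of the original answer, and the test URLs are the three from the question.

```python
import re

# Same pattern as above: digits and underscores between prod/ and /x/blah
PATTERN = re.compile(r'prod/([\d_]+)/x/blah')

def extract_prod_id(url):
    """Hypothetical helper: return the captured segment, or None if no match."""
    match = PATTERN.search(url)
    return match.group(1) if match else None

urls = [
    "http://www.example.com/blah/prod/4/x/blah.html",
    "http://www.example.com/blah/prod/4_2343/x/blah.html",
    "http://www.example.com/blah/prod/4_2343_234/x/blah.html",
]
for url in urls:
    print(extract_prod_id(url))
```

Returning None leaves it to the caller to decide how to treat URLs that do not follow the prod/.../x/blah layout.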

User

#2 · edited 2024-09-25 00:29:56

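from urllib.parse import urlsplit  # Python 3 location of the old urlparse module
url = 'http://www.example.com/blah/prod/4_2343_234/x/blah.html'

# Path segments: ['', 'blah', 'prod', '4_2343_234', 'x', 'blah.html']
urlsplit(url).path.split('/')[3]
# returns '4_2343_234'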
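
The fixed index 3 only works while prod stays at the same depth in the path. A minimal sketch of an alternative, assuming the wanted segment always directly follows a path segment named prod, looks the position up instead. The function name segment_after and its marker parameter are hypothetical, introduced only for illustration.

```python
from urllib.parse import urlsplit

def segment_after(url, marker="prod"):
    """Hypothetical helper: return the path segment right after `marker`, or None."""
    parts = urlsplit(url).path.split('/')
    if marker in parts:
        i = parts.index(marker)
        # Guard against the marker being the last segment of the path
        if i + 1 < len(parts):
            return parts[i + 1]
    return None

print(segment_after('http://www.example.com/blah/prod/4_2343_234/x/blah.html'))
```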
