How do I write a regular expression in Python to remove the timestamp at the end of a URL?


Posted 2024-09-23 04:28:34



I have a text file containing a large number of URLs, but each one has a timestamp at the end, which is redundant for my purposes.

    http://techcrunch.com/2012/02/10/vevo-ceo-tries-to-explain-their-hypocritical-act-of-piracy-at-sundance/)16:55:40
    http://techcrunch.com/2012/04/30/edmodo-hits-7m/)15:18:45

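I think a regular expression in Python could help me get rid of it, but I could also use Python's string split and replace operations to remove the trailing timestamp, so that the output looks like the URLs below: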

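    http://techcrunch.com/2012/02/10/vevo-ceo-tries-to-explain-their-hypocritical-act-of-piracy-at-sundance/
    http://techcrunch.com/2012/04/30/edmodo-hits-7m/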

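My question is: which performs better in terms of space and time, a regular expression or Python's string methods? Or is there some other, better approach?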
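One way to settle the time question empirically is to run both approaches on the same data. The following is a minimal sketch rather than a rigorous benchmark: the function names `strip_with_regex` and `strip_with_rsplit` are made up for illustration, the workload is just the two lines from the question repeated, and it assumes every line ends in `)HH:MM:SS`.

```python
import re
import timeit

# Sample lines from the question, repeated to get a measurable workload.
SAMPLE = [
    "http://techcrunch.com/2012/02/10/vevo-ceo-tries-to-explain-their-hypocritical-act-of-piracy-at-sundance/)16:55:40",
    "http://techcrunch.com/2012/04/30/edmodo-hits-7m/)15:18:45",
] * 1000

# Assumed suffix format: ')' followed by HH:MM:SS at the very end of the line.
TIMESTAMP_RE = re.compile(r"\)\d{2}:\d{2}:\d{2}$")


def strip_with_regex(lines):
    """Remove the trailing ')HH:MM:SS' with a precompiled pattern."""
    return [TIMESTAMP_RE.sub("", line) for line in lines]


def strip_with_rsplit(lines):
    """Keep everything before the last ')' using a plain string method."""
    return [line.rsplit(")", 1)[0] for line in lines]


if __name__ == "__main__":
    for func in (strip_with_regex, strip_with_rsplit):
        seconds = timeit.timeit(lambda: func(SAMPLE), number=20)
        print(f"{func.__name__}: {seconds:.3f}s for 20 runs")
```

Both functions build one new string per line, so their memory use should be comparable. The timings depend on the machine and the Python version, so it is worth running this on a sample of the real file.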


3 answers

Answer 1 · edited 2024-09-23 04:28:34

Another possibility is:

for line in lines:
    # Keep everything before the last '/', which drops the trailing '/)HH:MM:SS'
    url = line.rsplit('/', 1)[0]
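To make the behaviour concrete, here is a small sketch that applies the same `rsplit` call while streaming through the input line by line, so the whole file never has to be held in memory. The helper name `strip_timestamps` is hypothetical, and `io.StringIO` stands in for an open text file. Note that splitting on the last `/` also drops the URL's trailing slash.

```python
import io


def strip_timestamps(handle):
    """Yield each URL without its trailing '/)HH:MM:SS' part."""
    for line in handle:
        line = line.strip()
        if line:  # skip blank lines
            # Everything before the last '/' is the URL (minus its final slash).
            yield line.rsplit("/", 1)[0]


# io.StringIO stands in for an open text file here.
sample_file = io.StringIO(
    "http://techcrunch.com/2012/02/10/vevo-ceo-tries-to-explain-their-hypocritical-act-of-piracy-at-sundance/)16:55:40\n"
    "http://techcrunch.com/2012/04/30/edmodo-hits-7m/)15:18:45\n"
)
urls = list(strip_timestamps(sample_file))
print(urls)
```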

Answer 2 · edited 2024-09-23 04:28:34

This should be faster than looping over every line:

import re

my_str = "http://techcrunch.com/2012/04/30/edmodo-hits-7m/)15:18:45"
# Capture the URL that precedes the '/)HH:MM:SS' suffix
urls = re.findall(r'([\w./:\d-]+)/\)\d\d:\d\d:\d\d', my_str)
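If the goal is to delete the timestamp rather than extract the URL, `re.sub` with an end-of-line anchor is an alternative that can be run once over the whole text. This is a minimal sketch, assuming the suffix is always `)HH:MM:SS` at the end of a line; the variable names are illustrative.

```python
import re

text = (
    "http://techcrunch.com/2012/02/10/vevo-ceo-tries-to-explain-their-hypocritical-act-of-piracy-at-sundance/)16:55:40\n"
    "http://techcrunch.com/2012/04/30/edmodo-hits-7m/)15:18:45\n"
)

# re.MULTILINE makes '$' match at the end of every line,
# not only at the end of the whole string.
pattern = re.compile(r"\)\d{2}:\d{2}:\d{2}$", re.MULTILINE)

# Replace each trailing ')HH:MM:SS' with nothing; line breaks are kept.
cleaned = pattern.sub("", text)
print(cleaned)
```

Unlike the `findall` pattern above, this keeps the URL's trailing slash and does not restrict which characters may appear in the URL itself.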

Answer 3 · edited 2024-09-23 04:28:34

I wouldn't use a regular expression for a task like this; it is simple enough without one:

for line in lines:
    print(line.split(')')[0])

Or, if the URL itself may contain a ')':

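For URLs that themselves contain a `)`, one option is to split only once, from the right, so that only the final `)` is treated as the separator. This is a sketch under that assumption, not necessarily the snippet the answer originally showed, and the first example URL is made up for illustration.

```python
lines = [
    # Made-up URL with parentheses inside the path.
    "http://example.com/wiki/Python_(language)/)12:00:00",
    "http://techcrunch.com/2012/04/30/edmodo-hits-7m/)15:18:45",
]

# rsplit with maxsplit=1 splits at the last ')' only, so any ')' that
# belongs to the URL is left untouched.
urls = [line.rsplit(")", 1)[0] for line in lines]

for url in urls:
    print(url)
```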
