Removing blank lines before or after reading a file

import re
import robotexclusionrulesparser as rerp

p = rerp.RobotExclusionRulesParser()
list = []
with open('robots.txt') as f:
    s = f.read()
    if not re.match(r'^\s*$', s):
        list.append(s)
p.parse(list)
print(p)

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    p.parse(list)
  File "/usr/local/lib/python2.7/dist-packages/robotexclusionrulesparser.py", line 530, in parse
    s = s.decode("iso-8859-1")
AttributeError: 'list' object has no attribute 'decode'

3 answers

User

Answer #1 · edited 2024-05-06 12:23:29

Try this:

import re
lst = []
with open('robots.txt') as f:
    for line in f:
        if not re.match(r'^\s*$', line):
            lst.append(line.strip())
print(lst)

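Your actual problem is evidently that the parse method expects a str, not a list.

Also note: list is the name of a built-in type, so it should not be used as a variable name.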
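To make the str-versus-list point concrete, here is a minimal sketch that drops blank lines and returns a single string instead of a list. The helper name strip_blank_lines and the sample text are made up for illustration; it uses only the standard library and does not call the parser itself.

```python
def strip_blank_lines(text):
    """Return text with empty and whitespace-only lines removed, as one str."""
    # splitlines() handles "\n" and "\r\n"; strip() trims each kept line.
    kept = [line.strip() for line in text.splitlines() if line.strip()]
    # Join back into a single string, which is what a str-based API expects.
    return "\n".join(kept)


sample = "\n\nUser-agent: *\n\n   \nDisallow: /tmp\n\n"
cleaned = strip_blank_lines(sample)
print(repr(cleaned))
```

The result is a str, so it can be handed to a method that calls .decode() or other string methods on its argument, whereas a list cannot.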

User

Answer #2 · edited 2024-05-06 12:23:29

parse() expects a single string whose lines are separated by "\n", not a list of lines.

The code looks like this:

import re
import robotexclusionrulesparser as rerp
p = rerp.RobotExclusionRulesParser()
lst = []

with open('robots.txt') as f:
    for line in f:
        if not re.match(r'^\s*$', line):
            lst.append(line.strip())

s = '\n'.join(lst)
p.parse(s)
print(p)

User

Answer #3 · edited 2024-05-06 12:23:29

A regex is what you want, but use sub rather than match:

import re
with open('robots.txt') as f:
    s = f.read()
s = re.sub(r'\n+', '\n', s)  # collapse each run of newlines into a single one
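One caveat with collapsing runs of "\n": a line that contains only spaces or tabs is left in place, and a blank line at the very start shrinks to a single newline rather than disappearing. The sketch below compares that substitution with a stricter multiline pattern. The function names and the sample text are hypothetical, chosen only for this illustration.

```python
import re


def collapse_newlines(text):
    # Replace each run of consecutive newlines with one newline.
    return re.sub(r'\n+', '\n', text)


def drop_blank_lines(text):
    # In MULTILINE mode ^ matches at the start of every line, so this removes
    # any line made up solely of spaces/tabs plus its line ending.
    return re.sub(r'(?m)^[ \t]*\r?\n', '', text)


sample = "\n\nUser-agent: *\n  \n\nDisallow: /\n"
print(repr(collapse_newlines(sample)))
print(repr(drop_blank_lines(sample)))
```

Note that the stricter pattern only removes blank lines that end with a newline, so a whitespace-only final line without one would survive; calling rstrip() on the result is one way to handle that.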

