Line-ending match ignores Unicode characters

Posted 2024-07-04 06:49:08


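I have a small Python script that compares a list of words imported from document A against a set of line endings in document B, and copies the words that match none of those endings to document C. Example: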

A (word list): 
salir
entrar
leer

B (line endings list):
ir
ar

C (those from A that do not match B):
leer

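In general it works fine, but I realized it does not work for line endings that contain Unicode characters such as ó. There is no error message and everything appears to run smoothly, but list C still contains words ending in ó.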
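A silent mismatch like this usually means the underlying bytes or code points differ even though the characters look identical on screen. The post does not say how the files are encoded, so the following is only an illustrative sketch in Python 3 (the excerpt in the question is Python 2) of two ways it could happen. The sample word `salió` and all variable names are made up for the illustration.

```python
import unicodedata

# Hypothetical sample data: a word ending in "ó" (U+00F3), written with escapes
# so the form of the character in this source file is unambiguous.
word = "sali\u00f3"
ending = "\u00f3"

# Case 1: the word list and the endings list are stored in different encodings.
word_utf8 = word.encode("utf-8")          # b'sali\xc3\xb3'
ending_latin1 = ending.encode("latin-1")  # b'\xf3'
ending_utf8 = ending.encode("utf-8")      # b'\xc3\xb3'

bytes_mismatch = word_utf8.endswith(ending_latin1)  # different bytes, no match
bytes_match = word_utf8.endswith(ending_utf8)       # same encoding, match

# Case 2: same encoding, but one file stores "ó" as a single code point (NFC)
# and the other as "o" plus a combining acute accent (NFD).
ending_nfd = unicodedata.normalize("NFD", ending)   # 'o\u0301'
form_mismatch = word.endswith(ending_nfd)           # no match, and no error
form_match = word.endswith(unicodedata.normalize("NFC", ending_nfd))

print(bytes_mismatch, bytes_match, form_mismatch, form_match)
```

In both cases `endswith` simply returns `False` without raising, which would be consistent with the behaviour described above.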

Here is an excerpt of my code:

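# Python 2 syntax; A, B and C hold the paths of the three documents
import codecs

inputobj = codecs.open(A, "r")
ruleobj = codecs.open(B, "r")
nomatch = codecs.open(C, "w")

inputtext = inputobj.readlines()
ruletext = ruleobj.readlines()

nomatchlist = []  # words from A that match no ending from B

for line in inputtext:
    x = 0  # number of endings this word matches
    line = line.strip()
    for rule in ruletext:
        rule = rule.strip()
        if line.endswith(rule):
            print "rule", rule, " in line", line
            x = x + 1
    if x == 0:
        nomatchlist.append(line)

# write the unmatched words to document C
for i in nomatchlist:
    print >> nomatch, i

nomatch.close()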


1 answer

User

#1 · Posted 2024-07-04 06:49:08

I tried your code locally and it works fine for 'ó'. Can you check whether A and B use the same encoding?
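If the two files do turn out to differ, one possible way to make the comparison robust is to decode both with an explicit encoding and normalize the text before matching. This is a minimal sketch in Python 3, not the asker's code: the helper names `load_lines`, `words_without_matching_ending` and `write_lines` are hypothetical, UTF-8 is only an assumed default (pass the real encoding of each file), and blank lines are skipped on the assumption that an empty ending should not match every word.

```python
import unicodedata


def load_lines(path, encoding="utf-8"):
    """Read non-empty lines, stripped and normalized to NFC.

    The encoding is an assumption; pass the actual encoding of each file.
    """
    with open(path, "r", encoding=encoding) as handle:
        return [
            unicodedata.normalize("NFC", line.strip())
            for line in handle
            if line.strip()
        ]


def words_without_matching_ending(words, endings):
    """Return the words that end with none of the given endings."""
    endings = tuple(endings)  # str.endswith accepts a tuple of suffixes
    return [word for word in words if not word.endswith(endings)]


def write_lines(path, lines, encoding="utf-8"):
    """Write one item per line using an explicit encoding."""
    with open(path, "w", encoding=encoding) as handle:
        for line in lines:
            handle.write(line + "\n")
```

With the files from the question, usage might look like `write_lines(C, words_without_matching_ending(load_lines(A), load_lines(B)))`, passing a different `encoding` to whichever file is not UTF-8.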
