How to remove all the extra stuff using a regular expression

Posted 2024-06-26 14:49:31

Below is a copy of a few lines (solution, pos, and gloss) from my txt file:

solution: (كَتَبَ kataba) [katab-u_1] 
     pos: katab/VERB_PERFECT+a/PVSUFF_SUBJ:3MS
gloss: ___ + write + he/it <verb>

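I want to return the word 'katab', which sits inside the square brackets on the first line, and strip everything else around it, including the dash, the underscore, and the digits. I am working with Python 2.7.

I tried writing this code: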

pattern = re.compile("'(?P[^']+)':\s*(?P<root>[^,]*)\d+")

1 answer

Anonymous user

#1 · Posted 2024-06-26 14:49:31

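Whenever you think "I need to match a pattern", "regular expressions" should come to mind as a good starting point. See the docs. This case is slightly tricky because the input file is Unicode, so both the lines read from the file and the pattern should be unicode strings.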

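from __future__ import unicode_literals  # makes the pattern a unicode string on Python 2.7

import codecs
import re

# Capture the first run of word characters inside the square brackets.
pattern = re.compile(r"solution:.+\[(?P<word>\w+).*\]", flags=re.U)

words = []
# codecs.open decodes each line from UTF-8 to unicode as it is read.
with codecs.open("test.unicode.txt", "r", "utf-8") as f:
    for line in f:
        match = pattern.match(line)
        if match:
            words.append(match.group("word"))

print(words)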

Output:

[u'katab']
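
To try the pattern without creating a file first, the sketch below runs the same regular expression over the three sample lines from the question, held in memory. The names `SAMPLE`, `SOLUTION_RE`, and `extract_words` are hypothetical helpers introduced only for this illustration, and the code assumes the real file has the same layout as the sample.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import re

# Sample lines copied from the question; in practice they would come from the file.
SAMPLE = (
    "solution: (كَتَبَ kataba) [katab-u_1] \n"
    "     pos: katab/VERB_PERFECT+a/PVSUFF_SUBJ:3MS\n"
    "gloss: ___ + write + he/it <verb>\n"
)

# Same pattern as in the answer: capture the leading word characters in [...].
SOLUTION_RE = re.compile(r"solution:.+\[(?P<word>\w+).*\]", flags=re.UNICODE)


def extract_words(text):
    """Return the bracketed word from every line that starts with 'solution:'."""
    words = []
    for line in text.splitlines():
        match = SOLUTION_RE.match(line)
        if match:
            words.append(match.group("word"))
    return words


print(extract_words(SAMPLE))
```

On Python 3 this prints `['katab']`; on Python 2.7 the same list is shown as `[u'katab']`. The capture stops at the hyphen because `\w` matches letters, digits, and underscores but not `-`. If a bracketed entry were written as `katab_1` with no hyphen, the whole of `katab_1` would be captured, so a stricter character class would be needed in that case.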
