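I have a file with one sentence per line. I am trying to read the file, use a regular expression to check whether each sentence is a question, extract the wh-word from it, and save the words to another file in the order they appear in the first file.
This is what I have so far: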
import re

def whWordExtractor(inputFile):
    try:
        openFileObject = open(inputFile, "r")
        try:
            whPattern = re.compile(r'(.*)who|what|how|where|when|why|which|whom|whose(\.*)', re.IGNORECASE)
            with openFileObject as infile:
                for line in infile:
                    whWord = whPattern.search(line)
                    print(whWord)
                    # Save the whWord extracted from inputFile into another whWord.txt file
                    # writeFileObject = open('whWord.txt','a')
                    # if not whWord:
                    #     writeFileObject.write('None' + '\n')
                    # else:
                    #     whQuestion = whWord
                    #     writeFileObject.write(whQuestion+ '\n')
        finally:
            print('Done. All WH-word extracted.')
            openFileObject.close()
    except IOError:
        pass
The result after running the code above: set([])
Am I doing something wrong? I would appreciate it if someone could point it out.
Change
'(.*)who|what|how|where|when|why|which|whom|whose(\.*)'
to ".*(?:who|what|how|where|when|why|which|whom|whose).*\."
Like this:
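As a rough illustration of why the grouping matters (a sketch, not the answerer's exact code): `|` has the lowest precedence in a regular expression, so in the original pattern `(.*)` belongs only to the `who` alternative and `(\.*)` only to `whose`. The variant below wraps the alternatives in one group. The capturing group and the `\b` word boundaries are my own assumptions, added so the wh-word itself can be pulled out, and the sample sentence is made up.

```python
import re

# Original pattern from the question. The alternation splits it into
# "(.*)who", "what", "how", ..., "whose(\.*)".
original = re.compile(
    r"(.*)who|what|how|where|when|why|which|whom|whose(\.*)", re.IGNORECASE
)

# Grouped variant (assumption: capture the word and require word boundaries,
# so that e.g. "how" is not found inside "show").
grouped = re.compile(
    r"\b(who|what|how|where|when|why|which|whom|whose)\b", re.IGNORECASE
)

sentence = "Do you know who wrote this?"

# The original pattern matches everything up to and including "who".
print(original.search(sentence).group(0))

# The grouped pattern exposes just the wh-word in group 1.
print(grouped.search(sentence).group(1))
```

Note that `search` returns a match object (or `None`), so the word has to be read with `.group(...)` before it can be written to a file as a string.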
Not sure if this is what you're looking for, but you could try something like this:
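One possible shape for the whole task, as a minimal sketch under stated assumptions: the function name `extract_wh_words`, the default output file name, and the choice to write the first wh-word per line in lowercase (or `None` when there is none, as in the commented-out code in the question) are all hypothetical. It does not check whether the sentence is actually a question.

```python
import re

# Assumed pattern: one capturing group, word boundaries, case-insensitive.
WH_PATTERN = re.compile(
    r"\b(who|what|how|where|when|why|which|whom|whose)\b", re.IGNORECASE
)


def extract_wh_words(input_path, output_path="whWord.txt"):
    """Write the first wh-word of each input line (or 'None') to output_path.

    Output lines are in the same order as the input lines. The list of
    written values is also returned for convenience.
    """
    results = []
    # Both files are closed automatically when the with-block exits.
    with open(input_path, "r") as infile, open(output_path, "w") as outfile:
        for line in infile:
            match = WH_PATTERN.search(line)
            # search() returns a match object or None; group(1) is the word.
            word = match.group(1).lower() if match else "None"
            outfile.write(word + "\n")
            results.append(word)
    return results
```

Opening both files in a single `with` statement replaces the nested `try`/`finally` from the question, and opening the output once in `"w"` mode avoids reopening it in append mode for every line.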