Trying to write only words with two or more vowels to a text file

Posted 2024-06-20 14:55:46


import re
twovowels=re.compile(r".*[aeiou].*[aeiou].*", re.I)
nonword=re.compile(r"\W+", re.U)
text_file = open("twoVoweledWordList.txt", "w")
file = open("FirstMondayArticle.html","r")
for line in file:
    for word in nonword.split(line):
        if twovowels.match(word): print word
        text_file.write('\n' + word)
text_file.close()

file.close()

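This is my Python code. I am trying to print only the words that contain two or more vowels. When I run it, everything (including numbers and words with no vowels) is written to my text file, yet the Python shell shows only the words with two or more vowels. How should I change it?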


3 Answers

I would suggest a different, simpler approach that does not use re:

def twovowels(word):
    count = 0
    for char in word.lower():
        if char in "aeiou":
            count = count + 1
            if count > 1:
                return True
    return False

# The backslash continues the with statement onto the next line.
with open("FirstMondayArticle.html") as file, \
        open("twoVoweledWordList.txt", "w") as text_file:
    for line in file:
        for word in line.split():
            if twovowels(word):
                print(word)
                text_file.write(word + "\n")

Because the write is outside the if condition. The code should look like this:

for line in file:
    for word in nonword.split(line):
        if twovowels.match(word):
            print word
            text_file.write('\n' + word)
text_file.close()

file.close()

Here is a sample program on Tutorialspoint showing that the code above is correct.

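You can use str.translate and compare lengths. If the length difference after deleting the vowels is greater than 1, the word has at least two vowels: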

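with open("FirstMondayArticle.html") as f, open("twoVoweledWordList.txt", "w") as out:
    for line in f:  # iterate over the opened handle f, not the builtin name file
        for word in line.split():
            # translate(None, "aeiou") deletes the vowels (Python 2 str API)
            if len(word) - len(word.lower().translate(None, "aeiou")) > 1:
                out.write("{}\n".format(word.rstrip()))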
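
The two-argument call `translate(None, "aeiou")` only works on Python 2 strings. On Python 3 the same length-difference idea could be sketched with `str.maketrans`. The helper name `has_two_vowels` is hypothetical and not part of the original answer.

```python
# Deletion table: the third argument lists characters that translate() drops.
_DROP_VOWELS = str.maketrans("", "", "aeiou")

def has_two_vowels(word):
    """Return True if word contains at least two of the vowels a, e, i, o, u."""
    lowered = word.lower()
    # Every deleted character is a vowel, so the length difference is the vowel count.
    return len(lowered) - len(lowered.translate(_DROP_VOWELS)) > 1
```

The condition in the loop above would then read `if has_two_vowels(word):`.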

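In your own code you always write the word, because text_file.write('\n' + word) is outside the if block. It is a good lesson in why you should not put multiple statements on one line. Your code is equivalent to: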

    if twovowels.match(word):
        print(word)
    text_file.write('\n' + word) # <- outside the if
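
A small self-contained check of that point, using made-up sample words and two lists in place of the shell and the output file (both are assumptions for illustration). Only the statement on the same line as the `if` is conditional.

```python
import re

two_vowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)

printed = []  # stands in for what the shell shows
written = []  # stands in for what ends up in the text file

for word in ["sky", "house", "42"]:
    if two_vowels.match(word): printed.append(word)  # guarded by the if
    written.append(word)                             # runs for every word

print(printed)
print(written)
```

Here `printed` ends up holding only "house", while `written` holds all three words, which matches the behaviour described in the question.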

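The version below puts the write inside the if where it belongs, renames the variables to follow naming conventions, adds spaces around assignments, and uses with to close the files for you: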

import re
with open("FirstMondayArticle.html") as f, open("twoVoweledWordList.txt", "w") as out:
    two_vowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)
    non_word = re.compile(r"\W+", re.U)
    for line in f:
        for word in non_word.split(line):
            if two_vowels.match(word):
                print(word)
                out.write("{}\n".format(word.rstrip()))  
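
One possible way to exercise the same loop without the two files is to factor it into a generator. This is a sketch for Python 3, not the answer's code: the name `words_with_two_vowels` is hypothetical, and the shorter pattern used with `search` is an assumed simplification that should behave the same for words containing no newline.

```python
import re

# With search(), the leading and trailing ".*" are not needed.
TWO_VOWELS = re.compile(r"[aeiou].*[aeiou]", re.I)
# Python 3 str patterns are Unicode-aware by default, so re.U is omitted.
NON_WORD = re.compile(r"\W+")

def words_with_two_vowels(lines):
    """Yield each word from lines that contains at least two vowels."""
    for line in lines:
        for word in NON_WORD.split(line):
            if TWO_VOWELS.search(word):
                yield word
```

An open file can be passed directly as `lines`, and each yielded word can be written with `out.write(word + "\n")`.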
