Python: use a list of initial characters to retrieve full words from another list?

Posted 2024-10-02 08:15:02


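I'm trying to use a list of shortened words to select and retrieve the corresponding full words, which are identified by their initial character sequence: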

shortwords = ['appe', 'kid', 'deve', 'colo', 'armo']    

fullwords = ['appearance', 'armour', 'colored', 'developing', 'disagreement', 'kid', 'pony', 'treasure']

Trying this regex match with a single shortened word:

import re

shortword = 'deve'

retrieved=filter(lambda i: re.match(r'{}'.format(shortword),i), fullwords)

print(*retrieved)

returns

developing

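So the regex match works, but the question is how to adapt the code to iterate over the list of short words and retrieve the full words.

Edit: the solution needs to preserve the order of the short-word list.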


3 answers

Maybe use a dictionary:

# Using a dictionary 
test= 'appe is a deve arm'
shortwords = ['appe', 'deve', 'colo', 'arm', 'pony', 'disa']    
fullwords = ['appearance', 'developing', 'colored', 'armour', 'pony', 'disagreement']
#Building the dictionary 
d={}
for i in range(len(shortwords)):
    d[shortwords[i]]=fullwords[i]

# apply dictionary to test 
res=" ".join(d.get(s,s) for s in test.split()) 
# print test data after dictionary mapping
print(res) 
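
The dictionary above pairs the two lists by position, which only works when they are aligned one-to-one; the lists in the question differ in length and order. A minimal sketch, assuming each short word should map to the first full word that starts with it (the name `prefix_map` and the sample sentence are illustrative, not from the original answer):

```python
shortwords = ['appe', 'kid', 'deve', 'colo', 'armo']
fullwords = ['appearance', 'armour', 'colored', 'developing',
             'disagreement', 'kid', 'pony', 'treasure']

# Map each short word to the first full word that starts with it
# (None if nothing matches). Dicts keep insertion order in Python 3.7+,
# so the keys follow the order of shortwords.
prefix_map = {
    short: next((full for full in fullwords if full.startswith(short)), None)
    for short in shortwords
}

# Expand short words in a sentence; unknown tokens are left unchanged.
test = 'appe is a deve armo'
expanded = " ".join(prefix_map.get(token) or token for token in test.split())
print(expanded)  # appearance is a developing armour
```

If a short word could match several full words, this sketch silently keeps only the first one.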

Your question text seems to indicate that you're looking for your short words at the start of each word. That should be easy:

matched_words = [word for word in fullwords if any(word.startswith(shortword) for shortword in shortwords)]

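If for some reason you want to do this with a regex (it is unlikely to be faster), you can build one large alternation:

import re

# re.match already anchors at the start of the string; the non-capturing
# group keeps the alternation together.
regex_alternation = '|'.join(re.escape(shortword) for shortword in shortwords)
matched_words = [word for word in fullwords if re.match(rf"(?:{regex_alternation})", word)]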

Alternatively, if your short words are always four characters long, you can slice off the first four characters:

shortwords = set(shortwords)  # sets have O(1) lookups so this will save
                              # a significant amount of time if either shortwords
                              # or fullwords is long

matched_words = [word for word in fullwords if word[:4] in shortwords]
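
The variants above iterate over `fullwords`, so results come back in `fullwords` order, whereas the edit to the question asks for the order of `shortwords`. A minimal sketch of one way to get that, assuming every full word that starts with a given short word should be kept:

```python
shortwords = ['appe', 'kid', 'deve', 'colo', 'armo']
fullwords = ['appearance', 'armour', 'colored', 'developing',
             'disagreement', 'kid', 'pony', 'treasure']

# The outer loop runs over shortwords, so the result follows their order;
# the inner loop collects every full word with that prefix.
matched_words = [
    word
    for shortword in shortwords
    for word in fullwords
    if word.startswith(shortword)
]
print(matched_words)
# ['appearance', 'kid', 'developing', 'colored', 'armour']
```

This is quadratic in the list sizes, and a full word that matches two different short words would appear twice; for lists this small neither is likely to matter.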

Here is one way:

shortwords = ['appe', 'deve', 'colo', 'arm', 'pony', 'disa']
fullwords = ['appearance', 'developing', 'colored', 'armour', 'pony', 'disagreement']
        
# Dict comprehension
words = {short:full for short, full in zip(shortwords, fullwords)}

#Solving problem
keys = ['deve','arm','pony']
values = [words[key] for key in keys]
        
print(values)

This is a classic key-value problem. Use a dictionary, or consider pandas in the long run.
