Answers to the question "Find repeated words in a string and print them"


1 Answer

Anonymous, 1 day ago

Expertise: python, mysql, java

Well, since you insist on doing this with regular expressions, you should try to do it in a single call so that you don't pay the context-switching penalty. The best approach is to write one pattern that captures every first/last name pair containing no digits, delimited by commas, let the regex engine find them all, then iterate over the matches and map them into a dictionary so that you end up with a last name =&gt; first names mapping:

<pre><code>import collections
import re

text = "Assaf Spanier, Assaf Din, Yo9ssi Levi, Yoram bibe9rman, David levi, " \
       "Bibi Netanyahu, Amnon Levi, Ehud sPanier, Barak Spa7nier, Sara Neta4nyahu"

full_name = re.compile(r"(?:^|\s|,)([^\d\s]+)\s+([^\d\s]+)(?=$|,)")  # compile the pattern

matches = collections.OrderedDict()  # store for the last=>first name map preserving order
for match in full_name.finditer(text):
    first_name = match.group(1)
    print(first_name)  # print the first name to match your desired output
    last_name = match.group(2).title()  # capitalize the last name for case-insensitivity
    if last_name in matches:  # repeated last name
        matches[last_name].append(first_name)  # add the first name to the map
    else:  # encountering this last name for the first time
        matches[last_name] = [first_name]  # initialize the map for this last name

print("========")  # print the separator...

# finally, print all the repeated last names to match your format
for k, v in matches.items():
    if len(v) > 1:  # print only those with more than one first name attached
        print(k)
</code></pre>

This will give you:

<pre>Assaf
Assaf
David
Bibi
Amnon
Ehud
========
Spanier
Levi
</pre>

In addition, <code>matches</code> holds the full last name =&gt; first names mapping.

As for the pattern, let's break it down piece by piece:

<pre>
(?:^|\s|,)  - match the beginning of the string, whitespace or a comma (non-capturing)
([^\d\s]+)  - followed by one or more characters that are not digits or whitespace (capturing)
\s+         - followed by one or more whitespace characters (non-capturing)
([^\d\s]+)  - followed by the same pattern as for the first name (capturing)
(?=$|,)     - followed by a comma or end of the string (look-ahead, non-capturing)
</pre>

As we iterate over the matches, the two captured groups (first and last name) are available on the <code>match</code> object. That's all there is to it.
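If you need the result as data rather than printed output, the same single-pass idea can be wrapped in a function. This is only a minimal sketch under a few assumptions: the names `FULL_NAME`, `repeated_last_names` and `sample` are made up for illustration, the look-ahead is written as `(?=$|,)` to mean "comma or end of string", and a plain `defaultdict` is used on the assumption that you run Python 3.7+, where dictionaries keep insertion order.

```python
import re
from collections import defaultdict

# Hypothetical constant name; the pattern is the one discussed in the answer:
# a separator (start, whitespace or comma), two digit-free tokens split by
# whitespace, then a look-ahead for a comma or the end of the string.
FULL_NAME = re.compile(r"(?:^|\s|,)([^\d\s]+)\s+([^\d\s]+)(?=$|,)")


def repeated_last_names(text):
    """Return (first_names, repeated_last_names) for digit-free "First Last" entries."""
    by_last = defaultdict(list)  # last name -> list of first names, in order seen
    first_names = []
    for match in FULL_NAME.finditer(text):  # one regex pass over the whole string
        first = match.group(1)
        last = match.group(2).title()  # normalize case so "levi" and "Levi" collide
        first_names.append(first)
        by_last[last].append(first)
    # keep only the last names that collected more than one first name
    repeated = [last for last, firsts in by_last.items() if len(firsts) > 1]
    return first_names, repeated


# Illustrative input (an assumption, shortened from the answer's text)
sample = "Assaf Spanier, Assaf Din, Yo9ssi Levi, David levi, Ehud sPanier"
firsts, repeated = repeated_last_names(sample)
print(firsts)
print(repeated)
```

Returning both lists keeps the printing decisions (separator line, ordering) with the caller, which makes the matching logic easier to test on its own.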
