I'm trying to find and replace a string in Python 2.7. Here is my string (shown raw):
\n\n\nTOSS UP\n\n\n\n1. MATH Short Answer Pablo walks 4 miles north, 6 miles east, and then 2 miles north again. In simplest form, how many miles is he from his starting point?\n\n\n\nANSWER: 6\n\n\n\nBONUS\n\n\n\n1. MATH Short Answer Evaluate the limit as x approaches infinity of x times the quantity negative 1 plus e to the 1 over x.\n\n\n\nANSWER: 1\n\n\n\nTOSS UP\n\n\n\n2. CHEMISTRY Multiple Choice Which of the following is NOT a characteristic of amines?\n\n\n\nW) A fully protonated amine is called an ammonium ion\n\nX) Amines can function as Br\xc3\xb8nsted bases\n\nY) The VSEPR geometry of the nitrogen atom is trigonal planar\n\nZ) Amines can be a hydrogen bond acceptor\n\n\n\nANSWER: Y) The VSEPR geometry of the nitrogen atom is trigonal planar\n\n\n\nBONUS\n\n\n\n2. CHEMISTRY Multiple Choice Of the following elements in their monatomic gaseous states, which has the lowest electron affinity?\n\n\n\nW) BoronX) CarbonY) NitrogenZ) OxygenANSWER: Y) NITROGEN\n\n\n
I search it with this regular expression and then do some replacements:
searchString = (
r"(TOSS\-UP|TOSSUP|TOSS\s*UP)\s*"
r"(?P<questionNum>\d{1,2})[\.\)]\s*(?P<category>[A-Z ]+)\s*"
r"(?i)(Short Answer|Multiple Choice)\s*(?P<tossupQ>[\S\s]*)"
r"ANSWER\:\s*(?P<tossupA>[\S\s]*)"
r"\s*BONUS\s*"
r"(?P<questionNumBonus>\d{1,2})[\.\)]\s*(?P<categoryBonus>[A-Z ]+)\s*"
r"(?i)(Short Answer|Multiple Choice)\s*(?P<bonusQ>[\S\s]*)"
r"ANSWER\:(?P<bonusA>[\S\s]*)"
)
The result I get is:
{
"category": 4,
"questionNum": 1,
"tossupQ": "Pablo walks 4 miles north, 6 miles east, and then 2 miles north again. In simplest form, how many miles is he from his starting point?\n\n\n\nANSWER: 6\n\n\n\nBONUS\n\n\n\n1. MATH Short Answer Evaluate the limit as x approaches infinity of x times the quantity negative 1 plus e to the 1 over x.\n\n\n\nANSWER: 1\n\n\n\nTOSS UP\n\n\n\n2. CHEMISTRY Multiple Choice Which of the following is NOT a characteristic of amines?\n\n\n\nW) A fully protonated amine is called an ammonium ion\n\nX) Amines can function as Br\xc3\xb8nsted bases\n\nY) The VSEPR geometry of the nitrogen atom is trigonal planar\n\nZ) Amines can be a hydrogen bond acceptor",
"tossupA": "Y) The VSEPR geometry of the nitrogen atom is trigonal planar",
"bonusQ": "Of the following elements in their monatomic gaseous states, which has the lowest electron affinity?\n\n\n\nW) BoronX) CarbonY) NitrogenZ) Oxygen",
"bonusA": "Y) NITROGEN"
},
However, when I change the line r"ANSWER\:\s*(?P<tossupA>[\S\s]*)"
to r"ANSWER\:\s*(?P<tossupA>[\d]*)"
I get the following result:
{
"category": 4,
"questionNum": 1,
"tossupQ": "Pablo walks 4 miles north, 6 miles east, and then 2 miles north again. In simplest form, how many miles is he from his starting point?",
"tossupA": "6",
"bonusQ": "Evaluate the limit as x approaches infinity of x times the quantity negative 1 plus e to the 1 over x.\n\n\n\nANSWER: 1\n\n\n\nTOSS UP\n\n\n\n2. CHEMISTRY Multiple Choice Which of the following is NOT a characteristic of amines?\n\n\n\nW) A fully protonated amine is called an ammonium ion\n\nX) Amines can function as Br\xc3\xb8nsted bases\n\nY) The VSEPR geometry of the nitrogen atom is trigonal planar\n\nZ) Amines can be a hydrogen bond acceptor\n\n\n\nANSWER: Y) The VSEPR geometry of the nitrogen atom is trigonal planar\n\n\n\nBONUS\n\n\n\n2. CHEMISTRY Multiple Choice Of the following elements in their monatomic gaseous states, which has the lowest electron affinity?\n\n\n\nW) BoronX) CarbonY) NitrogenZ) Oxygen",
"bonusA": "Y) NITROGEN"
},
Why does tossupA not match correctly with [\S\s]*, but only with \d*? Any help would be greatly appreciated!
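The reason is that you're using greedy quantifiers. If ANSWER: is not required to be followed by digits, tossupQ is allowed to match a longer string. As a result, tossupQ swallows all the questions and answers up to the last ANSWER: that still leaves a BONUS block after it for the rest of the pattern to match.

When ANSWER: must be followed by digits, only digits and whitespace may sit between ANSWER: and BONUS, so tossupA can only match the first answer, and tossupQ has to stop early to allow that match.

You can fix this by switching to the non-greedy quantifier *? in place of *. The groups will then match the shortest string that is consistent with the rest of the pattern, rather than the longest.

By the way, [\S\s] matches any character, including a newline. The dot (.) does the same thing only when the re.DOTALL flag is set; without the flag it does not match newlines. If you want . to match across multiple lines, pass re.DOTALL.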
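
To make the difference concrete, here is a minimal sketch of the non-greedy version. It assumes a shortened copy of the text from the question, drops the inline (?i) flag, and adds a lookahead at the end of the pattern (my own addition, not part of the original regex): without something to stop against, a lazy final group would simply match the empty string.

```python
import re

# Shortened copy of the question's text: two toss-up/bonus pairs.
text = (
    "\n\nTOSS UP\n\n1. MATH Short Answer How many miles is he from his starting point?"
    "\n\nANSWER: 6\n\nBONUS\n\n1. MATH Short Answer Evaluate the limit."
    "\n\nANSWER: 1\n\nTOSS UP\n\n2. CHEMISTRY Multiple Choice Which is NOT a characteristic of amines?"
    "\n\nANSWER: Y) Trigonal planar\n\nBONUS\n\n2. CHEMISTRY Multiple Choice Which has the lowest electron affinity?"
    "\n\nANSWER: Y) NITROGEN\n\n"
)

# Greedy: .* runs to the end of the text and backtracks to the LAST "BONUS".
greedy = re.search(r"ANSWER:\s*(.*)\s*BONUS", text, re.DOTALL).group(1)
# Lazy: .*? grows one character at a time and stops at the FIRST "BONUS".
lazy = re.search(r"ANSWER:\s*(.*?)\s*BONUS", text, re.DOTALL).group(1)

# Full pattern with every open-ended group made lazy.
pattern = re.compile(
    r"TOSS[\s-]*UP\s*"
    r"(?P<questionNum>\d{1,2})[.)]\s*(?P<category>[A-Z ]+?)\s*"
    r"(?:Short Answer|Multiple Choice)\s*(?P<tossupQ>.*?)\s*"
    r"ANSWER:\s*(?P<tossupA>.*?)\s*"
    r"BONUS\s*"
    r"(?P<questionNumBonus>\d{1,2})[.)]\s*(?P<categoryBonus>[A-Z ]+?)\s*"
    r"(?:Short Answer|Multiple Choice)\s*(?P<bonusQ>.*?)\s*"
    r"ANSWER:\s*(?P<bonusA>.*?)\s*"
    # Assumed terminator: the next toss-up or the end of the text.
    r"(?=TOSS[\s-]*UP|\Z)",
    re.DOTALL,
)

results = [m.groupdict() for m in pattern.finditer(text)]

# "." skips newlines unless re.DOTALL is set; [\S\s] always matches them.
dot_plain = re.match(r".", "\n")
dot_all = re.match(r".", "\n", re.DOTALL)
any_char = re.match(r"[\S\s]", "\n")

print(lazy)
for r in results:
    print(r["questionNum"] + " " + r["tossupA"] + " / " + r["bonusA"])
```

With the lazy groups, each toss-up/bonus pair becomes its own match, so finditer yields one dictionary per question instead of one match that spans the whole text.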