Python regex question: [\S\s]* vs \d*

Posted 2024-09-30 02:30:50


I am trying to find and replace parts of a string in Python 2.7. This is my string (shown raw):

\n\n\nTOSS UP\n\n\n\n1. MATH Short Answer Pablo walks 4 miles north, 6 miles east, and then 2 miles north again. In simplest form, how many miles is he from his starting point?\n\n\n\nANSWER: 6\n\n\n\nBONUS\n\n\n\n1. MATH Short Answer Evaluate the limit as x approaches infinity of x times the quantity negative 1 plus e to the 1 over x.\n\n\n\nANSWER: 1\n\n\n\nTOSS UP\n\n\n\n2. CHEMISTRY Multiple Choice Which of the following is NOT a characteristic of amines?\n\n\n\nW) A fully protonated amine is called an ammonium ion\n\nX) Amines can function as Br\xc3\xb8nsted bases\n\nY) The VSEPR geometry of the nitrogen atom is trigonal planar\n\nZ) Amines can be a hydrogen bond acceptor\n\n\n\nANSWER: Y) The VSEPR geometry of the nitrogen atom is trigonal planar\n\n\n\nBONUS\n\n\n\n2. CHEMISTRY Multiple Choice Of the following elements in their monatomic gaseous states, which has the lowest electron affinity?\n\n\n\nW) BoronX) CarbonY) NitrogenZ) OxygenANSWER: Y) NITROGEN\n\n\n

I search it with this regular expression and then do some replacements:

searchString = (
    r"(TOSS\-UP|TOSSUP|TOSS\s*UP)\s*"
    r"(?P<questionNum>\d{1,2})[\.\)]\s*(?P<category>[A-Z ]+)\s*"
    r"(?i)(Short Answer|Multiple Choice)\s*(?P<tossupQ>[\S\s]*)"
    r"ANSWER\:\s*(?P<tossupA>[\S\s]*)"

    r"\s*BONUS\s*"
    r"(?P<questionNumBonus>\d{1,2})[\.\)]\s*(?P<categoryBonus>[A-Z ]+)\s*"
    r"(?i)(Short Answer|Multiple Choice)\s*(?P<bonusQ>[\S\s]*)"
    r"ANSWER\:(?P<bonusA>[\S\s]*)"
)

The result I get is:

{
    "category": 4,
    "questionNum": 1,
    "tossupQ": "Pablo walks 4 miles north, 6 miles east, and then 2 miles north again. In simplest form, how many miles is he from his starting point?\n\n\n\nANSWER: 6\n\n\n\nBONUS\n\n\n\n1. MATH  Short Answer  Evaluate the limit as x approaches infinity of x times the quantity negative 1 plus e to the 1 over x.\n\n\n\nANSWER: 1\n\n\n\nTOSS UP\n\n\n\n2. CHEMISTRY  Multiple Choice  Which of the following is NOT a characteristic of amines?\n\n\n\nW) A fully protonated amine is called an ammonium ion\n\nX) Amines can function as Br\xc3\xb8nsted bases\n\nY) The VSEPR geometry of the nitrogen atom is trigonal planar\n\nZ) Amines can be a hydrogen bond acceptor",
    "tossupA": "Y) The VSEPR geometry of the nitrogen atom is trigonal planar",
    "bonusQ": "Of the following elements in their monatomic gaseous states, which has the lowest electron affinity?\n\n\n\nW) BoronX) CarbonY) NitrogenZ) Oxygen",
    "bonusA": "Y) NITROGEN"
},

However, when I change the line r"ANSWER\:\s*(?P<tossupA>[\S\s]*)" to r"ANSWER\:\s*(?P<tossupA>[\d]*)", I get the following result:

{
    "category": 4,
    "questionNum": 1,
    "tossupQ": "Pablo walks 4 miles north, 6 miles east, and then 2 miles north again. In simplest form, how many miles is he from his starting point?",
    "tossupA": "6",
    "bonusQ": "Evaluate the limit as x approaches infinity of x times the quantity negative 1 plus e to the 1 over x.\n\n\n\nANSWER: 1\n\n\n\nTOSS UP\n\n\n\n2. CHEMISTRY  Multiple Choice  Which of the following is NOT a characteristic of amines?\n\n\n\nW) A fully protonated amine is called an ammonium ion\n\nX) Amines can function as Br\xc3\xb8nsted bases\n\nY) The VSEPR geometry of the nitrogen atom is trigonal planar\n\nZ) Amines can be a hydrogen bond acceptor\n\n\n\nANSWER: Y) The VSEPR geometry of the nitrogen atom is trigonal planar\n\n\n\nBONUS\n\n\n\n2. CHEMISTRY  Multiple Choice  Of the following elements in their monatomic gaseous states, which has the lowest electron affinity?\n\n\n\nW) BoronX) CarbonY) NitrogenZ) Oxygen",
    "bonusA": "Y) NITROGEN"
},

Why does tossupA capture the wrong answer with [\S\s]* but the right one with \d*? Any help would be greatly appreciated!


1 answer
User
#1 · Posted 2024-09-30 02:30:50

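The reason is that you are using greedy quantifiers. If the text after ANSWER: is not restricted to digits, tossupQ is allowed to match a longer string. As a result, tossupQ swallows every question and answer up to the last ANSWER: that still lets the rest of the pattern (BONUS ... ANSWER:) match.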

When ANSWER: has to be followed by digits (and then by BONUS), only the first answer qualifies, so tossupQ must stop early to allow that match.
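The backtracking described above can be reproduced on a much smaller input. The following is only a minimal sketch: the shortened text and the stripped-down patterns (no question number or category groups) are assumptions made for illustration, not the asker's real data.

```python
import re

# Hypothetical, shortened stand-in for the quiz text in the question.
text = (
    "TOSS UP 1. Q-one ANSWER: 6 BONUS 1. Q-two ANSWER: 1 "
    "TOSS UP 2. Q-three ANSWER: Y BONUS 2. Q-four ANSWER: Z"
)

# Every group is greedy, so tossupQ grabs as much as it can while
# still leaving "ANSWER: ... BONUS ... ANSWER:" for the rest.
greedy = re.search(
    r"TOSS UP\s*(?P<tossupQ>[\S\s]*)ANSWER:\s*(?P<tossupA>[\S\s]*)"
    r"\s*BONUS\s*(?P<bonusQ>[\S\s]*)ANSWER:(?P<bonusA>[\S\s]*)",
    text,
)

# Same pattern, but tossupA may only contain digits. Now only the
# first "ANSWER: 6 BONUS" can satisfy it, so tossupQ is forced short.
digits = re.search(
    r"TOSS UP\s*(?P<tossupQ>[\S\s]*)ANSWER:\s*(?P<tossupA>\d*)"
    r"\s*BONUS\s*(?P<bonusQ>[\S\s]*)ANSWER:(?P<bonusA>[\S\s]*)",
    text,
)

print(repr(greedy.group("tossupQ")))
print(repr(greedy.group("tossupA")))
print(repr(digits.group("tossupQ")))
print(repr(digits.group("tossupA")))
```

In the greedy run, tossupQ contains the first question, its answer, and the whole first bonus; in the digit-restricted run it contains only the first question.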

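You can fix this by switching to the non-greedy quantifier *?. It makes each group match the shortest string consistent with the rest of the pattern, rather than the longest.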

searchString = (
    r"(TOSS\-UP|TOSSUP|TOSS\s*UP)\s*"
    r"(?P<questionNum>\d{1,2})[\.\)]\s*(?P<category>[A-Z ]+)\s*"
    r"(?i)(Short Answer|Multiple Choice)\s*(?P<tossupQ>[\S\s]*?)"
    r"ANSWER\:\s*(?P<tossupA>[\S\s]*?)"

    r"\s*BONUS\s*"
    r"(?P<questionNumBonus>\d{1,2})[\.\)]\s*(?P<categoryBonus>[A-Z ]+)\s*"
    r"(?i)(Short Answer|Multiple Choice)\s*(?P<bonusQ>[\S\s]*?)"
    r"ANSWER\:(?P<bonusA>[\S\s]*)"
)
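One thing to watch in the pattern above is that the final bonusA group is still greedy, so on text holding several question pairs it runs to the end of the string. Making it non-greedy alone would leave it empty, because nothing follows it. A possible way around this, sketched here under assumptions, is to end the pattern with a lookahead for the next TOSS UP or the end of the string and then iterate with re.finditer. The shortened text, the omission of the number and category groups, and the lookahead itself are illustrative choices, not part of the original answer. The mid-pattern (?i) is also left out, since newer Python 3 releases reject global inline flags that are not at the start of the pattern.

```python
import re

# Hypothetical, shortened stand-in for the quiz text in the question.
text = (
    "TOSS UP 1. Q-one ANSWER: 6 BONUS 1. Q-two ANSWER: 1 "
    "TOSS UP 2. Q-three ANSWER: Y BONUS 2. Q-four ANSWER: Z"
)

pattern = re.compile(
    r"TOSS\s*UP\s*(?P<tossupQ>[\S\s]*?)\s*"
    r"ANSWER:\s*(?P<tossupA>[\S\s]*?)"
    r"\s*BONUS\s*(?P<bonusQ>[\S\s]*?)\s*"
    r"ANSWER:\s*(?P<bonusA>[\S\s]*?)"
    # Stop bonusA at the next toss-up or at the end of the string.
    r"\s*(?=TOSS\s*UP|\Z)"
)

# One dict of named groups per toss-up/bonus pair.
pairs = [m.groupdict() for m in pattern.finditer(text)]

for pair in pairs:
    print(pair)
```

Each non-greedy group now stops at the nearest delimiter, so every toss-up/bonus pair comes out as its own match.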

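By the way, [\S\s] is equivalent to . when the re.DOTALL flag is set. Without that flag, . does not match newlines, so use re.DOTALL if you want . to match across multiple lines.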
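A short check of that equivalence, using a made-up two-line string:

```python
import re

text = "line one\nline two"

# Without DOTALL, "." stops at the newline.
plain = re.search(r"l.*", text).group()

# With DOTALL, "." also matches "\n", just like [\S\s].
dotall = re.search(r"l.*", text, re.DOTALL).group()
char_class = re.search(r"l[\S\s]*", text).group()

print(repr(plain))
print(dotall == char_class)
```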
