>>> import re
>>> x = 'the meaning\nof life'
>>> re.sub("([,\w])\n(\w)", "\1 \2", x)
'the meanin\x01 \x02f life'
>>> re.sub("([,\w])\n(\w)", "\\1 \\2", x)
'the meaning of life'
>>> re.sub("([,\w])\n(\w)", r"\1 \2", x)
'the meaning of life'
>>>
If you're putting this in a string within a program, you may actually need four backslashes: the string parser removes two of them when it de-escapes the literal, and the regex engine then needs the remaining two to represent one escaped backslash.
As stated earlier, regular expressions use the backslash character ('\') to indicate special forms or to allow special characters to be used without invoking their special meaning. This conflicts with Python's usage of the same character for the same purpose in string literals.
Let's say you want to write a RE that matches the string \section, which might be found in a LaTeX file. To figure out what to write in the program code, start with the desired string to be matched. Next, escape the backslash and any other metacharacters by preceding them with a backslash, which gives \\section. That is the string that must be passed to re.compile(). However, to express it as a Python string literal, both backslashes must be escaped again, giving "\\\\section".
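A minimal sketch of the \section case described above, to show that the four-backslash literal and the raw-string literal hand the same pattern to the regex engine. The sample text and the variable names are made up for illustration.

```python
import re

# Hypothetical sample text: a LaTeX line containing a literal backslash.
text = r"\section{Introduction}"

# Ordinary string literal: four backslashes become two characters, which the
# regex engine reads as one escaped (literal) backslash.
plain = re.compile("\\\\section")

# Raw string literal: the two backslashes reach the regex engine unchanged.
raw = re.compile(r"\\section")

print(plain.pattern == raw.pattern)  # both literals spell the same pattern
print(raw.search(text).group())      # the matched text, backslash included
```

The raw-string form is usually easier to read, because what you type is what the regex engine receives.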
As brittenb suggested, you don't need a regex in this case:
>>> x = 'the meaning\nof life'
>>> x.replace("\n", " ")
'the meaning of life'
>>>
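You need to escape the backslash (\). If you don't, Python reads \1 in the string literal as the character '\x01', and that is what ends up in the output. This is why we need to use '\\\\' or r'\\' to represent a literal \ in a Python regular expression. On this point, see this answer and the documentation, both quoted above.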
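Use a raw string literal. Both Python's string literal syntax and the regex engine interpret backslashes: \1 is read as an octal escape in an ordinary string literal, but not in a raw string literal. The alternative is to double every backslash so that it reaches the regex engine.
See the Backslash Plague section of the Python Regular Expression HOWTO.
The interactive session at the top of this page demonstrates all three forms.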
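To make the difference concrete, here is a small sketch that inspects the three spellings of the replacement before any regex is involved. The variable names are illustrative only, and the pattern is simplified from the one in the question.

```python
import re

x = 'the meaning\nof life'

octal = "\1"     # one character: the octal escape for chr(1)
doubled = "\\1"  # two characters: a backslash followed by '1'
raw = r"\1"      # the same two characters, written as a raw string

print(len(octal), len(doubled), len(raw))
print(doubled == raw)

# Only the two-character form is seen by re.sub() as a reference to group 1.
result = re.sub(r"(\w)\n(\w)", r"\1 \2", x)
print(result)
```

Checking `len()` on a literal is a quick way to see what the string parser has already consumed.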
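It may be easier to split the string on newlines and then rejoin the pieces with spaces.
Admittedly, this does not distinguish between a newline between two words and extra newlines elsewhere.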
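The split-and-rejoin idea can be sketched as follows. The answer's own method names did not survive in this copy, so using str.splitlines() and str.join() is an assumption; the second call illustrates the caveat about extra newlines.

```python
x = 'the meaning\nof life'

# str.splitlines() breaks the string at line boundaries and drops them;
# " ".join() glues the pieces back together with single spaces.
joined = " ".join(x.splitlines())
print(joined)

# Caveat: a blank line yields an empty piece, so it turns into an extra space.
with_blank = " ".join('the meaning\n\nof life'.splitlines())
print(repr(with_blank))
```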