I have a very long string that I need to parse into groups, but I need more control over how it is split.
import re
RAW_Data = "Name Multiple Words Testing With 1234 Numbers and this stuff* ((Bla Bla Bla (Bla Bla) A40 & A41)) Name Multiple Words Testing With 3456 Numbers and this stuff2* ((Bla Bla Bla (Bla Bla) A42 & A43)) Name Multiple Words Testing With 78910 Numbers and this stuff3* ((Bla Bla Bla (Bla Bla) A44 & A45)) Name Multiple Words Testing With 1234 Numbers and this stuff4* ((Bla Bla Bla (Bla Bla) A46 & A47)) Name Multiple Words Testing With 1234 Numbers and this stuff5* ((Bla Bla Bla (Bla Bla) A48 & A49)) Name Multiple Words Testing With 1234 Numbers and this stuff6* ((Bla Bla Bla (Bla Bla) A50 & A51)) Name Multiple Words Testing With 1234 Numbers and this stuff7* ((Bla Bla Bla (Bla Bla) A52 & A53)) Name Multiple Words Testing With 1234 Numbers and this stuff8* ((Bla Bla Bla (Bla Bla) A54 & A55)) Name Multiple Words Testing With 1234 Numbers and this stuff9* ((Bla Bla Bla (Bla Bla) A56 & A57)) Name Multiple Words Testing With 1234 Numbers and this stuff10* ((Bla Bla Bla (Bla Bla) A58 & A59)) Name Multiple Words Testing With 1234 Numbers and this stuff11* ((Bla Bla Bla (Bla Bla) A60 & A61)) Name Multiple Words Testing With 1234 Numbers and this stuff12* ((Bla Bla Bla (Bla Bla) A62 & A63)) Name Multiple Words Testing With 1234 Numbers and this stuff13* ((Bla Bla Bla (Bla Bla) A64 & A65)) Name Multiple Words Testing With 1234 Numbers and this stuff14* ((Bla Bla Bla (Bla Bla) A66 & A67)) Name Multiple Words Testing With 1234 Numbers and this stuff15* ((Bla Bla Bla (Bla Bla) A68 & A69)) Name Multiple Words Testing With 1234 Numbers and this stuff16*"
fromnode = re.findall(r'(.*?)(?=\*\s)', RAW_Data)
print(fromnode)
del fromnode
del RAW_Data
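The result is: 'Name Multiple Words Testing With 1234 Numbers and this stuff', '', '((Bla Bla Bla (Bla Bla) A40 & A41)) Name Multiple Words Testing With 3456 Numbers and this stuff2' ... and so on.
I can't seem to capture only strings like 'Name Multiple Words Testing With 3456 Numbers' while ignoring strings like '((Bla Bla Bla (Bla Bla) A40 & A41))'. Any help would be greatly appreciated.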
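You can use the pattern \*\s*\({2}.*?\){2}\s* (see demo), which matches:
\* - a literal asterisk
\s* - zero or more whitespace characters
\({2} - exactly two opening parentheses
.*? - zero or more characters other than a newline, as few as possible, up to the first occurrence of what follows (note: add the re.S flag if you need to match across multiple lines)
\){2} - two closing parentheses
\s* - zero or more whitespace characters
Alternatively, the same regex can be unrolled, which makes it a bit more efficient. See the IDEONE demo.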
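The pattern described above can be tried out with a short sketch like the one below. It assumes the pattern is used as a separator with re.split, which is one plausible way to apply it. The names SEPARATOR, split_names and sample are made up for illustration, and the sample is a shortened copy of RAW_Data from the question.

```python
import re

# Pattern assembled from the pieces listed above: a literal '*', optional
# whitespace, '((', anything up to the first '))', then optional whitespace.
# Pass re.S as a flag if the parenthesized part can span several lines.
SEPARATOR = re.compile(r"\*\s*\({2}.*?\){2}\s*")


def split_names(text):
    """Return the name segments found between the '* ((...))' separators."""
    parts = SEPARATOR.split(text)
    # The last segment still ends with '*' because no '((...))' follows it.
    return [part.rstrip("*").strip() for part in parts if part.strip()]


# Shortened sample with the same shape as RAW_Data in the question.
sample = (
    "Name Multiple Words Testing With 1234 Numbers and this stuff* "
    "((Bla Bla Bla (Bla Bla) A40 & A41)) "
    "Name Multiple Words Testing With 3456 Numbers and this stuff2* "
    "((Bla Bla Bla (Bla Bla) A42 & A43)) "
    "Name Multiple Words Testing With 78910 Numbers and this stuff3*"
)

for name in split_names(sample):
    print(name)
```

Splitting on the separator sidesteps the empty strings that the lookahead-based findall in the question produces, because the asterisk and the parenthesized block are consumed together instead of being left in the input.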
Update
For a regex to use with re.findall, see the regex demo.
Does it look scary? It is just the unrolled version of a simpler pattern.
See the IDEONE demo.
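For readers who prefer re.findall, here is a minimal sketch of one way to do it. The pattern below is a hypothetical one chosen for illustration and is not necessarily the regex the answer refers to. It assumes that names never contain parentheses or asterisks, which holds for the sample data in the question. NAME_BEFORE_STAR, find_names and sample are illustrative names.

```python
import re

# Hypothetical pattern: a run of characters that are not '(', ')' or '*',
# immediately followed by a literal '*'. Text inside '((...))' is never
# directly followed by '*', so it cannot match.
NAME_BEFORE_STAR = re.compile(r"([^()*]+)\*")


def find_names(text):
    """Return each name that directly precedes an asterisk."""
    return [name.strip() for name in NAME_BEFORE_STAR.findall(text)]


# Shortened sample with the same shape as RAW_Data in the question.
sample = (
    "Name Multiple Words Testing With 1234 Numbers and this stuff* "
    "((Bla Bla Bla (Bla Bla) A40 & A41)) "
    "Name Multiple Words Testing With 3456 Numbers and this stuff2* "
    "((Bla Bla Bla (Bla Bla) A42 & A43)) "
    "Name Multiple Words Testing With 78910 Numbers and this stuff3*"
)

for name in find_names(sample):
    print(name)
```

The negated character class keeps the match from crossing a parenthesis, so no lazy quantifier or lookahead is needed. If a name could itself contain parentheses, this sketch would no longer apply and a pattern that explicitly skips the '((...))' block would be required.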