I am searching a messy HTML file for the following lines:
<span id="fooPack1_xpl01_name11">150.00 FTL</span>
<span id="fooPack1_xpl02_name11">350.00 FTL</span>
<span id="fooPack1_xpl03_name11">250.00 FTL</span>
<span id="fooPack1_xpl04_name11">230.00 FTL</span>
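I use BeautifulSoup and re to search for and find the strings:
^{pr2}$
Obviously, the common parts of the id are at the beginning and at the end, and the part in the middle always varies. How can I restructure the search pattern so that it looks for "fooPack1_xpl" + (a varying string) + "_name11"? Thanks.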
//Edit//
When I query the following:
<span id="FullView1_spl02_Stack_4">03/04/12</span>
<span id="FullView1_spl03_Stack_4">01/03/11</span>
<span id="FullView1_spl04_Stack_4">02/25/02</span>
<span id="FullView1_spl05_Stack_4">07/16/04</span>
<span id="FullView1_spl01_Stack32">999.00 SPL</span>
<span id="FullView1_spl02_Stack82">150.00 XPP</span>
<span id="FullView1_spl03_Stack82">350.00 XPP</span>
<span id="FullView1_spl04_Stack82">450.00 XPP</span>
<span id="FullView1_spl05_Stack82">550.00 XPP</span>
<span id="FullView1_spl06_Stack82">650.00 XPP</span>
<span id="FullView1_spl07_Stack22">888.00 SPL</span>
<span id="FullView1_spl202_stckFriendName">Red Car</span>
<span id="FullView1_spl203_stckFriendName">Green Car</span>
<span id="FullView1_spl204_stckFriendName">Blue Car</span>
with:
foo = soup.findAll('span', id=re.compile(r'FullView1_spl\d+_Stack82'))
I get the following results:
<span id="FullView1_spl204_stckFriendName">Blue Car</span>
<span id="FullView1_spl02_Stack82">150.00 XPP</span>
<span id="FullView1_spl03_Stack82">350.00 XPP</span>
<span id="FullView1_spl04_Stack82">450.00 XPP</span>
<span id="FullView1_spl05_Stack82">550.00 XPP</span>
<span id="FullView1_spl06_Stack82">650.00 XPP</span>
Obviously, I don't need the first one (the stckFriendName span) to be detected. So that is the only problem.
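One way to sanity-check the pattern on its own is to run it against the id strings directly, outside BeautifulSoup. The sketch below is only an illustration: the ids are copied by hand from the HTML above, the names `span_ids` and `matching_ids` are made up, and the pattern is written with the capitalised `Stack82` as it appears in the markup. On this sample the stckFriendName id is not selected, which suggests the stray result may come from somewhere other than the regular expression itself.

```python
import re

# A hand-copied subset of the ids from the HTML in the question.
span_ids = [
    "FullView1_spl05_Stack_4",
    "FullView1_spl01_Stack32",
    "FullView1_spl02_Stack82",
    "FullView1_spl03_Stack82",
    "FullView1_spl06_Stack82",
    "FullView1_spl07_Stack22",
    "FullView1_spl204_stckFriendName",
]

# ^ and $ anchor the pattern so it has to cover the whole id,
# not just a substring of it.
pattern = re.compile(r"^FullView1_spl\d+_Stack82$")

# Keep only the ids the pattern accepts.
matching_ids = [span_id for span_id in span_ids if pattern.search(span_id)]
print(matching_ids)
```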
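You're almost there. You want to search for fooPack1_xpl, followed by digits, followed by _name11, so how about matching exactly that with a regular expression? Note that I just put a \d+ in the place where you expect the digits, together with the literal text you are searching for.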
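The idea can be sketched with the standard library alone. This is not the original answer's code: `html.parser` stands in for BeautifulSoup so the example runs without extra packages, and `SpanCollector`, `ID_PATTERN` and the `fooPack1_other` span are hypothetical names added for illustration. With BeautifulSoup, the same compiled pattern could presumably be passed as the `id=` argument to `findAll`, as in the question.

```python
import re
from html.parser import HTMLParser

# Two spans from the question plus one made-up span that should be skipped.
HTML = """
<span id="fooPack1_xpl01_name11">150.00 FTL</span>
<span id="fooPack1_xpl02_name11">350.00 FTL</span>
<span id="fooPack1_other">ignored</span>
"""

# Literal prefix, one or more digits, literal suffix.
ID_PATTERN = re.compile(r"fooPack1_xpl\d+_name11")


class SpanCollector(HTMLParser):
    """Collects the text of <span> elements whose id matches ID_PATTERN."""

    def __init__(self):
        super().__init__()
        self.inside_match = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        span_id = dict(attrs).get("id") or ""
        # fullmatch requires the whole id to fit the pattern.
        if tag == "span" and ID_PATTERN.fullmatch(span_id):
            self.inside_match = True

    def handle_data(self, data):
        # Only record text that sits inside a matching span.
        if self.inside_match:
            self.values.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "span":
            self.inside_match = False


collector = SpanCollector()
collector.feed(HTML)
print(collector.values)
```

If the middle part is not always numeric, `\d+` could be swapped for something looser such as `\w+`, at the cost of accepting more ids.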