I want to scrape the text of every intr-txt span inside the div with class 'js-interests-list-wrap js-interests-board js-wrap' on this page: https://badoo.com/profile/0266965187. I wrote this code:
from bs4 import BeautifulSoup
# html holds the downloaded page source
parsed_html = BeautifulSoup(html, 'html.parser')
interese = parsed_html.find_all('div', {'class':'js-interests-list-wrap js-interests-board js-wrap'})
The actual output is:
<div class="js-interests-list-wrap js-interests-board js-wrap" data-interests-type="hon-all"> <span class="js-intr intr"> <i class="intr-ico intr-ico--fashion"></i><!-- --><span class="intr-txt">Fotografii de modă</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--food"></i><!-- --><span class="intr-txt">Mâncare de casă</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--fashion"></i><!-- --><span class="intr-txt">Reviste</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--food"></i><!-- --><span class="intr-txt">Gătitul</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--food"></i><!-- --><span class="intr-txt">Să ies în oraș la masă</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--travel"></i><!-- --><span class="intr-txt">Călătorii în lume</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--hobby"></i><!-- --><span class="intr-txt">Fotografii</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--music"></i><!-- --><span class="intr-txt">Dance</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--hobby"></i><!-- --><span class="intr-txt">Fericire</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--other"></i><!-- --><span class="intr-txt">Dansul</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--hobby"></i><!-- --><span class="intr-txt">Shopping</span> </span><span class="js-intr intr"> <i class="intr-ico intr-ico--hobby"></i><!-- --><span class="intr-txt">Munți</span> </span> <div class="btn btn--sm btn--white btn--ico"><i class="icon ico--etc"><a href="https://badoo.com/signup/" class="b-link"></a></i></div> </div>
I only need the text, output in the following format:
Fotografii de modă,
Să dorm cu cineva în brațe,
Munți,
Photography,
Reviste,
Shopping,
Dance,
Fotografii,
Dansul,
Fericire,
Bucătărie,
Gătitul
How can I extract it like this?
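You have already retrieved all the div tags you need. Now extract the span tags you want from them, then use get_text to print their text. Adding that step to your code gives the output you want.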
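The step described above could be sketched roughly as follows. This is only an illustration under assumptions: the `html` string is a shortened stand-in for the real page source, `extract_interests` is a hypothetical helper name, and the comma-plus-newline joining is a guess at the format asked for in the question.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the page source (assumption); the real `html`
# would be the downloaded profile page.
html = """
<div class="js-interests-list-wrap js-interests-board js-wrap" data-interests-type="hon-all">
  <span class="js-intr intr"><i class="intr-ico intr-ico--fashion"></i><!-- --><span class="intr-txt">Fotografii de modă</span></span>
  <span class="js-intr intr"><i class="intr-ico intr-ico--food"></i><!-- --><span class="intr-txt">Gătitul</span></span>
  <span class="js-intr intr"><i class="intr-ico intr-ico--hobby"></i><!-- --><span class="intr-txt">Munți</span></span>
</div>
"""


def extract_interests(soup):
    """Hypothetical helper: collect the text of every intr-txt span in the interests div."""
    texts = []
    # Same div lookup as in the question.
    divs = soup.find_all("div", {"class": "js-interests-list-wrap js-interests-board js-wrap"})
    for div in divs:
        # Only the inner spans carry the visible label.
        for span in div.find_all("span", class_="intr-txt"):
            texts.append(span.get_text(strip=True))
    return texts


parsed_html = BeautifulSoup(html, "html.parser")
interests = extract_interests(parsed_html)

# One interest per line, separated by commas.
print(",\n".join(interests))
```

Searching for the spans inside each matched div, rather than across the whole document, keeps the result limited to the interests block even if intr-txt appears elsewhere on the page.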