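I'm trying to parse HTML content from a site with BS4. I can get my HTML fragment, but I need to strip every attribute from the tags: classes, IDs, styles, and so on.
For example: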
<div class="applinks">
<div class="appbuttons">
<a href="https://geo.itunes.apple.com/ru/app/cloud-hub-file-manager-document/id972238010?mt=8&at=11l3Ss" rel="nofollow" target="_blank" title="Cloud Hub - File Manager, Document Reader, Clouds Browser and Download Manager">Загрузить</a>
<span onmouseout="jQuery('.wpappbox-8429dd98d1602dec9a9fc989204dbf7c .qrcode').hide();" onmouseover="jQuery('.wpappbox-8429dd98d1602dec9a9fc989204dbf7c .qrcode').show();">QR-Code</span>
</div>
</div>
I need to get the same markup with every attribute removed, leaving only the bare tags and their text.
My code:
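# coding: utf-8
import requests
from bs4 import BeautifulSoup

url = "https://lifehacker.ru/2016/08/29/app-store-29-august-2016/"
r = requests.get(url)
# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(r.content, "html.parser")
# attrs should be a dict ({"class": ...}), not a set ({"class", ...})
post_content = soup.find("div", {"class": "post-content"})
print(post_content)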
How do I remove all tag attributes?
To remove all attributes from the tags in the scraped data, do the following:
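The answer's original code is not preserved here, so what follows is only a minimal sketch of one common way to do this with Beautiful Soup: visit every tag with `find_all(True)` and replace its `attrs` dictionary with an empty one. The helper name `strip_attributes` and the shortened sample markup (including the `example.com` link) are assumptions made for illustration, not part of the original post.

```python
from bs4 import BeautifulSoup


def strip_attributes(html):
    """Return the markup with every attribute removed from every tag.

    Hypothetical helper for illustration; the name is not from the original post.
    """
    soup = BeautifulSoup(html, "html.parser")
    # find_all(True) matches every tag in the tree, regardless of its name
    for tag in soup.find_all(True):
        # Each tag keeps its attributes in a dict; emptying it drops them all
        tag.attrs = {}
    return str(soup)


# Shortened stand-in for the fragment in the question (URL is a placeholder)
sample = (
    '<div class="applinks"><div class="appbuttons">'
    '<a href="https://example.com/app" rel="nofollow" target="_blank" '
    'title="Example">Загрузить</a>'
    '<span onmouseout="hide()" onmouseover="show()">QR-Code</span>'
    '</div></div>'
)

cleaned = strip_attributes(sample)
print(cleaned)
```

With the code from the question, the same loop could probably be run directly on `post_content` instead of re-parsing a string. Note that `find_all` only returns descendants, so `post_content.attrs = {}` would also be needed to clear the attributes on the outer `div` itself.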