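I am trying to extract the company description from this page: https://angel.co/company/sensor-tower but BeautifulSoup returns the text of the whole page.
I tried
desc = soup.find('div', class_="content").get_text().strip()
which works for other pages on the site, but on this page it returns all of the text.
The expected output is: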
Sensor Tower is a comprehensive mobile market intelligence platform that delivers crucial insights into the global app economy. Our flagship Store Intelligence product is an enterprise level offering that provides high-accuracy, worldwide app download and revenue estimates for Apple's App Store and Google Play.
Our best-of-class research interface, which seamlessly integrates across our Store Intelligence, Ad Intelligence, and App Intelligence products, is utilized by executives and analysts alike to drive key business decisions. Our products are counted on by the app world's largest publishers, Fortune 500 companies, and financial institutions to surface emerging market trends, benchmark performance, and grow app businesses at enterprise scale.
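There are two div tags on that page with the class content. One of them (line 590 in my copy) contains a lot of content, while the other (line 620 in my copy) contains only the description you are looking for. BeautifulSoup's find() returns only the first match, which here is the large one. You may have better luck with
find("div", class_="product_desc")
which seems to select the element you want.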
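To illustrate why find() picks up the wrong element, here is a minimal sketch. The HTML below is a made-up stand-in for the real page, not a copy of it: it assumes the description sits in a div.content nested inside a div.product_desc, and the first div.content simply holds unrelated text.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page (assumed structure, not the real markup):
# the first div.content holds unrelated text, the second holds the description.
html = """
<div class="content">
  <div class="header">Sensor Tower</div>
  <div class="nav">Overview Jobs</div>
</div>
<div class="product_desc">
  <div class="content">Sensor Tower is a comprehensive mobile market intelligence platform.</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns only the first matching element in document order.
first_content = soup.find("div", class_="content")

# find_all() shows that more than one element carries the class.
all_content = soup.find_all("div", class_="content")
print(len(all_content))

# Selecting by the more specific class targets the description.
# find() returns None when nothing matches, so check before calling get_text().
node = soup.find("div", class_="product_desc")
desc = node.get_text().strip() if node is not None else None
print(desc)
```

If the real page uses a different class on the description block, the same check with find_all() should show which elements match before you settle on a selector.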