Python: Searching for a single tag with BeautifulSoup

Posted 2024-09-23 08:09:47


I am trying to list the top ten news articles from the BBC's "Most Popular" section. My code is as follows:

from bs4 import BeautifulSoup, SoupStrainer
import urllib2  # Python 2 only; on Python 3 use urllib.request instead
import re

opener = urllib2.build_opener()

url = 'http://www.bbc.co.uk/news/popular/read'

soup = BeautifulSoup(opener.open(url), "lxml")

titleTag = soup.html.head.title

print(titleTag.string)

tagSpan = soup.find_all("span")

for tag in tagSpan:
    print(tag.get("class"))

What I am looking for is the string inside each <span class="most-popular-page-list-item__headline"></span> element.

How can I extract these strings and collect them into a list?


1 answer
Answer 1 · Posted 2024-09-23 08:09:47

How about this:

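from bs4 import BeautifulSoup
from urllib.request import urlopen

url = 'http://www.bbc.co.uk/news/popular/read'

# Download the page and parse it with the lxml parser
page = urlopen(url)
soup = BeautifulSoup(page, "lxml")

# Match only <span> tags with the headline class, then collect their text
titles = soup.find_all('span', class_="most-popular-page-list-item__headline")
headlines = [t.get_text(strip=True) for t in titles]
print(headlines)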
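The same filtering step can be tried offline, without fetching the live page. The sketch below is only an illustration: the SAMPLE_HTML markup and the extract_headlines helper are hypothetical (the real BBC page may be structured differently or may have changed), and it uses Python's built-in "html.parser" instead of "lxml" so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the live BBC page;
# the real page structure may differ or change over time.
SAMPLE_HTML = """
<ol>
  <li><span class="most-popular-page-list-item__rank">1</span>
      <span class="most-popular-page-list-item__headline">First headline</span></li>
  <li><span class="most-popular-page-list-item__rank">2</span>
      <span class="most-popular-page-list-item__headline"> Second headline </span></li>
</ol>
"""


def extract_headlines(html, css_class="most-popular-page-list-item__headline"):
    """Return the stripped text of every <span> that carries css_class.

    Hypothetical helper, written for this illustration only.
    """
    soup = BeautifulSoup(html, "html.parser")
    # class_ (with the trailing underscore) filters on the CSS class,
    # so spans with other classes, such as the rank, are skipped.
    spans = soup.find_all("span", class_=css_class)
    return [span.get_text(strip=True) for span in spans]


headlines = extract_headlines(SAMPLE_HTML)
print(headlines)
```

Passing the class to find_all means there is no need to loop over every span and inspect tag.get("class") by hand, as the code in the question does.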
