Extracting text from an HTML file with BeautifulSoup/Python

<li class="toclevel-1 tocsection-1"> <a href="#Baden-Württemberg">1 Baden-Württemberg </a> </li> <li class="toclevel-1 tocsection-2"> <a href="#Bayern"> 2 Bayern </a> </li> <li class="toclevel-1 tocsection-3"> <a href="#Berlin"> 3 Berlin </a> </li>

2 answers

User

Answer 1 · edited 2024-09-29 23:33:19

With a list comprehension, you can do the following:

names = soup.find_all("span",{"class":"toctext"})
print([x.text for x in names])
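For reference, here is a self-contained sketch of the same idea. The HTML in the question does not show any `span` elements, so this assumes the page uses Wikipedia-style table-of-contents markup in which each link wraps a `tocnumber` span and a `toctext` span. The `html` string and the `texts` variable are made up for illustration.

```python
from bs4 import BeautifulSoup

# Assumed markup: each TOC link contains a "tocnumber" span and a
# "toctext" span. These spans are not visible in the question's snippet.
html = """
<ul>
  <li class="toclevel-1 tocsection-1"><a href="#Baden-Württemberg"><span class="tocnumber">1</span> <span class="toctext">Baden-Württemberg</span></a></li>
  <li class="toclevel-1 tocsection-2"><a href="#Bayern"><span class="tocnumber">2</span> <span class="toctext">Bayern</span></a></li>
  <li class="toclevel-1 tocsection-3"><a href="#Berlin"><span class="tocnumber">3</span> <span class="toctext">Berlin</span></a></li>
</ul>
"""

# "html.parser" is the parser bundled with Python, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

# Collect every <span class="toctext"> and keep only its text.
names = soup.find_all("span", {"class": "toctext"})
texts = [x.text for x in names]
print(texts)
```

If the real page has no `toctext` spans, `find_all` simply returns an empty result and the printed list is empty.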

User

Answer 2 · edited 2024-09-29 23:33:19

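The find_all method returns a list (more precisely a ResultSet, which is a subclass of list). Iterate over it to get the text of each element: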

for name in names:
    print(name.text)

Output:

Baden-Württemberg
Bayern
Berlin
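If the markup really is as posted in the question, with the number and the name sitting directly inside each `<a>` tag and no `toctext` span, a search for `span.toctext` would find nothing. A possible workaround, sketched here under that assumption, is to read the link text and drop the leading number. The variable names `link_names`, `label`, and `number` are illustrative only.

```python
from bs4 import BeautifulSoup

# The snippet from the question, unchanged.
html = (
    '<li class="toclevel-1 tocsection-1"> <a href="#Baden-Württemberg">1 Baden-Württemberg </a> </li> '
    '<li class="toclevel-1 tocsection-2"> <a href="#Bayern"> 2 Bayern </a> </li> '
    '<li class="toclevel-1 tocsection-3"> <a href="#Berlin"> 3 Berlin </a> </li>'
)

soup = BeautifulSoup(html, "html.parser")

link_names = []
for link in soup.find_all("a"):
    # get_text(strip=True) trims the surrounding whitespace of the link text.
    label = link.get_text(strip=True)
    # Split off the first word; keep the remainder only if that word is a number.
    number, _, name = label.partition(" ")
    link_names.append(name.strip() if number.isdigit() else label)

for name in link_names:
    print(name)
```

This relies on each label starting with a plain integer followed by a space. Labels with nested numbering such as "1.2" would be kept unchanged by the `isdigit()` check.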

Python's built-in dir() and type() functions are always handy for inspecting an object:

print(dir(names))

[...,
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort',
 'source']
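The answer mentions type() but only demonstrates dir(). A minimal sketch of the type() side follows, assuming a small made-up `html` string with two `toctext` spans.

```python
from bs4 import BeautifulSoup

# Hypothetical two-item input, just enough to get a non-empty result.
html = '<span class="toctext">Bayern</span><span class="toctext">Berlin</span>'
soup = BeautifulSoup(html, "html.parser")
names = soup.find_all("span", {"class": "toctext"})

print(type(names).__name__)      # class of the container returned by find_all
print(isinstance(names, list))   # whether it supports ordinary list operations
print(type(names[0]).__name__)   # class of each individual match
```

With Beautiful Soup 4 this should report a `ResultSet` container holding `Tag` objects. The list methods in the dir() listing above (`append`, `sort`, and so on) come from `list`, while the extra `source` attribute belongs to `ResultSet`.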
