BeautifulSoup: getting the output elements into a list

import requests
from bs4 import BeautifulSoup

url = 'https://www.planetware.com/tourist-attractions-/oslo-n-osl-oslo.htm'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'html.parser')
attraction_place = soup.find_all('h2', class_="sitename")
for attraction in attraction_place:
    print(attraction.text)
    type(attraction)

1 Vigeland Sculpture Park
2 Akershus Fortress
3 Viking Ship Museum
4 The National Museum
5 Munch Museum
6 Royal Palace
7 The Museum of Cultural History
8 Fram Museum
9 Holmenkollen Ski Jump and Museum
10 Oslo Cathedral
11 City Hall (Rådhuset)
12 Aker Brygge
13 Natural History Museum & Botanical Gardens
14 Oslo Opera House and Annual Music Festivals
Where to Stay in Oslo for Sightseeing
Tips and Tours: How to Make the Most of Your Visit to Oslo
More Related Articles on PlanetWare.com

3 Answers

User

#1 · Edited 2024-06-26 00:07:02

new = []
# Keep only the first 14 headings (the numbered attractions)
for attraction in attraction_place[:14]:
    new.append(attraction.text)

User

#2 · Edited 2024-06-26 00:07:02

You can use slicing.

for attraction in attraction_place[:14]:
    print(attraction.text)
    type(attraction)
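
If hard-coding 14 feels brittle, one alternative is to keep only the headings that start with a number. The following is a minimal sketch, not part of the original answer: it assumes the heading texts look like the output in the question, and the `headings` sample and the `numbered_names` helper are hypothetical names introduced for illustration.

```python
import re

# Hypothetical sample of heading texts, shaped like the output in the question.
headings = [
    "1 Vigeland Sculpture Park",
    "2 Akershus Fortress",
    "14 Oslo Opera House and Annual Music Festivals",
    "Where to Stay in Oslo for Sightseeing",
    "More Related Articles on PlanetWare.com",
]

# A leading number, then whitespace, then the name (captured without trailing spaces).
NUMBERED = re.compile(r"^\s*\d+\s+(.*\S)\s*$")


def numbered_names(texts):
    """Return the names of entries that start with a number, with the number removed."""
    names = []
    for text in texts:
        match = NUMBERED.match(text)
        if match:
            names.append(match.group(1))
    return names


attractions = numbered_names(headings)
print(attractions)
```

With the scraped page, `numbered_names([a.text for a in attraction_place])` would play the same role as the slice, assuming every attraction heading keeps its leading number.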

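User

#3 · Edited 2024-06-26 00:07:02

A very simple approach is to read the alt attribute of the photos. That gives clean text output with only the 14 entries, with no need for slicing or indexing.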

from bs4 import BeautifulSoup
import requests

r = requests.get('https://www.planetware.com/tourist-attractions-/oslo-n-osl-oslo.htm')
soup = BeautifulSoup(r.content, 'lxml')
attractions = [item['alt'] for item in soup.select('.photo [alt]')]
print(attractions)
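
To see what the `.photo [alt]` selector matches without hitting the network, here is a small offline sketch. The HTML below is hypothetical markup made up for illustration; the real page structure is an assumption and may differ. It uses the built-in `html.parser` so that `lxml` is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (assumed, not copied from the real page).
html = """
<div class="photo"><img src="a.jpg" alt="Vigeland Sculpture Park"></div>
<div class="photo"><img src="b.jpg" alt="Akershus Fortress"></div>
<div class="banner"><img src="ad.jpg" alt="Advertisement"></div>
<div class="photo"><img src="c.jpg"></div>
"""

soup = BeautifulSoup(html, "html.parser")

# ".photo [alt]" selects any descendant of a .photo element that has an alt attribute,
# so images outside .photo and images without alt are skipped.
alts = [tag["alt"] for tag in soup.select(".photo [alt]")]
print(alts)
```

Because the selector only matches elements that actually carry an `alt` attribute, indexing `tag["alt"]` should not raise a `KeyError` here.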
