Python loop for web scraping

import csv
import requests
from bs4 import BeautifulSoup

page = requests.get("http://www.realcommercial.com.au/sold/property-offices-retail-showrooms+bulky+goods-land+development-hotel+leisure+medical+consulting-other-in-wa/list-1?includePropertiesWithin=includesurrounding&activeSort=list-date&autoSuggest=true")
soup = BeautifulSoup(page.content, 'html.parser')
Address_1 = soup.find('p', attrs={'class': 'details-panel__address'})
Address = Address_1.text.strip()

<p class="details-panel__address" data-reactid="90">GF 255 Adelaide TerracePerth, WA 6000</p>,
369-371 Oxford StreetMount Hawthorn, WA 6016,
<p class="details-panel__address" data-reactid="148">2 Lloyd Street Midland, WA 6056</p>,
<p class="details-panel__address" data-reactid="172">Bluenote Building, 16/162 Colin StreetWest Perth, WA 6005</p>,
<p class="details-panel__address" data-reactid="196">Bluenote Building, 10/162 Colin StreetWest Perth, WA 6005</p>

1 Answer

User

#1 · Posted on 2024-09-28 01:23:55

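soup.find returns only the first matching element, whereas soup.find_all returns a list of all matching elements. To get the text of every match, iterate over that list and read each element's text attribute.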
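The difference between find and find_all can be seen without touching the network. The following is a minimal sketch on a hand-written HTML snippet; the snippet is an assumption made for illustration (two addresses borrowed from the question, wrapped in the same class name), not the real page markup.

```python
from bs4 import BeautifulSoup

# Hand-written sample markup (an assumption, not the real page source)
html = """
<p class="details-panel__address">2 Lloyd Street Midland, WA 6056</p>
<p class="details-panel__address">16/162 Colin Street West Perth, WA 6005</p>
"""
soup = BeautifulSoup(html, "html.parser")

# find: a single Tag (the first match), so .text works directly
first = soup.find("p", attrs={"class": "details-panel__address"})
first_text = first.text.strip()

# find_all: a list of Tags, so .text must be read per element
all_tags = soup.find_all("p", attrs={"class": "details-panel__address"})
all_texts = [tag.text.strip() for tag in all_tags]

print(first_text)
print(all_texts)
```

Calling .text on the result of find_all itself would fail, because the result is a list of tags rather than a single tag; the list comprehension is what performs the loop.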

import requests
from bs4 import BeautifulSoup

# Keep the URL on one line: a line break inside the string would become part of the request URL
url = "http://www.realcommercial.com.au/sold/property-offices-retail-showrooms+bulky+goods-land+development-hotel+leisure+medical+consulting-other-in-wa/list-1?includePropertiesWithin=includesurrounding&activeSort=list-date&autoSuggest=true"
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')

# find_all returns every matching <p>, not just the first one
Address_1 = soup.find_all('p', attrs={'class': 'details-panel__address'})
address_list = [address.text.strip() for address in Address_1]
print(address_list)

# Collect the link of each listing panel
links = soup.find_all('a', attrs={'class': 'details-panel'})
hrefs = [link['href'] for link in links]
print(hrefs)
# Now iterate through the list of urls and extract the required data
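The last step, iterating over the collected links, might look roughly like the sketch below. Several things here are assumptions rather than facts about the site: that the hrefs are relative to the site root, that each detail page reuses the details-panel__address class, and the helper names absolute_urls, extract_address and scrape_details, which are hypothetical and introduced only for illustration.

```python
import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Site root taken from the URL in the question
BASE_URL = "http://www.realcommercial.com.au"


def absolute_urls(hrefs, base_url=BASE_URL):
    """Resolve relative hrefs against the site root; absolute URLs pass through."""
    return [urljoin(base_url, href) for href in hrefs]


def extract_address(html, css_class="details-panel__address"):
    """Return the stripped text of the first matching <p>, or None if absent.

    Assumption: the detail page uses the same class name as the list page.
    """
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("p", attrs={"class": css_class})
    return tag.get_text(strip=True) if tag else None


def scrape_details(hrefs, delay_seconds=1.0):
    """Fetch each linked page in turn and collect one record per page."""
    results = []
    for url in absolute_urls(hrefs):
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # stop on HTTP errors instead of parsing an error page
        results.append({"url": url, "address": extract_address(response.text)})
        time.sleep(delay_seconds)  # pause between requests to avoid hammering the server
    return results
```

With the hrefs list from the answer, the loop would be started with scrape_details(hrefs). Keeping the parsing in extract_address, separate from the fetching, makes it possible to try the selector on a saved HTML string before running it against the live site.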
