Using BeautifulSoup to access the available bikes in DC bikeshare

<station>
  <id>1</id>
  <name>15th &amp; S Eads St</name>
  <terminalName>31000</terminalName>
  <lastCommWithServer>1460217337648</lastCommWithServer>
  <lat>38.858662</lat>
  <long>-77.053199</long>
  <installed>true</installed>
  <locked>false</locked>
  <installDate>0</installDate>
  <removalDate/>
  <temporary>false</temporary>
  <public>true</public>
  <nbBikes>7</nbBikes>
  <nbEmptyDocks>8</nbEmptyDocks>
  <latestUpdateTime>1460192501598</latestUpdateTime>
</station>

# bikeShareParse.py - parses the capital bikeshare info page
import bs4, requests

url = "https://www.capitalbikeshare.com/data/stations/bikeStations.xml"
res = requests.get(url)
res.raise_for_status()

# create the soup element from the file
soup = bs4.BeautifulSoup("res.text", "lxml")

# defines the part of the page we are looking for
nbikes = soup.select('#text')

# limits number of results for testing
numOpen = 5
for i in range(numOpen):
    print nbikes

1 answer

User

#1 · Posted 2024-06-25 06:37:53

This script builds a list of (station_ID, bikes_remaining) pairs. It was adapted from: http://www.plotsofdots.com/archives/68

# Adapted from http://www.plotsofdots.com/archives/68

import urllib.request
import xml.etree.ElementTree as ET

# Download the station feed and parse it as XML
site = 'https://www.capitalbikeshare.com/data/stations/bikeStations.xml'
with urllib.request.urlopen(site) as response:
    doc = ET.parse(response)

# Get the root element; the <station> entries are looked up beneath it
root = doc.getroot()

# Lists for the station IDs and the number of available bikes
sID = []
embikes = []

# Walk every <station> element and pull out the fields we need
for station in root.findall('station'):
    sID.append(station.find('id').text)
    embikes.append(int(station.find('nbBikes').text))

# Uncomment to check that the extraction worked
# print(embikes)
# print(sID)

# Pair each station ID with its bike count as a list of tuples
prov = list(zip(sID, embikes))

print(prov[0])
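
If you would rather look up a station by its ID than index into a list, the same extraction can feed a dictionary. The following is only a minimal sketch and not part of the original script: it parses a small inline sample instead of the live feed so it runs without a network connection. The `<stations>` wrapper element, the second station, and the helper name `bikes_by_station` are assumptions made up for illustration; station 1 reuses the values from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample shaped like the feed in the question.
# The <stations> wrapper and station 2 are invented for illustration.
SAMPLE_XML = """<stations>
  <station>
    <id>1</id>
    <name>15th &amp; S Eads St</name>
    <nbBikes>7</nbBikes>
    <nbEmptyDocks>8</nbEmptyDocks>
  </station>
  <station>
    <id>2</id>
    <name>Example St</name>
    <nbBikes>3</nbBikes>
    <nbEmptyDocks>12</nbEmptyDocks>
  </station>
</stations>"""


def bikes_by_station(xml_text):
    """Return {station id: bikes available} for every <station> element."""
    root = ET.fromstring(xml_text)
    # iter() finds <station> elements at any depth, so the name of the
    # wrapper element does not matter here.
    return {
        station.findtext("id"): int(station.findtext("nbBikes"))
        for station in root.iter("station")
    }


available = bikes_by_station(SAMPLE_XML)
print(available["1"])  # prints 7
```

The IDs stay as strings because that is how they come out of the XML; wrap `station.findtext("id")` in `int()` if you want numeric keys.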
