Getting data from a sports table using Python and Beautiful Soup

    import urllib2
    from bs4 import BeautifulSoup

    page = 'http://stats.nba.com/leagueTeamGeneral.html?pageNo=1&rowsPerPage=30'
    page = urllib2.urlopen(page)
    soup = BeautifulSoup(page)
    for dS in soup.find_all(???):
        print(dS.get(???))

2 Answers

User

#1 · Edited 2024-06-26 13:40:29

Thanks for the suggestion, it worked well. This is what I ended up using:

    import json
    from pprint import pprint

    # Load the JSON response previously saved to disk
    with open('NBA_DATA.json') as data_file:
        data = json.load(data_file)

    # Debug output: show the raw result sets
    pprint(data["resultSets"])

    for hed in data["resultSets"]:
        # Column names and the matching data rows for this result set
        s1 = hed["headers"]
        s2 = hed["rowSet"]
        # More debugging
        #pprint(hed["headers"])
        #pprint(hed["rowSet"])
        list_of_s1 = list(hed["headers"])
        list_of_s2 = list(hed["rowSet"])
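
Since each result set keeps its column names in "headers" and its data in "rowSet", one possible next step is to pair them so every row can be read by column name. The following is only a sketch: `rows_as_dicts` is a hypothetical helper, and the sample data (team names, column names, numbers) is made up for illustration rather than taken from the real feed.

```python
def rows_as_dicts(result_set):
    """Pair each row in "rowSet" with the column names in "headers"."""
    headers = result_set["headers"]
    return [dict(zip(headers, row)) for row in result_set["rowSet"]]


# Made-up sample with the same shape as the data loaded above (assumption)
sample = {
    "resultSets": [
        {
            "headers": ["TEAM_NAME", "W", "L"],
            "rowSet": [["Team A", 10, 2], ["Team B", 7, 5]],
        }
    ]
}

for result_set in sample["resultSets"]:
    for record in rows_as_dicts(result_set):
        # Each record is now a dict keyed by column name
        print(record["TEAM_NAME"], record["W"], record["L"])
```

With real data, `sample` would be replaced by the `data` object returned by `json.load`.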

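User

#2 · Edited 2024-06-26 13:40:29

Use a tool such as Firebug for Firefox to trace the requests the page makes. If you look at the link you shared in Firebug's "Net" tab, you will see that the data you are after comes from a follow-up request to http://www.nba.com/cmsinclude/desktopWrapperHeader_jsonp.html, and that response actually contains JSON data. I'm not sure BeautifulSoup is much help here; try loading it with Python's json module instead.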
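
A rough sketch of that idea, in case it helps. The "_jsonp" in the URL suggests the body may be wrapped in a callback call such as `callback({...});`, but that is an assumption, so the hypothetical helper `parse_jsonp` below handles both plain JSON and a wrapped body. The sample string is invented for illustration and is not the real response.

```python
import json


def parse_jsonp(text):
    """Parse a response body that is either plain JSON or JSONP."""
    text = text.strip()
    # Plain JSON starts with an object or array, so parse it directly
    if text.startswith(("{", "[")):
        return json.loads(text)
    # Otherwise assume a JSONP wrapper of the form: name( ... );
    start = text.index("(") + 1
    end = text.rindex(")")
    return json.loads(text[start:end])


# Invented sample body (assumption about the response shape)
body = 'callback({"resultSets": [{"headers": ["TEAM", "W"], "rowSet": [["A", 1]]}]});'
data = parse_jsonp(body)
print(data["resultSets"][0]["headers"])
```

In practice `body` would be the text you get back from fetching the URL seen in the Net tab.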
