I can scrape some data from a website, but I'm having trouble breaking it up so that it displays as a table.
The code I'm using is:
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = 'https://www.basketball-reference.com/leagues/NBA_2018_games.html'
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
tablesright = soup.find_all('td', 'right')
tablesleft = soup.find_all('td', 'left')
print(tablesright + tablesleft)
The result looks like this:
====================== RESTART: E:/2017/Python2/box2.py ======================
[<td class="right " data-stat="game_start_time">8:01 pm</td>, <td class="right " data-stat="visitor_pts">99</td>, <td class="right " data-stat="home_pts">102</td>, <td class="right " data-stat="game_start_time">10:30 pm</td>, <td class="right " data-stat="visitor_pts">122</td>, <td class="right " data-stat="home_pts">121</td>, <td class="right " data-stat="game_start_time">7:30 pm</td>, <td class="right " data-stat="visitor_pts">108</td>, <td class="right " data-stat="home_pts">100</td>, <td class="right " data-stat="game_start_time">8:30 pm</td>, <td class="right " data-stat="visitor_pts">117</td>, <td class="right " data-stat="home_pts">111</td>, <td class="right " data-stat="game_start_time">7:00 pm</td>, <td class="right " data-stat="visitor_pts">90</td>, <td class="right " data-stat="home_pts">102</td>, <
The left part:
<td class="left " csk="BOS.201710170CLE" data-stat="visitor_team_name"><a href="/teams/BOS/2018.html">Boston Celtics</a></td>, <td class="left " csk="CLE.201710170CLE" data-stat="home_team_name"><a href="/teams/CLE/2018.html">Cleveland Cavaliers</a></td>, <td class="left " data-stat="game_remarks"></td>, <td class="left " csk="HOU.201710170GSW" data-stat="visitor_team_name"><a href="/teams/HOU/2018.html">Houston Rockets</a></td>, <td class="left " csk="GSW.201710170GSW" data-stat="home_team_name"><a href="/teams/GSW/2018.html">Golden State Warriors</a></td>, <td class="left " data-stat="game_remarks"></td>, <td class="left " csk="MIL.201710180BOS" data-stat="visitor_team_name"><a href="/teams/MIL/2018.html">Milwaukee Bucks</a></td>, <td class="left " csk="BOS.201710180BOS" data-stat="home_team_name"><a href="/teams/BOS/2018.html">Boston Celtics</a></td>, <td class="left " data-stat="game_remarks"></td>, <td class="left " csk="ATL.201710180DAL" data-
OK, so now I can't work out how to break up the results so that I end up with a nice table like this:
Game start time Home team. Score. Away team. Score
7pm. Boston. 104. Golden state. 103
I'm pulling my hair out trying to figure this out.
Ta, thanks in advance.
This should do it. Adjust it to your needs, then use pandas.
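A minimal sketch of one way to go from the scraped cells to pandas. It assumes (this is not visible in the output above) that the cells for one game share a `<tr>`, and it uses a small inline sample instead of the live page; `SAMPLE_HTML`, `COLUMNS`, and `rows_to_frame` are names made up for the illustration.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Two rows shaped like the scraped output in the question (hypothetical sample,
# assuming the cells of one game sit inside the same <tr>).
SAMPLE_HTML = """
<table>
<tr>
<td class="right " data-stat="game_start_time">8:01 pm</td>
<td class="left " data-stat="visitor_team_name"><a href="/teams/BOS/2018.html">Boston Celtics</a></td>
<td class="right " data-stat="visitor_pts">99</td>
<td class="left " data-stat="home_team_name"><a href="/teams/CLE/2018.html">Cleveland Cavaliers</a></td>
<td class="right " data-stat="home_pts">102</td>
<td class="left " data-stat="game_remarks"></td>
</tr>
<tr>
<td class="right " data-stat="game_start_time">10:30 pm</td>
<td class="left " data-stat="visitor_team_name"><a href="/teams/HOU/2018.html">Houston Rockets</a></td>
<td class="right " data-stat="visitor_pts">122</td>
<td class="left " data-stat="home_team_name"><a href="/teams/GSW/2018.html">Golden State Warriors</a></td>
<td class="right " data-stat="home_pts">121</td>
<td class="left " data-stat="game_remarks"></td>
</tr>
</table>
"""

# Column order requested in the question: start time, home, score, away, score.
COLUMNS = ["game_start_time", "home_team_name", "home_pts",
           "visitor_team_name", "visitor_pts"]


def rows_to_frame(html):
    """Build one DataFrame row per <tr>, keyed by each cell's data-stat."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for tr in soup.find_all("tr"):
        cells = {td.get("data-stat"): td.get_text(strip=True)
                 for td in tr.find_all("td")}
        # Skip header or separator rows that lack the expected cells.
        if all(name in cells for name in COLUMNS):
            records.append({name: cells[name] for name in COLUMNS})
    return pd.DataFrame(records, columns=COLUMNS)


df = rows_to_frame(SAMPLE_HTML)
print(df.to_string(index=False))
```

For the real page, `rows_to_frame(r.text)` would be the equivalent call, provided the row structure matches the assumption above.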
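I don't know whether you need a pandas solution; here is one without it that uses only the more advanced attrs keyword and standard Python format to produce a formatted table. Note that the numbers in format are chosen by hand and do not adjust to the actual data.
You could also try reading the page into a DataFrame instead of using an HTML parser, and then decide how to manipulate that DataFrame to display the result you want.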
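For the pandas-free approach (the attrs keyword plus format), a minimal sketch might look like the following. It runs on a small inline sample rather than the live page, and it assumes every game has all five cells so the per-column lists line up; `SAMPLE_HTML`, `ROW_FORMAT`, `cell_texts`, and `format_games` are hypothetical names.

```python
from bs4 import BeautifulSoup

# Hypothetical sample shaped like the scraped cells in the question.
SAMPLE_HTML = """
<td class="right " data-stat="game_start_time">8:01 pm</td>
<td class="left " data-stat="visitor_team_name"><a href="/teams/BOS/2018.html">Boston Celtics</a></td>
<td class="right " data-stat="visitor_pts">99</td>
<td class="left " data-stat="home_team_name"><a href="/teams/CLE/2018.html">Cleveland Cavaliers</a></td>
<td class="right " data-stat="home_pts">102</td>
<td class="right " data-stat="game_start_time">10:30 pm</td>
<td class="left " data-stat="visitor_team_name"><a href="/teams/HOU/2018.html">Houston Rockets</a></td>
<td class="right " data-stat="visitor_pts">122</td>
<td class="left " data-stat="home_team_name"><a href="/teams/GSW/2018.html">Golden State Warriors</a></td>
<td class="right " data-stat="home_pts">121</td>
"""

# Column widths are picked by hand and do not adapt to the data.
ROW_FORMAT = "{:<16}{:<24}{:>6}  {:<24}{:>6}"

STATS = ("game_start_time", "home_team_name", "home_pts",
         "visitor_team_name", "visitor_pts")


def cell_texts(soup, stat):
    """Return the text of every <td> whose data-stat attribute equals stat."""
    return [td.get_text(strip=True)
            for td in soup.find_all("td", attrs={"data-stat": stat})]


def format_games(html):
    """Return a header line plus one formatted line per game."""
    soup = BeautifulSoup(html, "html.parser")
    columns = [cell_texts(soup, stat) for stat in STATS]
    lines = [ROW_FORMAT.format("Start time", "Home team", "Score",
                               "Away team", "Score")]
    # zip pairs the n-th entry of each column; this relies on document order
    # and on every game having all five cells.
    for row in zip(*columns):
        lines.append(ROW_FORMAT.format(*row))
    return lines


lines = format_games(SAMPLE_HTML)
print("\n".join(lines))
```

Selecting by `data-stat` instead of by the `left`/`right` class keeps each column separate, which is what makes the row-by-row `zip` possible.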
Example:
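A minimal sketch of the DataFrame route using `pandas.read_html`, run on an inline sample table rather than the live page. The header names in `SAMPLE_HTML` are invented for the illustration and will not match the real page's headers; `games_frame` is likewise a hypothetical helper.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample table; the column headers are made up for this sketch.
SAMPLE_HTML = """<table>
<thead><tr><th>Start</th><th>Visitor</th><th>Visitor PTS</th><th>Home</th><th>Home PTS</th></tr></thead>
<tbody>
<tr><td>8:01 pm</td><td>Boston Celtics</td><td>99</td><td>Cleveland Cavaliers</td><td>102</td></tr>
<tr><td>10:30 pm</td><td>Houston Rockets</td><td>122</td><td>Golden State Warriors</td><td>121</td></tr>
</tbody></table>"""


def games_frame(html):
    """Parse the first <table> in html and reorder its columns."""
    # read_html returns a list with one DataFrame per <table> it finds.
    tables = pd.read_html(StringIO(html))
    games = tables[0]
    # Reorder to: start time, home team, score, away team, score.
    return games[["Start", "Home", "Home PTS", "Visitor", "Visitor PTS"]]


games = games_frame(SAMPLE_HTML)
print(games.to_string(index=False))
```

`read_html` needs an HTML parser backend installed (lxml, or bs4 together with html5lib). For the live page, the text returned by `requests.get(url)` could be wrapped in `StringIO` the same way, with the column selection adjusted to whatever headers the page actually uses.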
Examples of how to do this are given in the pandas documentation and in many Stack Overflow questions. Source: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html