<p>You can use <code>selenium</code> to load the page, then use <code>BeautifulSoup</code> to find the player attributes:</p>
<pre><code>from selenium import webdriver
from bs4 import BeautifulSoup as soup
import re
import collections
player = collections.namedtuple('player', ['name', 'position', 'stats'])
# Path to a local chromedriver binary; newer Selenium versions may require passing it via a Service object
d = webdriver.Chrome('/Users/jamespetullo/Downloads/chromedriver')
d.get('http://www.espn.com/mlb/boxscore?gameId=370403101')
# Parse the rendered page once and reuse the tree for both lookups
h = soup(d.page_source, 'lxml')
player_names = iter([b.text for b in h.find_all('td', {'class':'name'})])
full_stats = [i.text for i in h.find_all('td', {'class':re.compile('batting-stats')})]
# Each batter has 11 stat cells, so slice the flat list into rows of 11
final_results = {next(player_names):full_stats[i:i+11] for i in range(0, len(full_stats), 11)}
# Split the trailing position code off each name; use 'N/A' when there is none
final_players = [player(*[re.sub(r'[A-Z\d\-\s\(\),]+$', '', a), (lambda x:'N/A' if not x else x[0])(re.findall(r'[A-Z\d\-\s\(\),]+$', a)), b]) for a, b in final_results.items()]
</code></pre>
<p>Output:</p>
^{pr2}$
<p>The result also includes the full stats for <code>"D. Travis"</code>:</p>
<pre><code>[u'2-6', u'6', u'0', u'2', u'0', u'0', u'2', u'16', u'.333', u'.333', u'.333']
</code></pre>
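<p>To check the parsing logic without launching a browser, the slicing and name-splitting steps can be exercised on plain lists. This is only a minimal sketch: <code>split_name</code> and <code>build_players</code> are hypothetical helper names, the cell text <code>'D. Travis 2B'</code> is an assumed format (name followed by a position code), and the position is stripped of whitespace here, which the one-liner above does not do.</p>

```python
import collections
import re

Player = collections.namedtuple('Player', ['name', 'position', 'stats'])

STATS_PER_PLAYER = 11  # each batter row above has 11 stat cells
# Same character class as the regex in the answer: a trailing run of
# uppercase letters, digits, hyphens, whitespace, parentheses and commas.
TRAILING = re.compile(r'[A-Z\d\-\s(),]+$')


def split_name(cell_text):
    """Split a name cell into (name, position); position is 'N/A' if absent."""
    match = TRAILING.search(cell_text)
    if match is None:
        return cell_text, 'N/A'
    return cell_text[:match.start()], match.group().strip()


def build_players(names, flat_stats, width=STATS_PER_PLAYER):
    """Pair each name with its consecutive slice of `width` stat cells."""
    rows = [flat_stats[i:i + width] for i in range(0, len(flat_stats), width)]
    return [Player(*split_name(n), stats=row) for n, row in zip(names, rows)]


# Assumed input: the name cell text is a guess at the page's format;
# the stat values are the ones shown for "D. Travis" above.
names = ['D. Travis 2B']
flat_stats = ['2-6', '6', '0', '2', '0', '0', '2', '16', '.333', '.333', '.333']

players = build_players(names, flat_stats)
print(players[0].name, players[0].position, players[0].stats)
```

<p>Using <code>zip</code> rather than <code>next()</code> on an iterator means a mismatch between the number of names and the number of stat rows silently truncates instead of raising <code>StopIteration</code>; whether that is preferable depends on how strictly you want to validate the page.</p>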