<p>Welcome. Here is another approach: the script iterates over all the table pages and collects the data from each one.</p>
<pre><code>import requests
import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver

df_list = []
url = 'https://www.cartolafcbrasil.com.br/scouts/cartola-fc-2018/rodada-1'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')
table = soup.find_all('table')[0]
# read_html returns a list of DataFrames; keep the first table of page 1
df_list.append(pd.read_html(str(table), encoding="UTF-8")[0])

driver = webdriver.PhantomJS(executable_path='C:\\Python27\\phantomjs-2.1.1-windows\\bin\\phantomjs')
driver.get('https://www.cartolafcbrasil.com.br/scouts/cartola-fc-2018/rodada-1')
# get the number of pages and iterate over each of them
numberOfPage = driver.find_element_by_xpath("(//tr[@class='tbpaging']//a)[last()]").text
for i in range(2, int(numberOfPage) + 1):
    # click on each page link and then get the details
    driver.find_element_by_xpath("(//tr[@class='tbpaging']//a)[" + str(i) + "]").click()
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    table = soup.find_all('table')[0]
    df_list.append(pd.read_html(str(table), encoding="UTF-8")[0])
</code></pre>
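<p>One detail worth calling out: <code>pd.read_html</code> returns a <em>list</em> of DataFrames, one per table it finds, so each result has to be indexed and collected before the per-page frames can be combined. A minimal offline sketch of that pattern, using an invented two-row HTML fragment in place of the real page:</p>
<pre><code>from io import StringIO
import pandas as pd

# Hypothetical fragment standing in for one scraped page of the table.
html = """
&lt;table&gt;
  &lt;tr&gt;&lt;th&gt;Jogador&lt;/th&gt;&lt;th&gt;Pontos&lt;/th&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;Alice&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;Bob&lt;/td&gt;&lt;td&gt;7&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
"""

# read_html returns a list of DataFrames, one per table found
dfs = pd.read_html(StringIO(html))
df = dfs[0]

# pages scraped in a loop can be collected and concatenated
df_list = [df, df]  # stand-in for two scraped pages
combined = pd.concat(df_list, ignore_index=True)
print(combined)
</code></pre>
<p>Using <code>ignore_index=True</code> in <code>pd.concat</code> gives the combined frame one continuous index instead of repeating each page's row numbers.</p>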