How to get results from HTML into a table or CSV format

Published 2024-07-08 10:11:07


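I'm trying to build a scraper for a betting site that I can run during the NFL season to load the odds into Excel or a database, but since I'm very new to Python and bs4, I'm running into trouble.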

I'm using Python 3.7.4 and BS4.

import requests
from bs4 import BeautifulSoup
result2 = requests.get("https://www.betfair.com/sport/american-football/nfl-kampe/green-bay-packers-chicago-bears/29202049")
src2 = result2.content
soup = BeautifulSoup(src2, 'lxml')

for item in soup.find_all('div', {'class': 'minimarketview-content'}):
    print(item.text)

I'd like the output to look like this:

"Green Bay Packers", "2.3", "Chicago Bears", "1.55"
"Green Bay Packers", "1.7", "+3.5", "Chicago Bears", "2.0", "-3.5"

Current result (with large line breaks between values):

Green Bay Packers

2.3

Chicago Bears

1.55

Green Bay Packers

1.7

+3.5

etc

2 Answers

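I can't access the site because it's blocked by the firewall on the public Wi-Fi I'm using, so I couldn't test the code below. Instead of printing the items, put them into a list, then convert that list into a dataframe/table. Something like: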

Note: it still needs some cleanup work, but this should get you going.

import requests
from bs4 import BeautifulSoup
import pandas as pd

result2 = requests.get("https://www.betfair.com/sport/american-football/nfl-kampe/green-bay-packers-chicago-bears/29202049")
src2 = result2.content
soup = BeautifulSoup(src2, 'lxml')

data = []
for item in soup.find_all('div', {'class': 'minimarketview-content'}):
    temp_data = [alpha for alpha in item.text.split('\n') if alpha != '']
    data.append(temp_data)

df = pd.DataFrame(data)
print(df)

df.to_csv('file.csv')

Output:

print (df.to_string())
                               0     1                      2              3                          4      5                  6     7
0              Green Bay Packers  11/8          Chicago Bears           8/13                       None   None               None  None
1              Green Bay Packers   3/4                   +3.5  Chicago Bears                      11/10   -3.5               None  None
2                Current Points:  Over                  20/23            +46                      Under  19/20                +46  None
3  Green Bay Packers by 1-13 Pts   2/1  Green Bay Packers 14+            5/1  Chicago Bears by 1-13 Pts    6/4  Chicago Bears 14+  10/3
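As part of the cleanup mentioned above: the output shows fractional odds (11/8), while the question asks for decimal odds. Here is a minimal sketch of one way to convert them, assuming the usual relationship decimal = fractional + 1. The helper name `to_decimal_odds`, the regex, and the sample row are made up for illustration and have not been run against the live site.

```python
from fractions import Fraction
import re

# Matches plain fractional odds such as "11/8" (non-zero denominator).
FRACTIONAL_ODDS = re.compile(r"^\d+/[1-9]\d*$")


def to_decimal_odds(value):
    """Convert fractional odds like '11/8' to decimal odds; return anything else unchanged."""
    if isinstance(value, str) and FRACTIONAL_ODDS.match(value):
        # Decimal odds include the returned stake, hence the +1.
        return round(float(Fraction(value) + 1), 3)
    return value


# Hypothetical row shaped like the first two rows of the output above.
row = ["Green Bay Packers", "11/8", "+3.5", "Chicago Bears", "8/13", None]
converted = [to_decimal_odds(v) for v in row]
print(converted)
```

Handicap values such as +3.5, team names, and None pass through untouched, so the same function could be applied to every cell, for example with `df.applymap(to_decimal_odds)` on the dataframe above.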

I think you could replace the newline characters with a delimiter (a comma in the code below) instead?

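import csv

# soup is the BeautifulSoup object built in the question.
# newline='' stops the csv module from writing blank rows on Windows.
with open('filename.csv', 'a', newline='') as csv_file:
    writer = csv.writer(csv_file)  # create the writer once, not once per row
    for item in soup.find_all('div', {'class': 'minimarketview-content'}):
        # One CSV field per non-empty line of text
        row = [field.strip() for field in item.text.split('\n') if field.strip()]
        writer.writerow(row)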

Edit: added saving to a .csv file.
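To try the CSV step without hitting the site, here is a small stdlib-only sketch. `sample_text` is a made-up stand-in for what `item.text` returns for one block, and `text_to_row` is a hypothetical helper name, so treat this as an illustration rather than tested scraper code.

```python
import csv
import io


def text_to_row(text):
    """Split scraped block text into a list of non-empty, stripped fields."""
    return [part.strip() for part in text.splitlines() if part.strip()]


# Hypothetical sample mimicking item.text for one market block.
sample_text = "\nGreen Bay Packers\n\n 11/8 \n\nChicago Bears\n\n8/13\n"

row = text_to_row(sample_text)

# Write to an in-memory buffer; use open('filename.csv', 'a', newline='') for a real file.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
csv_line = buffer.getvalue()

# For comparison: a single comma-joined string is treated as ONE field and gets quoted.
single_field = io.StringIO()
csv.writer(single_field).writerow([",".join(row)])
quoted_line = single_field.getvalue()

print(row)
print(csv_line)
print(quoted_line)
```

`csv.writer` treats each list element as one field, so passing a list of separate values puts each value in its own column, while a single string containing commas ends up quoted in one column.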
