Problem parsing a table from HTML with BeautifulSoup and saving it as CSV

import csv
import requests
from bs4 import BeautifulSoup

r = requests.get('https://pqt.cbp.gov/report/YYZ_1/12-01-2017')
soup = BeautifulSoup(r)
table = soup.find('table', attrs={"class": "table-horizontal-line"})
headers = [header.text for header in table.find_all('th')]
rows = []
for row in table.find_all('tr'):
    rows.append([val.text.encode('utf8') for val in row.find_all('td')])

with open('output_file.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(row for row in rows if row)

3 answers

User

#1 · edited 2024-10-01 19:23:02

Try passing the response body to BeautifulSoup instead of the Response object itself:

r = requests.get('https://pqt.cbp.gov/report/YYZ_1/12-01-2017')
soup = BeautifulSoup(r.content, 'html.parser')  # naming the parser avoids bs4's "no parser specified" warning

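User

#2 · edited 2024-10-01 19:23:02

The variable r is a Response object, not a str, so pass r.text or r.content to BeautifulSoup. Also, the page has no table with the class table-horizontal-line; you probably meant results: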

soup = BeautifulSoup(r.text, 'html.parser')  # naming the parser avoids bs4's "no parser specified" warning
table = soup.find('table', attrs={"class" : "results"})
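
Putting both fixes together, here is a minimal sketch of the whole parse-and-save step, assuming Python 3. The sample markup, the column names, and the helper names `extract_table` and `write_csv` are made up for illustration; only the `results` class comes from this answer. It parses an inline string, so it runs without network access:

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page (r.text).
SAMPLE_HTML = """
<table class="results">
  <tr><th>Airport</th><th>Terminal</th><th>Flights</th></tr>
  <tr><td>YYZ</td><td>Terminal 1</td><td>12</td></tr>
  <tr><td>YYZ</td><td>Terminal 1</td><td>7</td></tr>
</table>
"""


def extract_table(html, table_class):
    """Return (headers, rows) for the first table with the given class."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"class": table_class})
    if table is None:
        raise ValueError(f"no <table class={table_class!r}> found")
    headers = [th.get_text(strip=True) for th in table.find_all("th")]
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # rows made only of <th> cells yield no <td> and are skipped
            rows.append(cells)
    return headers, rows


def write_csv(fileobj, headers, rows):
    """Write the header line and the data rows to an open text-mode file."""
    writer = csv.writer(fileobj)
    writer.writerow(headers)
    writer.writerows(rows)


headers, rows = extract_table(SAMPLE_HTML, "results")
buffer = io.StringIO()  # in-memory stand-in for the output file
write_csv(buffer, headers, rows)
csv_text = buffer.getvalue()
```

For the real page you would pass r.text to `extract_table` and open the file with open('output_file.csv', 'w', newline='', encoding='utf-8'). In Python 3 the csv module writes str, so the 'wb' mode and the .encode('utf8') call from the question would have to go.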

User

#3 · edited 2024-10-01 19:23:02

This is how I would do it:

import pandas as pd
result = pd.read_html("https://pqt.cbp.gov/report/YYZ_1/12-01-2017")
df = result[0]
# df = df.drop(labels='Unnamed: 8', axis=1)
df.to_csv(r'C:\Users\User\Desktop\Data.csv', sep=',', encoding='utf-8', index=False)
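
To see what read_html hands back, here is a small sketch on an inline table. The markup and column names are invented for illustration, and it assumes pandas can find an HTML parser backend (lxml, or bs4 with html5lib). The string is wrapped in io.StringIO because recent pandas versions deprecate passing literal HTML directly:

```python
import io

import pandas as pd

# Hypothetical sample markup; the real page layout is not shown in this thread.
SAMPLE_HTML = """
<table>
  <tr><th>Airport</th><th>Flights</th></tr>
  <tr><td>YYZ</td><td>12</td></tr>
  <tr><td>YYZ</td><td>7</td></tr>
</table>
"""

# read_html returns a list with one DataFrame per <table> it finds.
tables = pd.read_html(io.StringIO(SAMPLE_HTML))
df = tables[0]

# index=False keeps the DataFrame's row index out of the CSV output.
csv_text = df.to_csv(index=False)
```

Because the result is a list, result[0] in the answer above picks the first table on the page; if the page holds several tables, the index may need adjusting.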
