Problem parsing an HTML page with Python

1 answer

User

#1 · Posted 2024-09-22 16:29:52

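Have you tried BeautifulSoup? I'm a huge fan. With it you can easily walk through all the information you want by searching for tags.

I put this together; it prints the value of every column you are looking at. I'm not sure what you want to do with the data, but hopefully it helps.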

from bs4 import BeautifulSoup
from urllib import request

page = request.urlopen('http://www.federalreserve.gov/econresdata/researchdata/feds200628_1.html').read()
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(page, 'html.parser')

# The data table is the third <table> on the page
desired_table = soup.find_all('table')[2]

# Find the columns you want data from
headers = desired_table.find_all('th')
desired_columns = []
for index, th in enumerate(headers):
    # get_text() always returns a string; th.string is None when a header contains nested tags
    if 'SVENY' in th.get_text():
        desired_columns.append(index)

# Iterate through each row grabbing the data from the desired columns
rows = desired_table.find_all('tr')

for row in rows[1:]:
    cells = row.find_all('td')
    for column in desired_columns:
        print(cells[column].text)
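If the header row and the data rows do not line up the way the snippet above assumes (for example, when each data row starts with a `th` holding the date), one alternative is to index over every cell of a row, `th` and `td` alike. The following is only a sketch under that assumption: `SAMPLE_HTML` is a made-up miniature table with placeholder values, not the real page, and `extract_columns` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Made-up miniature table with placeholder values; the real page's markup may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>BETA0</th><th>SVENY01</th><th>SVENY02</th></tr>
  <tr><th>2015-06-05</th><td>1.0</td><td>0.10</td><td>0.20</td></tr>
  <tr><th>2015-06-04</th><td>2.0</td><td>0.11</td><td>0.21</td></tr>
</table>
"""


def extract_columns(html, marker):
    """Collect the text of every column whose header contains `marker`."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find("table").find_all("tr")

    # Index header cells and data cells the same way (th and td together),
    # so a column's position is identical in every row.
    header_cells = rows[0].find_all(["th", "td"])
    wanted = {}
    for index, cell in enumerate(header_cells):
        name = cell.get_text(strip=True)
        if marker in name:
            wanted[index] = name

    result = {name: [] for name in wanted.values()}
    for row in rows[1:]:
        cells = row.find_all(["th", "td"])
        for index, name in wanted.items():
            result[name].append(cells[index].get_text(strip=True))
    return result


columns = extract_columns(SAMPLE_HTML, "SVENY")
print(columns)
```

Because it runs on an inline string, the sketch can be tried without any network access before pointing the same logic at the live page.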

For your second request:

from bs4 import BeautifulSoup
from urllib import request

page = request.urlopen('http://www.federalreserve.gov/econresdata/researchdata/feds200628_1.html').read()
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(page, 'html.parser')

desired_table = soup.find_all('table')[2]
data = {}

# Find the columns you want data from
headers = desired_table.find_all('th')
for index, th in enumerate(headers):
    # get_text() always returns a string; th.string is None when a header contains nested tags
    name = th.get_text(strip=True)
    if 'SVENY' in name:
        data[name] = {'column': index, 'data': []}

# Iterate through each row grabbing the data from the desired columns
rows = desired_table.find_all('tr')

for row in rows[1:]:
    # The first <th> in each data row holds the date
    date = row.find_all('th')[0].text
    cells = row.find_all('td')

    for header, info in data.items():
        column_number = info['column']
        cell_data = [date, cells[column_number].text]
        info['data'].append(cell_data)

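This builds a dictionary in which each key is a column header and each value is another dictionary holding 1) the position of that column in the table on the site and 2) the actual data you want, as a list of [date, value] lists.

For example: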

for year_number in data['SVENY01']['data']:
    print(year_number)

['2015-06-05', '0.3487']
['2015-06-04', '0.3124']
['2015-06-03', '0.3238']
['2015-06-02', '0.3040']
['2015-06-01', '0.3009']
['2015-05-29', '0.2957']
etc.
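Every entry above is a pair of strings. If you need real dates and numbers, a conversion along these lines might help. It is only a sketch: `sample` copies a few rows from the output above, `to_typed` is a hypothetical helper name, and it assumes the dates are ISO formatted as shown and that any cell that cannot be read as a number (such as a hypothetical 'NA') should simply be skipped.

```python
from datetime import date

# A few rows in the same shape as data['SVENY01']['data']
sample = [
    ['2015-06-05', '0.3487'],
    ['2015-06-04', '0.3124'],
    ['2015-06-03', '0.3238'],
]


def to_typed(pairs):
    """Turn [date_string, value_string] pairs into (date, float) tuples."""
    typed = []
    for day, value in pairs:
        try:
            typed.append((date.fromisoformat(day), float(value)))
        except ValueError:
            # Assumption: rows with a malformed date or non-numeric value are dropped
            continue
    return typed


series = to_typed(sample)
latest_day, latest_value = max(series, key=lambda item: item[0])
print(latest_day, latest_value)
```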

You can tinker with this to get exactly the information you want, but hopefully it is useful.
