Saving selected parts of a website form's output to a CSV file with Python

from mechanize import Browser

br = Browser()

# Ignore robots.txt
br.set_handle_robots(False)

# Send a user-agent header that does not look like a robot
br.addheaders = [('User-agent', 'Chrome')]

# Retrieve the SBB timetable query page, saving the response
br.open('http://fahrplan.sbb.ch/bin/query.exe/en')

# # Show the available forms
# counter = 0
# for f in br.forms():
#     counter += 1
#     print(f, counter)
# print('counter', counter)

# Enter the text input
br.select_form(nr=6)
br.form["REQ0JourneyStopsS0G"] = 'Leverkusen Mitte'
br.form["REQ0JourneyStopsZ0G"] = 'Pescara Centrale'

# Get the search results
br.submit()
print(br.response().read())

# How can I export the result to csv???

2 Answers

User

Answer 1 · edited 2024-09-30 03:23:24

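As the other answer notes, you can parse the response with an HTML parser such as BeautifulSoup, select each value you need, join the values into a comma-separated string, and write that string to a file.

The sample code below should make this clearer: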

from mechanize import Browser
from bs4 import BeautifulSoup

# Get the response from the mechanize Browser; br is the Browser
# from the question, after br.submit() has been called
response = br.response().read()

soup = BeautifulSoup(response, 'html.parser')
# Each row of the results overview table is a candidate record
trs = soup.select('table.hfs_overview tr')
with open('out.csv', 'a') as f:
    for tr in trs:
        # Only rows with a departure location link carry data
        locations = tr.select('td.location.departure a')
        if len(locations) > 0:
            location = locations[0].contents[0].strip()
            prefix = tr.select('td.prefix')[0].contents[0].strip()
            time = tr.select('td.time')[0].contents[0].strip()
            # parse more values here
            # write to file
            f.write("{},{},{}\n".format(location, prefix, time))
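
One caveat with joining the values by hand: a station name that itself contains a comma would split into two columns. Below is a minimal sketch of the same loop using the standard `csv` module, which quotes such values. The inline HTML is a made-up sample that only imitates the class names used above (the real SBB markup may well differ), and `rows_from_html` and `write_rows` are hypothetical helper names, not part of any library.

```python
import csv
import io

from bs4 import BeautifulSoup

# Made-up sample imitating the class names used above; real markup may differ.
SAMPLE_HTML = """
<table class="hfs_overview">
  <tr><th>Station</th><th>Dep/Arr</th><th>Time</th></tr>
  <tr>
    <td class="location departure"><a href="#"> Leverkusen Mitte </a></td>
    <td class="prefix"> dep </td>
    <td class="time"> 10:15 </td>
  </tr>
  <tr>
    <td class="location departure"><a href="#"> Basel SBB, Bahnhof </a></td>
    <td class="prefix"> dep </td>
    <td class="time"> 16:03 </td>
  </tr>
</table>
"""


def rows_from_html(html):
    """Return (location, prefix, time) tuples for rows with a departure link."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for tr in soup.select('table.hfs_overview tr'):
        link = tr.select_one('td.location.departure a')
        if link is None:
            continue  # header or separator row, nothing to export
        prefix = tr.select_one('td.prefix')
        time_cell = tr.select_one('td.time')
        rows.append((
            link.get_text(strip=True),
            prefix.get_text(strip=True) if prefix else '',
            time_cell.get_text(strip=True) if time_cell else '',
        ))
    return rows


def write_rows(rows, fileobj):
    """Write a header plus the rows; csv.writer quotes values with commas."""
    writer = csv.writer(fileobj)
    writer.writerow(['location', 'prefix', 'time'])
    writer.writerows(rows)


rows = rows_from_html(SAMPLE_HTML)
buffer = io.StringIO()  # stand-in for a real file, so the sketch has no side effects
write_rows(rows, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead of a buffer under Python 3, something like `with open('out.csv', 'w', newline='') as f: write_rows(rows, f)` should work; `newline=''` is what the `csv` documentation recommends for files passed to `csv.writer`.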

User

Answer 2 · edited 2024-09-30 03:23:24

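If you view the source of the result HTML page in Google Chrome's developer console, you will find four results. Here is a screenshot of the departure part of the first result:

You can find the remaining results by searching the console for the text highlighted in yellow in my capture. Now you just need to use Beautiful Soup to scrape and slice this HTML, then save the sliced parts to a CSV file.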
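
The "search for a marker, then slice around it" idea can be sketched in Beautiful Soup roughly as follows. Everything specific here is an assumption for illustration: the HTML fragment is invented, `MARKER` is a hypothetical stand-in for the highlighted text in the capture, and the `div`/`span` structure will not match the real result page.

```python
import csv
import io
import re

from bs4 import BeautifulSoup

# Invented page fragment; the real result page will look different.
SAMPLE_HTML = """
<div><span class="label">Departure</span><span class="value">10:15</span></div>
<div><span class="label">Duration</span><span class="value">13:40</span></div>
<div><span class="label">Departure</span><span class="value">11:15</span></div>
"""

MARKER = 'Departure'  # hypothetical stand-in for the highlighted text

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

values = []
# Find every text node containing the marker, much like searching the console.
for text_node in soup.find_all(string=re.compile(MARKER)):
    block = text_node.find_parent('div')  # slice out the enclosing block
    value = block.find('span', class_='value')
    if value is not None:
        values.append(value.get_text(strip=True))

# Save the sliced values as a one-column CSV (in memory for this sketch).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['departure'])
writer.writerows([v] for v in values)
csv_text = buffer.getvalue()
```

Starting from the text node and walking up with `find_parent` is handy when the surrounding tags have no stable class or id to select on; once the real markup is known, a CSS selector is usually simpler.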
