Read URLs from a CSV, get the HTTP response status, check for redirects, and store the results in a new CSV

Published 2024-07-04 05:03:40


I have a CSV file containing many URLs. I am writing a script that does the following:

  1. Read the URLs from the CSV
  2. Pass the list of URLs to requests.get()
  3. Get the initial HTTP status, plus the redirected URL and its HTTP status
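Step 3 can be sketched for a single response as follows. This is only an illustration, not the asker's code: the helper name `summarize_redirects` is hypothetical, and it assumes a `requests` response object, whose `history` attribute lists the intermediate redirect responses, oldest first.

```python
def summarize_redirects(response):
    """Return (initial_status, chain) for one response.

    chain is a list of (status_code, url) tuples: one per redirect hop
    taken from response.history, followed by the final response itself.
    """
    chain = [(hop.status_code, hop.url) for hop in response.history]
    chain.append((response.status_code, response.url))
    # The first entry is the status the server returned for the original request.
    initial_status = chain[0][0]
    return initial_status, chain


# Hypothetical usage (needs the third-party requests package and network access):
#   import requests
#   response = requests.get("http://example.com", timeout=10)
#   initial_status, chain = summarize_redirects(response)
```

If `response.history` is empty, the request was not redirected and the chain contains only the final response.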

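My code works fine when I pass in a single URL without trying to read URLs from another CSV. At scale, however, it makes the most sense to read a large number of URLs from one file and write the processed output to another file. Can you help me understand where my code is going wrong? Thanks in advance.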

# -*- coding: utf-8 -*-
import requests
import csv
import time

# set filename with date + time
timestr = time.strftime("%m-%d-%Y-%H-%M-%S")
filename = str("redirect-output\\redirect-test-"+timestr+".csv")

with open('bulk-url-filename.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        urls = row[0]
        #print urls
        r = requests.get(urls)

f = open(filename, 'a+')
response = requests.get(r)
if response.history:
    print "Request was redirected:"
    reqredirected = "Request was redirected:"
    f.write(reqredirected)
    f.write("\n")
    for resp in response.history:
        print resp.status_code
        status = str(resp.status_code)
        f.write(status)
        f.write("\n")
        print resp.url
        url = str(resp.url)
        f.write(url)
        f.write("\n")
    print "Final destination:"
    final = "Final destination:"
    f.write(final)
    f.write("\n")
    print response.status_code
    destinationstatus = str(response.status_code)
    f.write(destinationstatus)
    f.write("\n")
    print response.url
    destinationurl = str(response.url)
    f.write(destinationurl)
    print "\n"
    f.write("\n")
else:
    print "Request was not redirected"
    noredirect = "Request was not redirected"
    f.write(noredirect)
    f.write("\n")
    responseurl = response.url
    print responseurl
    f.write(responseurl)
    f.write("\n")

I can then see the output in the resulting CSV file.
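In the code above, the redirect handling sits outside the `for` loop, so it runs only once after the last row has been read, and `requests.get(r)` is given a response object rather than a URL string. One possible restructuring is sketched below, assuming Python 3 and one URL in the first column of each row. The function names, the output column layout, and the injectable `get` parameter are illustrative choices, not part of the original script.

```python
import csv


def fetch_redirect_rows(url, get=None, timeout=10):
    """Request one URL and return a row per hop: (source_url, hop_url, status, kind)."""
    if get is None:
        import requests  # third-party package, imported lazily
        get = requests.get
    response = get(url, timeout=timeout)
    # response.history holds the intermediate redirect responses, oldest first.
    rows = [(url, hop.url, hop.status_code, "redirect") for hop in response.history]
    rows.append((url, response.url, response.status_code, "final"))
    return rows


def process_urls(in_path, out_path, get=None):
    """Read URLs from in_path and write one CSV row per hop to out_path."""
    # In Python 3 the csv module expects text-mode files opened with newline="".
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["source_url", "hop_url", "status_code", "kind"])
        for row in csv.reader(src):
            if not row or not row[0].strip():
                continue  # skip blank lines
            url = row[0].strip()
            try:
                # The request happens inside the loop, once per URL.
                writer.writerows(fetch_redirect_rows(url, get=get))
            except Exception as exc:  # in practice, narrow to requests.RequestException
                writer.writerow([url, "", "", "error: %s" % exc])
```

With the file names from the question, a call might look like `process_urls('bulk-url-filename.csv', filename)`, where `filename` is the timestamped path built at the top of the script. The `redirect-output` directory has to exist beforehand, since `open()` does not create it.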


Tags: file, csv, url, get, response, requests, status, code
