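Using collections.namedtuple, the Python code below works through a CSV file of record identifiers from a database (integers in the ContentItemId column). An example record is https://api.aucklandmuseum.com/id/library/ephemera/21291.

The aim is to check the HTTP status for each given id and write it to disk: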

import requests
from collections import namedtuple
import csv

with open('in.csv', mode='r') as f:
    reader = csv.reader(f)

    all_records = namedtuple('rec', next(reader))
    records = [all_records._make(row) for row in reader]

    # Create output file
    with open('out.csv', mode='w+') as o:
        w = csv.writer(o)
        w.writerow(["ContentItemId", "code"])

        count = 1
        for r in records:
            id = r.ContentItemId
            url = "https://api.aucklandmuseum.com/id/library/ephemera/" + id
            req = requests.get(url, allow_redirects=False)
            code = req.status_code
            w.writerow([id, code])

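How can I print the code's progress through the latter loop to the console (ideally at the 25%, 50% and 75% marks)? Also, if I add an unindented print("Complete") at the bottom, will that line be reached?

Thanks in advance.

Edit: thanks for the help. My (working!) code now looks like this: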

import csv
import requests
import pandas
import time
from collections import namedtuple
from tqdm import tqdm

with open('active_true_pub_no.csv', mode='r') as f:
    reader = csv.reader(f)

    all_records = namedtuple('rec', next(reader))
    records = [all_records._make(row) for row in reader]

    with open('out.csv', mode='w+') as o:
        w = csv.writer(o)
        w.writerow(["ContentItemId", "code"])

        num = len(records)
        print("Checking {} records...\n".format(num))

        with tqdm(total=num, bar_format="{percentage:3.0f}% {bar} [{n_fmt}/{total_fmt}] ", ncols=64) as pbar:
            for r in records:
                pbar.update(1)
                id = r.ContentItemId
                url = "https://api.aucklandmuseum.com/id/library/ephemera/" + id
                req = requests.get(url, allow_redirects=False)
                code = req.status_code
                w.writerow([id, code])
                # time.sleep(.25)

print('\nSummary: ')
df = pandas.read_csv("out.csv")
print(df['code'].value_counts())

I use pandas to print a summary of the status codes at the end.
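
To get a progress bar, use tqdm: put with tqdm(total=len(records)) as pbar: in front of the for-loop and call pbar.update(1) on each iteration. tqdm then shows a percentage progress bar together with a complete/total count such as 21/101, where the total is the length of the records list.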
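
As a rough illustration of the tqdm approach, here is a minimal sketch that wraps the iterable directly instead of calling pbar.update(1) by hand. The names check_all and fake_status are hypothetical and not from the original post; fake_status stands in for the requests.get call so the sketch runs offline, and the try/except fallback is only there so it still runs where tqdm is not installed.

```python
try:
    from tqdm import tqdm
except ImportError:
    # Fallback (assumption for this sketch): without tqdm, iterate with no bar.
    def tqdm(iterable, **kwargs):
        return iterable


def check_all(record_ids, check):
    """Apply check() to every id while tqdm draws a progress bar."""
    results = []
    # Wrapping the list lets tqdm advance once per iteration and read the
    # total from len(record_ids), so no manual update() call is needed.
    for record_id in tqdm(record_ids):
        results.append((record_id, check(record_id)))
    return results


def fake_status(record_id):
    """Hypothetical stand-in for requests.get(url).status_code."""
    return 200 if int(record_id) % 2 == 0 else 404


results = check_all(["21290", "21291", "21292"], fake_status)
print(results)
```

In the real script, check would build the URL from the id and return the status code of the response.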
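
It's all relative, so let's do some general math. :)

I assume you mean the percentage of records that have already been processed: that is the number of records handled so far divided by len(records). You can also call print("Complete") after the loop; an unindented line at the bottom is reached once the with blocks have finished.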
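
The percentage arithmetic can be sketched without any third-party package. This is only one way to do it: milestone_marks and process are hypothetical names, and handle stands in for the request-and-write step of the original loop. The idea is to work out in advance the 1-based record count at which each fraction is first reached, then print when the counter hits one of those counts.

```python
import math


def milestone_marks(total, fractions=(0.25, 0.50, 0.75)):
    """Map the record count at which each fraction is reached to a label."""
    # For very small totals two fractions can share a count; the later wins.
    return {math.ceil(total * f): "{:.0%}".format(f) for f in fractions}


def process(records, handle):
    """Run handle() on each record and report progress at the milestones."""
    total = len(records)
    marks = milestone_marks(total)
    messages = []
    for count, rec in enumerate(records, start=1):
        handle(rec)
        if count in marks:
            msg = "{} done ({}/{})".format(marks[count], count, total)
            print(msg)
            messages.append(msg)
    # Unindented relative to the loop, so it runs once the loop has finished.
    print("Complete")
    return messages


messages = process(list(range(8)), handle=lambda rec: None)
```

With 101 records the marks fall at records 26, 51 and 76, since the counts are rounded up.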