Printing a count of URLs in the console instead of a progress bar

import requests
import csv
from lxml import html

URL_LIST = [
    "https://www.realestate.com.au/property/1-1-goldsmith-st-elwood-vic-3184",
    "https://www.realestate.com.au/property/1-10-albion-rd-glen-iris-vic-3146",
    "https://www.realestate.com.au/property/1-109-sydney-rd-manly-nsw-2095",
    "https://www.realestate.com.au/property/1-1110-glen-huntly-rd-glen-huntly-vic-3163",
]

with open('test.csv', 'wb') as csv_file:
    writer = csv.writer(csv_file)
    for index, url in enumerate(URL_LIST):
        page = requests.get(url)
        print 'scanning url....'
        # text2search is not defined in the snippet as posted.
        if text2search in page.text:
            tree = html.fromstring(page.content)
            (title,) = (x.text_content() for x in tree.xpath('//title'))
            (price,) = (x.text_content() for x in tree.xpath('//div[@class="property-value__price"]'))
            (sold,) = (x.text_content().strip() for x in tree.xpath('//p[@class="property-value__agent"]'))
            writer.writerow([title, price, sold])

2 Answers

User

Answer 1 · edited 2024-09-21 02:56:30

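If you want to print an indicator of your progress rather than a progress bar, the simplest approach is probably a regular print.

Because the code in the question targets Python 2, I originally answered with Python 2 code, but Python 3 users can run into this question just as easily, so I have added a section for them as well.

A Python 2 version

The following is based on, and should be added to, the code in the question: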

for index, url in enumerate(URL_LIST):
    print 'Scanning url #' + str(index+1) + ' of ' + str(len(URL_LIST))

Optionally, you can also include the URL being scanned, using the url variable produced by the for loop.
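As a rough sketch of that idea, the message can be built with str.format so that it carries both the counter and the URL. The helper name progress_message is made up for illustration, and the `from __future__` import is only there so the same snippet runs under both Python 2 and Python 3.

```python
from __future__ import print_function  # same code runs on Python 2 and 3

# Two URLs taken from the question's URL_LIST.
URL_LIST = [
    "https://www.realestate.com.au/property/1-1-goldsmith-st-elwood-vic-3184",
    "https://www.realestate.com.au/property/1-10-albion-rd-glen-iris-vic-3146",
]


def progress_message(index, total, url):
    """Build a 'Scanning url #i of n' message (hypothetical helper)."""
    # index is zero-based, as produced by enumerate(), so add 1 for display.
    return 'Scanning url #{0} of {1}: {2}'.format(index + 1, total, url)


for index, url in enumerate(URL_LIST):
    print(progress_message(index, len(URL_LIST), url))
```

Because each URL has a different length, this form works best with one message per line rather than with the overwrite-in-place variant.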

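Also, if you want each print to replace the previous one, add a comma (,) at the end of the print statement and a \r character at the start of the string: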

for index, url in enumerate(URL_LIST):
    print '\rScanning url #' + str(index+1) + ' of ' + str(len(URL_LIST)),

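The comma keeps print from appending a newline character (\n) at the end. The leading \r (a "carriage return") moves the cursor back to the start of the line, so the new text overwrites what was already there. It does not erase the line, so a shorter message would leave trailing characters from a longer one. If nothing appears until the loop finishes, the output is probably being buffered; calling sys.stdout.flush() after each print should help.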

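Differences in `print` between Python 2 and Python 3

Note that print behaves very differently in Python 2 and Python 3. The Python 2 solution above does not work in Python 3.

First, print in Python 3 is a function rather than a statement, so it must be called as a function (i.e. print('Print me!')). Second, adding a trailing comma does not suppress the newline. A trailing comma usually has no visible effect, but the interpreter still evaluates it (as a tuple containing a single None), which you can see in the Python REPL. Instead, you must pass the print function a keyword argument named end to override its default value.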
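The difference can be checked with a small Python 3 experiment. This is only a sketch: it writes into an in-memory io.StringIO buffer (an assumption made so the exact characters can be inspected) instead of the real console.

```python
import io

buffer = io.StringIO()

# Default behaviour: print appends '\n' after the text.
print('first', file=buffer)

# end='' replaces the trailing newline with an empty string.
print('second', end='', file=buffer)

# A trailing comma does not change what is written; it only wraps the
# call's return value (None) in a one-element tuple.
result = print('third', file=buffer),

written = buffer.getvalue()
print(repr(written))  # 'first\nsecondthird\n'
print(result)         # (None,)
```

Only the call with end='' leaves the cursor on the same line; the call with the trailing comma still ends in a newline.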

A Python 3 version

Here is the Python 3 equivalent of the code at the top of this answer:

for index, url in enumerate(URL_LIST):
    print('Scanning url #' + str(index+1) + ' of ' + str(len(URL_LIST)))

If you want each print to reuse the same line, as in the second example above:

for index, url in enumerate(URL_LIST):
    print('\rScanning url #' + str(index+1) + ' of ' + str(len(URL_LIST)), end='')

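In case you skipped the explanation above: end='' overrides print's default behaviour of appending a \n (newline) character after each line, so it appends an empty string instead, and the \r (carriage return) at the start of the string returns the cursor to the beginning of the line before the rest of the string is printed.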
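Putting the pieces together, a minimal Python 3 sketch of the whole loop might look like the following. The function name scan_with_counter and its stream parameter are hypothetical, the placeholder strings stand in for real URLs, and the actual fetching and parsing from the question is left as a comment. It also passes flush=True so the counter shows up immediately, and prints one final newline so later output does not land on the counter line.

```python
import sys


def scan_with_counter(urls, stream=None):
    """Print an in-place counter while looping over urls (hypothetical helper)."""
    if stream is None:
        stream = sys.stdout
    total = len(urls)
    for index, url in enumerate(urls):
        # '\r' returns to the start of the line; end='' suppresses the newline.
        # flush=True pushes the text out even though no newline was written.
        print('\rScanning url #{} of {}'.format(index + 1, total),
              end='', file=stream, flush=True)
        # ... fetch and parse `url` here, as in the question's code ...
    if total:
        # Finish the counter line so later output starts on a fresh one.
        print(file=stream)


scan_with_counter(['url-a', 'url-b', 'url-c'])
```

Since the counter text never gets shorter during the loop, each message fully overwrites the previous one.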

User
Answer 2 · edited 2024-09-21 02:56:30

tqdm is a powerful progress bar library. It lets you do things like this:
import time

import tqdm

t = tqdm.tqdm(list('abcdefg'))
for i in t:
    time.sleep(1)
    t.set_postfix(url=i)
The progress bar output looks like this:
86%|██████████████████████████▏ | 6/7 [00:06<00:01, 1.00s/it, url=f]
