Python scraper always finds a difference between the previous and current version of a website, even though there is none

import requests
import time
import os

def compare(file, url):
    if os.path.isfile("./" + file):
        scrape = requests.get(url).text
        with open(file) as f:
            txt = f.read()
        if not txt == scrape:
            with open(file, "w") as f:
                f.write(scrape)
            print("Triggered")
    else:
        scrape = requests.get(url).text
        with open(file, "w") as f:
            f.write(scrape)

ceu = "https://hro.ceu.edu/find-job"
ceu_file = "ceu.html"

while True:
    compare(ceu, ceu_file)
    time.sleep(10)

1 answer

User

#1 · Posted 2024-10-01 09:40:25

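You need to disable automatic newline translation by passing newline='' to open(). Otherwise line endings are converted to the system default when the file is written, and "\r\n" is converted to "\n" when it is read back, so the saved text never matches the fresh response: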

import requests
import time
import os

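def compare(url, file_):
    if os.path.isfile("./" + file_):
        scrape = requests.get(url).text
        # newline='' reads the saved copy without translating line endings
        with open(file_, "r", newline='') as f:
            txt = f.read()
        if txt != scrape:
            # The page changed: overwrite the saved copy, again untranslated
            with open(file_, "w", newline='') as f:
                f.write(scrape)
            print("Triggered")
        else:
            print('Not triggered')
    else:
        # First run: no saved copy yet, so just store the current page
        scrape = requests.get(url).text
        with open(file_, "w", newline='') as f:
            f.write(scrape)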

ceu = "https://hro.ceu.edu/find-job"
ceu_file = "ceu.html"

while True:
    compare(ceu, ceu_file)
    time.sleep(10)
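
To see the effect in isolation, without hitting the site, here is a minimal sketch that writes a string containing "\r\n" to a temporary file and reads it back, once with the default newline handling and once with newline=''. The helper name round_trips and the sample HTML string are made up for illustration; the sketch assumes the page is served with "\r\n" line endings, which is the case where the comparison fails.

```python
import os
import tempfile

def round_trips(text, **open_kwargs):
    """Write text to a temp file and report whether reading it back gives the same string."""
    fd, path = tempfile.mkstemp(suffix=".html")
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8", **open_kwargs) as f:
            f.write(text)
        with open(path, "r", encoding="utf-8", **open_kwargs) as f:
            return f.read() == text
    finally:
        os.remove(path)

# Hypothetical page body with Windows-style line endings
sample = "<html>\r\n<body>jobs</body>\r\n</html>"

default_ok = round_trips(sample)                  # default newline translation
untranslated_ok = round_trips(sample, newline="")  # translation disabled

print(default_ok, untranslated_ok)  # False True
```

With the default settings the text read back differs from what was written, so a comparison against a fresh response reports a change every time. With newline='' the string survives the round trip unchanged.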
