Strange error when using BeautifulSoup's prettify() method in Python


I have the following problem. I wrote a simple "TextBasedBrowser" (if you can even call it a browser at this point :D). Fetching pages and parsing them with BS4 works well so far, but the output is badly formatted and almost unreadable. When I try to use BS4's prettify() method, it raises an AttributeError. I searched Google for a long time and found nothing. Here is my code (the prettify() call is commented out):

<pre><code>from bs4 import BeautifulSoup
import requests
import sys
import os

legal_html_tags = ['p', 'a', 'ul', 'ol', 'li', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'title']
saved_pages = []


def search_url(url):
    saved_pages.append(url.rstrip(".com"))
    url = requests.get(f'https://{url}')
    return url.text


def parse_html(html_page):
    final_text = ""
    soup = BeautifulSoup(html_page, 'html.parser')
    # soup = soup.prettify()
    plain_text = soup.find_all(text=True)
    for t in plain_text:
        if t.parent.name in legal_html_tags:
            final_text += '{} '.format(t)
    return final_text


def save_webpage(url, tb_dir):
    with open(f'{tb_dir}/{url.rstrip(".com")}.txt', 'w', encoding="utf-8") as tab:
        tab.write(parse_html(search_url(url)))


def check_url(url):
    if url.endswith(".com") or url.endswith(".org") or url.endswith(".net"):
        return True
    else:
        return False


args = sys.argv
directory = args[1]
try:
    os.mkdir(directory)
except FileExistsError:
    print("Error: File already exists")

while True:
    url_ = input()
    if url_ == "exit":
        break
    elif url_ in saved_pages:
        with open(f'{directory}/{url_}.txt', 'r', encoding="utf-8") as curr_page:
            print(curr_page.read())
    elif not check_url(url_):
        print("Error: Invalid URL")
    else:
        save_webpage(url_, directory)
        print(parse_html(search_url(url_)))
</code></pre>

This is the error I get with the prettify() line uncommented:

<pre><code>Traceback (most recent call last):
  File "browser.py", line 56, in <module>
    save_webpage(url_, directory)
  File "browser.py", line 29, in save_webpage
    tab.write(parse_html(search_url(url)))
  File "browser.py", line 20, in parse_html
    plain_text = soup.find_all(text=True)
AttributeError: 'str' object has no attribute 'find_all'
</code></pre>

If I pass the encoding parameter to prettify(), the error mentions a 'bytes' object instead of 'str'.
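A likely explanation, consistent with the traceback: `prettify()` does not return a soup. It returns the formatted markup as a `str` (or as `bytes` when `encoding` is given), so `soup = soup.prettify()` rebinds `soup` to plain text and the following `find_all` call fails. The sketch below is one possible rewrite of `parse_html`, not the asker's final code. It keeps the pretty-printed markup in a separate variable and produces readable output by putting each text node on its own line. The names `LEGAL_HTML_TAGS`, `pretty_markup`, and `sample` are introduced here for illustration, and `string=True` is the newer spelling of `text=True` in recent bs4 releases.

```python
from bs4 import BeautifulSoup

# Same whitelist as in the question, stored as a set for fast lookups.
LEGAL_HTML_TAGS = {'p', 'a', 'ul', 'ol', 'li',
                   'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'title'}


def parse_html(html_page):
    """Return the text of whitelisted tags, one text node per line."""
    soup = BeautifulSoup(html_page, 'html.parser')

    # prettify() returns formatted markup as a str. Keep it in its own
    # variable (useful for debugging) and do NOT rebind `soup` to it.
    pretty_markup = soup.prettify()

    lines = []
    for node in soup.find_all(string=True):
        text = node.strip()
        # Skip whitespace-only nodes and text inside non-whitelisted tags.
        if text and node.parent.name in LEGAL_HTML_TAGS:
            lines.append(text)
    return "\n".join(lines)


# Hypothetical sample page, used instead of a network request.
sample = ("<html><head><title>Demo</title></head><body>"
          "<h1>Hello</h1><p>First <a href='#'>link</a></p>"
          "<script>var x = 1;</script></body></html>")

soup = BeautifulSoup(sample, 'html.parser')
print(type(soup.prettify()).__name__)                  # str
print(type(soup.prettify(encoding='utf-8')).__name__)  # bytes
print(parse_html(sample))
```

Joining the text nodes with newlines addresses the readability complaint directly. `prettify()` only re-indents the HTML markup and does not change how the extracted text is laid out.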


1 Answer

Anonymous, 1 day ago