Checking all links within the links of a source HTML page (Python)


My code takes a link passed on the command line, fetches the HTML of the web page at that link, searches that HTML for links, and then repeats these steps for each link it finds. I hope that is clear. It should print out any link that causes an error.

Some more information that may be needed: it may visit at most 100 pages. If a site has an error, a None value is returned. I am using Python 3.

For example:

<pre><code>s = readwebpage(url)  # Gets the HTML code for the link (url) passed as its argument. If the link has an error, s = None.
</code></pre>

The start page's HTML contains links ending in <code>p2.html</code>, <code>p3.html</code>, <code>p4.html</code> and <code>p5.html</code>. My code reads all of these, but it does not visit them individually to search for more links. If it did, it would search those pages, find a link ending in <code>p10.html</code>, and report that the link ending in <code>p10.html</code> has an error. Clearly it does not do that at the moment, and it is giving me a hard time.

My code:

<pre><code>url = args.url[0]
url_list = [url]
checkedURLs = []
AmountVisited = 0
while (url_list and AmountVisited < maxhits):
    url = url_list.pop()
    s = readwebpage(url)
    print("testing url: http", url)  # Print the url being tested, this code is here only for testing..
    AmountVisited = AmountVisited + 1
    if s == None:
        print("* bad reference to http", url)
    else:
        urls_list = re.findall(r'href="http([\s:]?[^\'" >]+)', s)  # Creates a list of all links in HTML code starting with...
        while urls_list:  # ... http or https
            insert = urls_list.pop()
            while (insert in checkedURLs and urls_list):
                insert = urls_list.pop()
            url_list.append(insert)
            checkedURLs = insert
</code></pre>

Please help :)
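One thing that stands out in the loop above is the last line: `checkedURLs = insert` rebinds the list to a single string, so the later `insert in checkedURLs` test becomes a substring check instead of a "have I seen this URL" check. The inner `while` can also append an already-checked URL once `urls_list` runs out. Below is a minimal sketch of one way the crawl loop might be restructured, not necessarily the intended fix. It assumes a few things that are not in the question: the function name `find_bad_links`, a `read_page` parameter standing in for `readwebpage`, a regex that captures the whole URL including the `http` prefix, and a small in-memory `pages` dictionary with `example.com` URLs used purely for illustration.

```python
import re
from collections import deque

# Assumption: links appear as href="http..." or href="https...", as in the question.
# Unlike the question's regex, this one keeps the "http" prefix in the captured URL.
LINK_RE = re.compile(r'href="(https?:[^\'" >]+)')


def find_bad_links(start_url, read_page, max_hits=100):
    """Crawl from start_url and return the links for which read_page gave None.

    read_page stands in for the question's readwebpage(url): it returns the
    page's HTML as a string, or None if the link could not be read.
    """
    to_visit = deque([start_url])
    seen = {start_url}  # every URL ever queued, so none is queued twice
    bad_links = []
    visited = 0

    while to_visit and visited < max_hits:
        url = to_visit.popleft()
        visited += 1
        html = read_page(url)
        if html is None:
            bad_links.append(url)
            continue
        for link in LINK_RE.findall(html):
            if link not in seen:
                seen.add(link)  # add to the set; do not rebind it
                to_visit.append(link)
    return bad_links


# Hypothetical in-memory site, loosely mirroring the p2...p10 example above.
pages = {
    "http://example.com/p1.html": '<a href="http://example.com/p2.html">2</a> '
                                  '<a href="http://example.com/p3.html">3</a>',
    "http://example.com/p2.html": '<a href="http://example.com/p1.html">back</a>',
    "http://example.com/p3.html": '<a href="http://example.com/p10.html">10</a>',
}

for bad in find_bad_links("http://example.com/p1.html", pages.get):
    print("* bad reference to", bad)
```

Keeping a set of every URL that has ever been queued means each page is fetched at most once, even when pages link back to each other, and the `max_hits` counter still caps the total number of fetches. In a real run, `readwebpage` would be passed in place of `pages.get`.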


1 answer

Anonymous, 1 day ago
