Checking the validity of an unlimited number of self-generated URLs and saving the response if valid (HTTP 200)



I want to check the validity of an unlimited number of self-generated URLs and, if a URL is valid, save the body of the response to a file. The URLs look like this: https://mydomain.com/ + random string (for example https://mydomain.com/ake3t). I want to generate them from the alphabet "abcdefghijklmnopqrstuvwxyz0123456789" and simply brute-force every possibility.

I wrote a script in Python, but since I am an absolute beginner it is very slow. Because I need something very fast, I tried a different tool (not named here), since I believe it is built for exactly this kind of job.

The problem now is that I don't know how to generate the URLs dynamically. I can't pre-generate them, because there is no fixed number of URLs.

Can someone show me how to do this, or recommend another tool or library that is better suited to the job?

Update: this is the script I used, but I think it is slow. What worries me most is that it gets slower when I use more than one thread (set in threadsNr).

<pre><code>import os
import threading
import urllib.error
import urllib.request

threadsNr = 1
dumpFolder = '/tmp/urls/'
charSet = 'abcdefghijklmnopqrstuvwxyz0123456789_-'
Url_pre = 'http://vorratsraum.com/'
Url_post = 'alwaysTheSameTail'

# guards the shared word generator, the counter and the result file
lock = threading.Lock()


# class that generates the words
class wordGenerator():
    def __init__(self, word, charSet):
        self.currentWord = word
        self.charSet = charSet

    # generate the next word, store it as currentWord and return it
    def nextWord(self):
        self.currentWord = self._incWord(self.currentWord)
        return self.currentWord

    # generate the next word
    def _incWord(self, word):
        word = str(word)  # convert to string
        if word == '':  # if word is empty
            return self.charSet[0]  # return first char from the char set
        wordLastChar = word[-1]  # get the last char
        wordLeftSide = word[:-1]  # get word without the last char
        lastCharPos = self.charSet.find(wordLastChar)  # position of last char in the char set
        if lastCharPos + 1 < len(self.charSet):  # last char is not at the end of the char set
            wordLastChar = self.charSet[lastCharPos + 1]  # get next char from the char set
        else:  # it is the last char
            wordLastChar = self.charSet[0]  # reset last char to the first char of the char set
            wordLeftSide = self._incWord(wordLeftSide)  # send left side to be increased
        return wordLeftSide + wordLastChar  # return the next word


class newThread(threading.Thread):
    def run(self):
        global exitThread
        global wordsTried
        while not exitThread:
            with lock:  # the generator and the counter are shared between threads
                part = newWord.nextWord()  # generate the next word to try
                wordsTried = wordsTried + 1
                if wordsTried >= 1000:  # just for testing how fast it is
                    exitThread = True
            url = Url_pre + part + Url_post
            print('trying ' + part)  # display the word
            print('At URL ' + url)
            try:
                req = urllib.request.Request(url)
                req.add_header('User-Agent', 'Mozilla/5.0')
                with urllib.request.urlopen(req) as resp:
                    result = resp.read()
                found(part, result)
            except urllib.error.HTTPError as err:  # the server answered with an error status
                if err.code == 404:
                    print('Page not found!')
                elif err.code == 403:
                    print('Access denied!')
                else:
                    print('Something happened! Error code', err.code)
            except urllib.error.URLError as err:  # no HTTP status at all (DNS, connection, ...)
                print('Some other error happened:', err.reason)


def found(part, result):
    with lock:  # only one thread writes to the file at a time
        resultFile.write(part + "\n")
    os.makedirs(dumpFolder + part, exist_ok=True)
    print('Found Part = ' + part)


wordsTried = 0
exitThread = False  # flag to kill all threads
newWord = wordGenerator('', charSet)  # word generator
os.makedirs(dumpFolder, exist_ok=True)
resultFile = open(dumpFolder + 'parts.txt', 'a')  # open file for append
threads = [newThread() for i in range(threadsNr)]
for t in threads:
    t.start()
for t in threads:
    t.join()
resultFile.close()  # close only after every thread has finished
</code></pre>
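On the "I can't pre-generate them" part: one possible approach is a lazy generator, which produces each candidate only when it is requested, so the full list never has to exist. The sketch below is not the script from the question; it swaps the hand-written `wordGenerator` class for `itertools.product` and uses `concurrent.futures.ThreadPoolExecutor` for the requests. The names `generate_words`, `fetch_body`, `check_batch`, the worker count and the timeout are all made up for illustration, and `BASE_URL` is just the placeholder domain from the question.

```python
import itertools
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor

ALPHABET = 'abcdefghijklmnopqrstuvwxyz0123456789'
BASE_URL = 'https://mydomain.com/'  # placeholder domain taken from the question


def generate_words(alphabet=ALPHABET, start_length=1):
    """Yield every string over `alphabet`, shortest first, without ever stopping."""
    for length in itertools.count(start_length):
        for chars in itertools.product(alphabet, repeat=length):
            yield ''.join(chars)


def fetch_body(url, timeout=10):
    """Return the response body on HTTP 200, or None on any error status or failure."""
    req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read() if resp.status == 200 else None
    except (urllib.error.URLError, OSError):
        # HTTPError (404, 403, ...) is a subclass of URLError, so it lands here too
        return None


def check_batch(words, fetch=fetch_body, workers=8):
    """Check one finite batch of words concurrently; return {word: body} for the hits."""
    words = list(words)
    urls = [BASE_URL + word for word in words]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        bodies = list(pool.map(fetch, urls))  # results come back in input order
    return {word: body for word, body in zip(words, bodies) if body is not None}
```

To drive it, take fixed-size slices from the endless generator, for example `check_batch(itertools.islice(words, 500))` in a loop where `words = generate_words()`, and write each returned body to a file. Working in batches keeps memory bounded even though the generator itself never ends; the batch size of 500 is an arbitrary choice.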
