<p>How about this:</p>
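<pre><code># Assumption: `urls` and `posts` are pymongo collections.
known_urls = set()
for document in urls.find():
    # Keep the second dot-separated component of the stored URL.
    known_urls.add(document['url'].split('.')[1])

# Project only the "url" field from posts (assumed intent of the original query).
for document in posts.find({}, {"url": 1}):
    url = document['url']
    if url not in known_urls:
        print(url, " url not found in posts, generating a new report")
        try:
            get_report(url, posts)
        ...
</code></pre>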
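<p>This effectively loads everything into memory. If you don't have enough memory, try hashing your URLs into arbitrary buckets and processing the buckets one at a time.</p>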
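<p>The bucketing idea can be sketched roughly as follows. This is only an illustration under assumptions: <code>bucket_of</code> and <code>missing_urls</code> are hypothetical helper names, and the two callables stand in for whatever re-scans your collections (for example a generator over <code>urls.find()</code> or <code>posts.find()</code>). Each pass keeps only the known URLs that hash to the current bucket, so memory use drops to roughly 1/num_buckets of the full set, at the cost of scanning both sources once per bucket.</p>

```python
import hashlib
from typing import Callable, Iterable, Iterator


def bucket_of(url: str, num_buckets: int) -> int:
    """Map a URL to a stable bucket index in [0, num_buckets)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def missing_urls(
    scan_known: Callable[[], Iterable[str]],
    scan_candidates: Callable[[], Iterable[str]],
    num_buckets: int = 16,
) -> Iterator[str]:
    """Yield candidate URLs that are absent from the known URLs.

    Both sources are re-scanned once per bucket, so only the known URLs
    belonging to the current bucket are held in memory at any moment.
    """
    for bucket in range(num_buckets):
        # Load just this bucket's share of the known URLs.
        known = {u for u in scan_known() if bucket_of(u, num_buckets) == bucket}
        for url in scan_candidates():
            # A URL can only match known URLs in its own bucket.
            if bucket_of(url, num_buckets) == bucket and url not in known:
                yield url


if __name__ == "__main__":
    # Hypothetical in-memory data standing in for the two collections.
    known = ["a.example.com", "b.example.com", "c.example.com"]
    candidates = ["b.example.com", "d.example.com", "e.example.com"]
    found = sorted(missing_urls(lambda: known, lambda: candidates, num_buckets=4))
    print(found)  # ['d.example.com', 'e.example.com']
```

<p>With real collections you would pass callables that start a fresh cursor on each call, and call <code>get_report</code> on every yielded URL. More buckets means less memory per pass but more passes over the data.</p>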