擅长:python、mysql、java
<p>如果没有BeautifulSoap,您可以使用RegExp和simple函数。你知道吗</p>
<pre><code>from urllib.request import urlopen
import re
def find_link(url):
response = urlopen(url)
res = str(response.read())
my_dict = re.findall('(?<=<a href=")[^"]*', res)
for x in my_dict:
# simple skip page bookmarks, like #about
if x[0] == '#':
continue
# simple control absolute url, like /about.html
# also be careful with redirects and add more flexible
# processing, if needed
if x[0] == '/':
x = url + x
print(x)
find_link('http://cnn.com')
</code></pre>