Python web

2 answers

User

#1 · edited 2024-05-13 23:03:21

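If you want to crawl a whole site, see this post. If you only want to process a few pages and analyze their content (meaning you already know the URLs to process), try BeautifulSoup, which lets you do things like this: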

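from urllib.request import urlopen  # Python 3 replacement for urllib2
from bs4 import BeautifulSoup

page = urlopen(url)  # url: a page you already know you want to process
soup = BeautifulSoup(page.read(), "html.parser")
for f in soup.find_all("form"):
    target_url = f.get("action")  # None if the form has no action attribute
    # do something with each one of the forms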
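If installing BeautifulSoup is not an option, the same form-extraction idea can be sketched with only the standard library. This is not the BeautifulSoup approach above but a swapped-in alternative built on `html.parser`. It also resolves relative `action` values against the page URL, since forms often use relative paths. The names `FormActionParser` and `form_targets`, the sample HTML, and the example.com URL are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormActionParser(HTMLParser):
    """Collect the action attribute of every <form> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            # attrs is a list of (name, value) pairs; a missing or empty
            # action means the form submits to the page's own URL.
            self.actions.append(dict(attrs).get("action") or "")


def form_targets(html, page_url):
    """Return absolute submission URLs for all forms in an HTML string."""
    parser = FormActionParser()
    parser.feed(html)
    return [urljoin(page_url, action) for action in parser.actions]


# Hypothetical sample page, used only to show the behavior.
sample_html = """
<form action="/search"></form>
<form action="login.php" method="post"></form>
<form></form>
"""
print(form_targets(sample_html, "https://example.com/account/"))
```

For a real page you would pass the decoded response body and the URL you fetched it from instead of the sample string.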

User

#2 · edited 2024-05-13 23:03:21

You can use Scrapy:

Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
