Python ClientForm E

2 answers

User

#1 · Edited 2024-06-25 22:58:34

The problem is most likely that the HTML itself is invalid: for example, it reuses id="comment_form" over and over again, while there is only supposed to be one id of a given name per document.
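One way to confirm that diagnosis is to count how often each id occurs in the page. The following is only a minimal sketch, assuming Python 3 and its standard-library html.parser rather than the Python 2 libraries discussed in this thread; the IdCollector class, the duplicate_ids helper, and the sample markup are hypothetical and are not taken from the real page.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Count every id attribute value seen in a start tag."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(html):
    """Return {id: count} for every id that appears more than once."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return {key: count for key, count in collector.ids.items() if count > 1}


# Hypothetical sample markup standing in for the fetched page.
sample = (
    '<form id="comment_form"><input name="name"></form>'
    '<form id="comment_form"><input name="email"></form>'
    '<div id="footer"></div>'
)
print(duplicate_ids(sample))  # {'comment_form': 2}
```

A non-empty result means the page breaks the one-id-per-document rule, which is the kind of invalid markup a strict form parser may trip over.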

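The best solution is probably to parse the result of your urlopen call with BeautifulSoup first, then print it back out into a string for ClientForm. That will most likely smooth out most of the rough edges and give ClientForm a better chance of doing its job.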

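If that does not work, print out the result and work out which transformations you need to apply to the HTML to make the form trivially simple for ClientForm, by removing extraneous markup and cruft.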
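As an illustration of that kind of transformation, the sketch below keeps only the first form with a given id and discards everything else on the page. It assumes Python 3 and the standard-library html.parser instead of the Python 2 tools used in this thread; FirstFormExtractor, extract_form, and the sample markup are hypothetical names and data, and HTML comments inside the form are dropped.

```python
from html.parser import HTMLParser


class FirstFormExtractor(HTMLParser):
    """Keep the markup of the first <form> whose id matches form_id."""

    def __init__(self, form_id):
        # convert_charrefs=False keeps entities such as &amp; unchanged.
        super().__init__(convert_charrefs=False)
        self.form_id = form_id
        self.parts = []
        self.inside = False
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if not self.inside and tag == "form" and dict(attrs).get("id") == self.form_id:
            self.inside = True
        if self.inside:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <input /> are copied through once.
        if self.inside:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.inside:
            self.parts.append("</%s>" % tag)
            if tag == "form":
                self.inside = False
                self.done = True

    def handle_data(self, data):
        if self.inside:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.inside:
            self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        if self.inside:
            self.parts.append("&#%s;" % name)


def extract_form(html, form_id):
    """Return the first matching form as a string, or '' if none is found."""
    parser = FirstFormExtractor(form_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical sample markup with a repeated form id.
sample = (
    '<p>intro</p>'
    '<form id="comment_form" action="/post">'
    '<input name="name"><input name="email" /></form>'
    '<form id="comment_form"><input name="other"></form>'
)
print(extract_form(sample, "comment_form"))
```

The returned string contains a single form, so a form parser no longer has to cope with the repeated ids or the surrounding page markup.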

User

#2 · Edited 2024-06-25 22:58:34

As Richard suggested, use BeautifulSoup.

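# Python 2 code (BeautifulSoup 3, urllib2, ClientForm).
from StringIO import StringIO
from urllib2 import urlopen

from BeautifulSoup import BeautifulSoup, SoupStrainer
import ClientForm

url = 'http://garciainteractive.com/blog/topic_view/topics/content/'

html = urlopen(url).read()
# Parse only the <form id="comment_form"> elements and ignore the rest of the page.
forms_filter = SoupStrainer('form', id="comment_form")
soup = BeautifulSoup(html, parseOnlyThese=forms_filter)
# Serialize the cleaned-up markup and hand it to ClientForm as a file-like object.
forms = ClientForm.ParseFile(StringIO(str(soup)), "", backwards_compat=False)
# Fill in the fields of the first matching form.
forms[0]['name'] = 'Kalmi'
forms[0]['email'] = 'kalmi@..com'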
