Google App Engine urlfetch truncates page text

Published 2024-09-29 01:23:31

I'm using GAE with Python 2.5 and BeautifulSoup 3.0.8, and something is cutting off the first part of my text.

Here is my code:

from google.appengine.api import urlfetch
from BeautifulSoup import BeautifulSoup

url = 'http://www.cmegroup.com/CmeWS/mvc/xsltTransformer.do?xlstDoc=/XSLT/da/DailySettlement_CPC-FUT.xsl&url=/da/DailySettlement/V1/DSReport/ProductCode/J4/FOI/FUT/EXCHANGE/XNYM/Underlying/J4?tradeDate=08/16/2012'

print '<hr>This is the raw result fetched (print result.content)<hr>'
result = urlfetch.fetch(url = url, method = urlfetch.GET)
print result.content

soup = BeautifulSoup(result.content)
print '<hr>This is prettified soup (soup.prettify)<hr>'
print soup.prettify()

print '<hr>here is the print out of iteration through the findall<hr>Go!<br>'
trSet = soup.findAll('tr')
if trSet:  # findAll returns a list (possibly empty), never None
  for i in trSet:
    print i.string
else:
  print "Couldn't find TRs in Soup!"

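The app running this code is at http://mwp-test2.appspot.com/. What happens is that the output of the first print never shows up at all. Any ideas? (I'm also having trouble with Beautiful Soup's findAll, but I plan to ask about that separately once this truncation problem is solved.)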

