<p>First problem: the data is actually in an iframe inside a frame; you need to look at <a href="https://www.schwab.wallst.com/public/research/stocks/summary.asp?user_id=schwabpublic&symbol=APC" rel="nofollow">https://www.schwab.wallst.com/public/research/stocks/summary.asp?user_id=schwabpublic&symbol=APC</a> (where you substitute the appropriate symbol at the end of the url).</p>
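That per-symbol URL can be built by simple string substitution; a minimal sketch (the helper name <code>summary_url</code> is my own, not part of the answer's code):

```python
BASE_URL = ('https://www.schwab.wallst.com/public/research/stocks/'
            'summary.asp?user_id=schwabpublic&symbol=')

def summary_url(symbol):
    # Append the ticker symbol to the end of the iframe URL
    return BASE_URL + symbol
```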
<p>Second problem: extracting the data from the page. I personally like lxml and xpath, but there are many packages that will do the job. I would expect some code like</p>
<pre><code>import urllib2
import lxml.html
import re

re_dollars = r'\$?\s*(\d+\.\d{2})'

def urlExtractData(url, defs):
    """
    Get html from url, parse according to defs, return as dictionary
    defs is a list of tuples ("name", "xpath", "regex", fn )
    name becomes the key in the returned dictionary
    xpath is used to extract a string from the page
    regex further processes the string (skipped if None)
    fn casts the string to the desired type (skipped if None)
    """
    page = urllib2.urlopen(url)  # can modify this to include your cookies
    tree = lxml.html.parse(page)
    res = {}
    for name, path, reg, fn in defs:
        txt = tree.xpath(path)[0]
        if reg is not None:
            match = re.search(reg, txt)
            txt = match.group(1)
        if fn is not None:
            txt = fn(txt)
        res[name] = txt
    return res

def getStockData(code):
    url = 'https://www.schwab.wallst.com/public/research/stocks/summary.asp?user_id=schwabpublic&symbol=' + code
    defs = [
        ("stock_name",   '//span[@class="header1"]/text()', None, str),
        ("stock_symbol", '//span[@class="header2"]/text()', None, str),
        ("last_price",   '//span[@class="neu"]/text()', re_dollars, float)
        # etc
    ]
    return urlExtractData(url, defs)
</code></pre>
<p>When called as</p>
<pre><code>print getStockData('MSFT')
</code></pre>
<p>it returns</p>
<pre><code>{'stock_name': 'Microsoft Corp', 'last_price': 25.690000000000001, 'stock_symbol': 'MSFT:NASDAQ'}
</code></pre>
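As an aside, the <code>re_dollars</code> pattern can be exercised on its own; a small sketch (<code>parse_price</code> is my name for the regex-plus-cast step the helper performs inline):

```python
import re

# Same pattern as re_dollars above: optional "$", optional whitespace,
# then a number with two decimal places in a capture group
re_dollars = r'\$?\s*(\d+\.\d{2})'

def parse_price(txt):
    # Pull the numeric part out of a price string and cast it to float
    match = re.search(re_dollars, txt)
    if match is None:
        raise ValueError('no price found in %r' % txt)
    return float(match.group(1))

print(parse_price('$ 25.69'))  # 25.69
```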
<p>Third problem: the markup on that page is presentational, not structural, which means that code based on it is likely to be fragile; any change to the page structure (or variation between pages) will require the xpaths to be rewritten.</p>
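One way to make that fragility show up early is to fail loudly when an xpath stops matching, instead of getting a bare <code>IndexError</code> from <code>tree.xpath(path)[0]</code>; a minimal sketch (<code>extract_first</code> is my own wrapper, not part of the answer's code):

```python
import lxml.html

def extract_first(tree, path):
    # Return the first xpath hit, or raise an error naming the xpath
    # so a page-layout change produces a clear message
    hits = tree.xpath(path)
    if not hits:
        raise ValueError('xpath matched nothing: %s' % path)
    return hits[0]

# A stand-in snippet mimicking the class names used above
tree = lxml.html.fromstring('<span class="header1">Microsoft Corp</span>')
print(extract_first(tree, '//span[@class="header1"]/text()'))  # Microsoft Corp
```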
<p>Hope that helps!</p>