Python code to get data from a website

Posted 2024-05-19 12:34:38


I am using the following code and get [] as the result. Please help me find my mistake.

from urllib import urlopen
optionsUrl = 'http://www.moneycontrol.com/commodity/'
optionsPage = urlopen(optionsUrl)
from bs4 import BeautifulSoup
soup = BeautifulSoup(optionsPage)
print soup.findAll(text='MCX')

1 answer
User
#1 · Posted 2024-05-19 12:34:38
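
One likely reason for the empty list, offered as an assumption rather than a confirmed diagnosis of the real page: a plain string passed as the text filter matches only text nodes that are exactly equal to 'MCX', so a cell such as 'MCX Gold' is skipped. A minimal Python 3 sketch on a made-up HTML snippet (not the actual moneycontrol markup), using the `string` argument, which is the newer name for `text` in bs4:

```python
import re

from bs4 import BeautifulSoup

# Made-up HTML for illustration only; the real page layout may differ.
html = "<table><tr><td>MCX Gold</td><td>71250.00</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

# A plain string must equal the whole text node, so nothing matches here.
exact = soup.find_all(string="MCX")

# A compiled regular expression is searched inside each text node,
# so it also finds nodes that merely contain 'MCX'.
partial = soup.find_all(string=re.compile("MCX"))

print(exact)
print([str(s) for s in partial])
```

If the text is only ever present as part of a longer string, the regular-expression form is the one that finds it.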

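This will get you the list of commodities (tested on Python 2.7). You need to isolate the commodity table, then go through it row by row and extract the data from each column.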

import urllib2
import bs4

# Page with commodities
URL = "http://www.moneycontrol.com/commodity/"

# Download the page data and create a BeautifulSoup object
commodityPage = urllib2.urlopen(URL)
commodityPageText = commodityPage.read()
commodityPageSoup = bs4.BeautifulSoup(commodityPageText)

# Extract the div with the commodities table and find all the table rows
commodityTable = commodityPageSoup.find_all("div", "equity com_ne")
commodityTableRows = commodityTable[0].find_all("tr")

# Trim off the table header row
commodityTableRows = commodityTableRows[1:]

# Iterate over the table rows and print out the commodity name and price
for commodity in commodityTableRows:
    # Separate all the table columns
    columns = commodity.find_all("td")

    # Get the values from each column
    # Column 1: Name and date
    nameAndDate = columns[0].text
    nameAndDate = nameAndDate.split('-')
    name = nameAndDate[0].strip()
    date = nameAndDate[1].strip()

    # Column 2: Price
    price = float(columns[1].text)

    # Column 3: Change
    change = columns[2].text.replace(',', '') # Remove commas from change value
    change = float(change)

    # Column 4: Percentage change
    percentageChange = columns[3].text.replace('%', '') # Remove the percentage symbol
    percentageChange = float(percentageChange)

    # Print out the data
    print "%s on %s was %.2f - a change of %.2f (%.2f%%)" % (name, date, price, change, percentageChange)

This prints one line per commodity, in the format of the final print statement.
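
The answer's code targets Python 2.7. As a rough Python 3 sketch of just the per-row parsing step, the column handling can be pulled into a small function that works on plain strings. The names `to_number` and `parse_commodity_row` are hypothetical helpers, the sample cells are made up, and the cell shapes (name and date separated by a hyphen, numbers with commas, a trailing percent sign) are assumed from the answer's code rather than checked against the live page:

```python
def to_number(text):
    """Convert a cell such as '1,234.50' or '-0.42%' to a float."""
    return float(text.replace(",", "").replace("%", "").strip())


def parse_commodity_row(cells):
    """Turn the four cell texts of one table row into a dict."""
    # Split on the first hyphen only, so the rest of the cell stays
    # together as the date even if it contains further hyphens.
    name, _, date = cells[0].partition("-")
    return {
        "name": name.strip(),
        "date": date.strip(),
        "price": to_number(cells[1]),
        "change": to_number(cells[2]),
        "percent_change": to_number(cells[3]),
    }


# Made-up sample cells for illustration only; not real market data.
sample = ["Gold - 05 Jun", "71,250.00", "-1,120.50", "-1.55%"]
row = parse_commodity_row(sample)
print("%(name)s on %(date)s was %(price).2f - a change of "
      "%(change).2f (%(percent_change).2f%%)" % row)
```

With BeautifulSoup, the `cells` list could be built as `[td.get_text() for td in tr.find_all("td")]`. Stripping commas from every numeric cell, not only the change column, is a defensive choice in this sketch and differs slightly from the original.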
