Python: converting website data to txt or xls

Posted 2024-09-30 16:23:45


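I'm not very good at Python. I want to pull the data out of a table on a website and save it in txt/xls format.

I wrote a script, and it works fine against the site until it reaches an entry that has no data.

Website: bizearch.com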

My Python script stops at this entry: www.bizearch.com/company/Russell_Metal_Products_Inc_125558.htm

I'm using CentOS, Python, and BeautifulSoup.

My script:

#!/usr/bin/env python
#
from bs4 import BeautifulSoup
import urllib

getInfo = ['Company Name', 'Contact Person', 'Company Address', 'Postal Code', 'Telephone Number', 'Mobile Number', 'Fax Number', 'Website', 'Business Type', 'Business Role']
flushData = {}

print "Company Name|Contact Person|Company Address|Postal Code|Telephone Number|Mobile Number|Fax Number|Website|Business Type|Business Role"

for Page in range(1,900):
    pageData = urllib.urlopen("http://www.bizearch.com/company/Electrical_Equipment~Supplies.8-%d.htm" % (Page))
    html = pageData.read()
    parsed_html = BeautifulSoup(html)

    for Row in parsed_html.body.findAll('div', attrs={'class':'ls'}):
        profileURL =  Row.find('a').get('href')
        profileURLHTML = urllib.urlopen(profileURL)
        profileURLHTML = BeautifulSoup(profileURLHTML)

        finalData = []
        for Details in profileURLHTML.body.find('div', attrs={'id':'yellowpage'}).findAll('tr') :
            if Details.find('th').text in getInfo:
                flushData[Details.find('th').text] = Details.find('td').text

                flushDataPrint = "%s|%s|%s|%s|%s|%s|%s|%s|%s|%s" % (flushData['Company Name'], flushData['Contact Person'], flushData['Company Address'], flushData['Postal Code'], flushData['Telephone Number'], flushData['Mobile Number'], flushData['Fax Number'], flushData['Website'], flushData['Business Type'], flushData['Business Role'])
        print flushDataPrint
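
The script above can fail in several places when a profile page lacks data: `find('div', attrs={'id':'yellowpage'})` may return `None`, a `tr` may have no `th` or `td`, and `flushData` may be missing a key when the output line is formatted. Which of these happens on the failing page is not stated in the question, so the following is only a minimal sketch that guards against all three. It assumes Python 3 and the built-in `html.parser` backend (the script above is Python 2); `FIELDS`, `extract_fields`, and the sample HTML are hypothetical names and data made up for illustration, not taken from the site.

```python
from bs4 import BeautifulSoup

# Same labels as getInfo in the question.
FIELDS = ['Company Name', 'Contact Person', 'Company Address', 'Postal Code',
          'Telephone Number', 'Mobile Number', 'Fax Number', 'Website',
          'Business Type', 'Business Role']


def extract_fields(html):
    """Return a dict with one entry per label in FIELDS; missing ones are ''."""
    # A fresh dict per profile, so values from an earlier company never leak in.
    record = dict.fromkeys(FIELDS, "")
    soup = BeautifulSoup(html, "html.parser")

    container = soup.find("div", attrs={"id": "yellowpage"})
    if container is None:
        # The page has no details block at all: return the empty record.
        return record

    for row in container.find_all("tr"):
        th = row.find("th")
        td = row.find("td")
        if th is None or td is None:
            # Skip rows that lack a label cell or a value cell.
            continue
        label = th.get_text(strip=True)
        if label in record:
            record[label] = td.get_text(strip=True)
    return record


# Made-up profile page: one complete row, one row without a value,
# one row without a label.
sample = """<html><body><div id="yellowpage"><table>
<tr><th>Company Name</th><td>Example Co</td></tr>
<tr><th>Website</th></tr>
<tr><td>stray cell</td></tr>
</table></div></body></html>"""

record = extract_fields(sample)
print("|".join(record[name] for name in FIELDS))
```

Because every label starts out as an empty string, the output line always has ten columns, and a company with no phone number simply gets an empty field instead of stopping the run.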

I'm new to this site, so I apologize if I've left anything out.
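
For the txt/xls part of the question, one possible approach is to write the records as pipe-delimited text with the standard `csv` module, since spreadsheet programs such as Excel can import delimited text files. The sketch below assumes Python 3 and assumes each record is a dict keyed by the labels from `getInfo`; `write_records` and the sample records are hypothetical and for illustration only.

```python
import csv
import io

# Same labels as getInfo in the question.
FIELDS = ['Company Name', 'Contact Person', 'Company Address', 'Postal Code',
          'Telephone Number', 'Mobile Number', 'Fax Number', 'Website',
          'Business Type', 'Business Role']


def write_records(records, out, delimiter="|"):
    """Write a header line plus one delimited line per record to `out`."""
    writer = csv.DictWriter(
        out,
        fieldnames=FIELDS,
        delimiter=delimiter,
        restval="",             # value used for labels a record does not have
        extrasaction="ignore",  # silently drop labels that are not in FIELDS
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)


# Made-up records with different fields missing.
records = [
    {"Company Name": "Example Co", "Website": "http://example.com"},
    {"Company Name": "Another Co", "Telephone Number": "555-0100"},
]

# An in-memory buffer keeps the demo free of side effects. For a real file:
#   with open("companies.txt", "w", newline="", encoding="utf-8") as handle:
#       write_records(records, handle)
buffer = io.StringIO()
write_records(records, buffer)
print(buffer.getvalue())
```

Passing `delimiter=","` and saving with a `.csv` extension is a common alternative when the file is meant to be opened directly in a spreadsheet.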
