XPath: removing whitespace from a list in Python

Posted 2024-09-28 20:47:51


I have tried everything I know, but I can't seem to find a solution.

import csv
import requests
from lxml import html
from itertools import izip

list_names_atp = []
page = requests.get('http://www.atpworldtour.com/en/rankings/singles')
tree = html.fromstring(page.content)

list_rank_atp = []
for i in range(0,101):
    result = tree.xpath('//*[@id="rankingDetailAjaxContainer"]/table/tbody/tr[%s]/td[1]/text()'%(i))
    list_rank_atp.append(result)

list_names_atp = []
for i in range(0,101):
    result1 = tree.xpath('//*[@id="rankingDetailAjaxContainer"]/table/tbody/tr[%s]/td[4]/a/text()'%(i))
    list_names_atp.append(result1)

list_Final =[]
for i in izip(list_rank_atp, list_names_atp):
    uitkom = i
    list_Final.append(uitkom)

outfile = open("./tennis.csv", "wb")
writer = csv.writer(outfile)
writer.writerow(["Rank", "Name"])
writer.writerows(list_Final)    

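The CSV output looks like this:

[image: screenshot of the current CSV output, not preserved]

But I want it to be:

[image: screenshot of the desired CSV output, not preserved]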


2 Answers

You can use the strip() method to remove the whitespace.
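As a minimal sketch of that idea: each xpath() call in the question returns a list of strings, so the two result lists can be paired up, empty results skipped, and strip() applied to the first string of each. The helper name clean_rows and the sample data below are made up for illustration and do not come from the ATP page.

```python
def clean_rows(ranks, names):
    """Pair rank and name results, skipping rows where either XPath result is empty."""
    rows = []
    for rank_parts, name_parts in zip(ranks, names):
        if not rank_parts or not name_parts:
            # An empty list means the XPath matched nothing (e.g. tr[0]).
            continue
        # text() results are plain strings, so str.strip() removes the padding.
        rows.append((rank_parts[0].strip(), name_parts[0].strip()))
    return rows


# Hypothetical data shaped like list_rank_atp / list_names_atp in the question.
raw_ranks = [[], ["\r\n        1\r\n    "], ["\r\n        2\r\n    "]]
raw_names = [[], ["Player A"], ["Player B"]]

rows = clean_rows(raw_ranks, raw_names)
print(rows)
```

Filtering the pairs together, rather than each column separately, keeps ranks and names aligned if only one of the two lookups comes back empty.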

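A few things to note:

  • XPath indexes start at 1, not 0. That is why the entry for the first data row is empty.

  • You can use Python's strip() or XPath's normalize-space() to remove the whitespace around the rank number text.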
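Both points can be checked on a small inline document. The sketch below assumes lxml is installed; the SAMPLE markup is a made-up stand-in that only imitates the structure the question's XPath expects, not the real ATP page.

```python
from lxml import html

# Hypothetical markup mimicking the table structure targeted in the question.
SAMPLE = """
<div id="rankingDetailAjaxContainer">
  <table>
    <tbody>
      <tr><td>
        1
      </td><td><a href="#">Player A</a></td></tr>
      <tr><td>
        2
      </td><td><a href="#">Player B</a></td></tr>
    </tbody>
  </table>
</div>
"""

tree = html.fromstring(SAMPLE)
base = '//*[@id="rankingDetailAjaxContainer"]/table/tbody'

# Position 0 never matches because XPath positions start at 1.
zeroth = tree.xpath(base + '/tr[0]/td[1]/text()')

# Position 1 is the first row; the raw text still carries the padding.
first_raw = tree.xpath(base + '/tr[1]/td[1]/text()')

# Option 1: strip in Python.
first_stripped = first_raw[0].strip()

# Option 2: let XPath trim and collapse whitespace; this returns a string, not a list.
first_normalized = tree.xpath('normalize-space(' + base + '/tr[1]/td[1])')

print(zeroth, repr(first_raw[0]), first_stripped, first_normalized)
```

Note that normalize-space() also collapses runs of internal whitespace into single spaces, which strip() does not do.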

I would suggest iterating over the rows (tr) and, in each iteration, getting all the information you need from the current row:

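import csv
import requests
from lxml import html

page = requests.get('http://www.atpworldtour.com/en/rankings/singles')
tree = html.fromstring(page.content)

# Select every data row once; iterating over the nodes avoids index arithmetic.
rows = tree.xpath('//*[@id="rankingDetailAjaxContainer"]/table/tbody/tr')

# "wb" is the Python 2 idiom used in the question.
# On Python 3, use open("./tennis.csv", "w", newline="") instead.
outfile = open("./tennis.csv", "wb")
writer = csv.writer(outfile)
writer.writerow(["Rank", "Name"])

for row in rows:
    # Relative XPath: evaluated against the current <tr>, not the whole document.
    no = row.xpath('td[1]/text()')[0].strip()
    name = row.xpath('td[4]/a/text()')[0]
    writer.writerow([no, name])

outfile.close()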
