Python: ordering similar divs within the same container

import requests
from bs4 import BeautifulSoup, Tag
from lxml import html
import MySQLdb
import urllib2
import itertools
import re
import sys
from datetime import date, timedelta as td, datetime

urls = ("http://www.esportsheaven.net/?page=match")
hdr = {'User-Agent': 'Mozilla/5.0'}
req = urllib2.Request(urls, headers=hdr)
page = urllib2.urlopen(req)
soup = BeautifulSoup(page)

tournament = soup.findAll('div', {'class': ['content-body']})
match_time = soup.find_all("div", style = "width:10%; float:left;")
match = soup.find_all("div", style = "width:46%; float:left; margin-left:2%; margin-right:2%")
tourny = soup.find_all("div", style = "width:40%; float:left; overflow:hidden;")

for tag in tournament:
    for tag in match_time:
        print tag.text
    for tag1 in match:
        print tag1.text
    for tag2 in tourny:
        print tag2.text
    print '==============='

1 answer

Anonymous user

#1 · Posted 2024-06-28 20:47:01

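Your parsing code is correct as far as extracting the elements goes. However, the find calls for match_time, match and tourny should be made on the tournament variable, not on soup. Anything you search for through soup is searched across the whole document, whereas a search through tournament only looks inside the content div you are interested in.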

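If you look at the page's HTML, there is only one div with the class content-body, so calling find_all for it serves no purpose. Use find instead, which returns that single element rather than a list: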

tournament = soup.find('div',{'class':['content-body']})

Now find all the match times, match names and tournaments inside that div:

match_times = tournament.find_all("div", style = "width:10%; float:left;")
match_names = tournament.find_all("div", style = "width:46%; float:left; margin-left:2%; margin-right:2%")
tournys = tournament.find_all("div", style = "width:40%; float:left; overflow:hidden;")
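
To make the difference between searching from soup and searching from tournament concrete, here is a minimal sketch. The markup is a made-up stand-in for the real page (the sidebar div and the times are assumptions, not taken from the site), and it is written for Python 3 with the standard html.parser backend:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one matching div sits outside the content-body container.
SAMPLE_HTML = """
<div class="sidebar">
  <div style="width:10%; float:left;">99:99</div>
</div>
<div class="content-body">
  <div style="width:10%; float:left;">18:00</div>
  <div style="width:10%; float:left;">20:30</div>
</div>
"""

TIME_STYLE = "width:10%; float:left;"

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Searching from soup walks the whole document, sidebar included.
all_times = [div.get_text() for div in soup.find_all("div", style=TIME_STYLE)]

# Searching from the container only walks its descendants.
tournament = soup.find("div", class_="content-body")
scoped_times = [div.get_text() for div in tournament.find_all("div", style=TIME_STYLE)]

print(all_times)
print(scoped_times)
```

The scoped search skips the sidebar entry, which is why the three lists stay aligned with each other when they are all built from tournament.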

All three lists have the same length, so you can zip them together and read one match per iteration:

for element in zip(match_times, match_names, tournys):
    print element[0].text, element[1].text, element[2].text
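
One caveat about zip is that it stops at the shortest input without complaining, so a row missing one of its divs would silently drop later matches. The sketch below uses made-up sample values (not data from the site) and Python 3's itertools.zip_longest to show the difference; it is one possible safeguard, not something the page requires:

```python
from itertools import zip_longest

# Hypothetical extracted text; the tournament list is deliberately one entry short.
match_times = ["18:00", "20:30"]
match_names = ["Team A vs Team B", "Team C vs Team D"]
tournys = ["Spring Cup"]

# zip truncates to the shortest list, so the second match disappears.
truncated = list(zip(match_times, match_names, tournys))

# zip_longest keeps every row and pads the missing fields.
padded = list(zip_longest(match_times, match_names, tournys, fillvalue=""))

# A simple length check makes the mismatch visible before any output is produced.
lengths_match = len(match_times) == len(match_names) == len(tournys)

for time_text, name_text, tourny_text in padded:
    print(time_text, name_text, tourny_text)
```

If lengths_match is False, the page layout probably differs from what the three style selectors assume, and the selectors are worth re-checking before trusting the rows.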

This should give you the output you want.
