Python: How do I iterate over HTML element objects using LXML/Requests?

<div class="houses">
    <input type="hidden" class="houseNumber" value="107">
    <input type="hidden" class="houseState" value="MT">
    <input type="hidden" class="houseStatus" value="Occupied">
    <div class="houseInfo">
        <div class="houseCity">Helena</div>
        <div class="houseArea">Helena Valley</div>
    </div>
</div>
<div class="houses">
    <input type="hidden" class="houseNumber" value="237">
    <input type="hidden" class="houseState" value="MT">
    <input type="hidden" class="houseStatus" value="Occupied">
    <div class="houseInfo">
        <div class="houseCity">East Helena</div>
        <div class="houseArea">Helena Valley</div>
    </div>
</div>
<div class="houses">
    <input type="hidden" class="houseNumber" value="104">
    <input type="hidden" class="houseState" value="MT">
    <input type="hidden" class="houseStatus" value="Vacant">
    <div class="houseInfo">
        <div class="houseCity">Helena</div>
        <div class="houseArea">Helena Valley</div>
    </div>
</div>

['107', '237', '104']
['MT', 'MT', 'MT']
['Occupied', 'Occupied', 'Vacant']
['Helena', 'East Helena', 'Helena']
['Helena Valley', 'Helena Valley', 'Helena Valley']

['107', '237', '104']
['MT', 'MT', 'MT']
['Occupied', 'Occupied', 'Vacant']
['Helena', 'East Helena', 'Helena']
['Helena Valley', 'Helena Valley', 'Helena Valley']

['107', '237', '104']
['MT', 'MT', 'MT']
['Occupied', 'Occupied', 'Vacant']
['Helena', 'East Helena', 'Helena']
['Helena Valley', 'Helena Valley', 'Helena Valley']

import requests
from lxml import html

link = "example.com"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
response = requests.get(link, headers=headers, allow_redirects=False)
sourceCode = response.content
htmlElem = html.document_fromstring(sourceCode)
houses = htmlElem.find_class('houses')
for house in houses:
    houseNumber = house.xpath('//input[@class="houseNumber"]/@value')
    houseState = house.xpath('//input[@class="houseState"]/@value')
    houseStatus = house.xpath('//input[@class="houseStatus"]/@value')

import requests
from lxml import html

link = "example.com"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
response = requests.get(link, headers=headers, allow_redirects=False)
sourceCode = response.content
htmlElem = html.document_fromstring(sourceCode)
houses = htmlElem.find_class('houses')
houseNumber = []
houseState = []
houseStatus = []
for house in houses:
    houseNumber.append(house.xpath('//input[@class="houseNumber"]/@value'))
    print(houseNumber)
    houseState.append(house.xpath('//input[@class="houseState"]/@value'))
    houseStatus.append(house.xpath('//input[@class="houseStatus"]/@value'))
data = map(list, zip(*[houseNumber,houseState,houseStatus]))

1 Answer

User

#1 · Posted on 2024-06-02 10:24:20

Try transposing the result; see this thread to understand my code.

# create one list per field
houseNumber = []
houseState = []
houseStatus = []

# append each house's value to its list; the leading "." keeps the XPath
# relative to the current house, and [0] takes the single matching value
for house in houses:
    houseNumber.append(house.xpath('.//input[@class="houseNumber"]/@value')[0])
    houseState.append(house.xpath('.//input[@class="houseState"]/@value')[0])
    houseStatus.append(house.xpath('.//input[@class="houseStatus"]/@value')[0])


# transpose the lists and turn each row into a list
data = map(list, zip(*[houseNumber, houseState, houseStatus]))

>>> list(data)
#[['107', 'MT', 'Occupied'], ['237', 'MT', 'Occupied'], ['104', 'MT', 'Vacant']]

If tuples are acceptable for the rows, simply drop the map call.
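A minimal sketch of that tuple variant, assuming the three lists already hold one string per house (the values below are copied from the sample HTML in the question rather than scraped):

```python
# Per-field lists, one entry per house (values taken from the sample HTML).
houseNumber = ['107', '237', '104']
houseState = ['MT', 'MT', 'MT']
houseStatus = ['Occupied', 'Occupied', 'Vacant']

# zip(*...) transposes the columns into rows; without map(list, ...)
# each row stays a tuple instead of being converted to a list.
data = list(zip(*[houseNumber, houseState, houseStatus]))
print(data)
```

Tuples are enough when the rows are only read; keep `map(list, ...)` if the rows need to be modified later.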

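For reference, the whole approach can be sketched end to end without a network request. This is an illustrative sketch and not the original poster's code: the `SAMPLE_HTML` string and the `field` helper are assumptions introduced here, and the markup is a trimmed copy of the question's sample. It shows why the leading `.` matters: from any element, `//input` searches the whole document, while `.//input` searches only inside the current `house`.

```python
from lxml import html

# Hypothetical inline copy of the question's markup (trimmed to the
# hidden inputs), so that no HTTP request is needed.
SAMPLE_HTML = """
<div class="houses">
    <input type="hidden" class="houseNumber" value="107">
    <input type="hidden" class="houseState" value="MT">
    <input type="hidden" class="houseStatus" value="Occupied">
</div>
<div class="houses">
    <input type="hidden" class="houseNumber" value="237">
    <input type="hidden" class="houseState" value="MT">
    <input type="hidden" class="houseStatus" value="Occupied">
</div>
<div class="houses">
    <input type="hidden" class="houseNumber" value="104">
    <input type="hidden" class="houseState" value="MT">
    <input type="hidden" class="houseStatus" value="Vacant">
</div>
"""


def field(house, css_class):
    """Hypothetical helper: value of the single <input> with this class inside house."""
    # The leading "." makes the search relative to `house`.
    return house.xpath('.//input[@class=$cls]/@value', cls=css_class)[0]


doc = html.document_fromstring(SAMPLE_HTML)
houses = doc.find_class('houses')

# Absolute path: starts at the document root, so it matches every house,
# even though it is called on the first house only.
everything = houses[0].xpath('//input[@class="houseNumber"]/@value')

# Relative path: one row per house.
rows = [
    [field(h, 'houseNumber'), field(h, 'houseState'), field(h, 'houseStatus')]
    for h in houses
]

print(everything)
print(rows)
```

Building each row inside the loop also removes the need to transpose afterwards, at the cost of no longer having one list per field.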
