Getting a column from a poorly formed table with BeautifulSoup and Python

def scraper(first, second, third):
    url = "https://www.austintexas.gov/financeonline/contract_catalog/OCCViewMA.cfm?cd=%s&dd=%d&id=%s" % (first, second, third)
    soup = BeautifulSoup(urllib2.urlopen(url).read())
    foundtext = soup.find('td', text="Commodity Description")
    table = foundtext.findPrevious('table')
    rows = table.findAll('tr')
    second_column = []
    for row in rows:
        print row.contents

1 answer

User

#1 · posted 2024-10-01 02:30:33

For each row found, look up all of its td elements and pick the one you need by index:

table = soup.find('td', text="Commodity Description").find_parent("table")
for row in table.select("tr")[2:]:  # skipping the header rows
    cell = row.find_all("td")[1]
    print(cell.get_text())
    print("  ")

This prints:

WATERLINE REPLACEMENTCONSTRUCTION, PIPELINEPER YUEJIAO LIU, ADD THE REMAINING FUNDS BACK INTO THIS FUNDING LINE  //   PEMBERTON HEIGHTS PHASE III PROJECT  ++   ENC.  $53,209.97
  
WATERLINE REPLACEMENTCONSTRUCTION, PIPELINEPEMBERTON HEIGHTS PHASE III PROJECT
  
WATERLINE REPLACEMENTCONSTRUCTION, PIPELINEPEMBERTON HEIGHTS PHASE III PROJECT
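
The same idea can be tried without hitting the live page. The following is a minimal sketch, assuming the bs4 package is installed; the sample markup, the helper name get_column, and its skip_rows parameter are hypothetical and only stand in for the real page. It also guards against rows that have fewer cells than expected, which is a common problem in poorly formed tables.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page (an assumption,
# not the actual austintexas.gov HTML).
SAMPLE_HTML = """
<table>
  <tr><td colspan="3">Contract Lines</td></tr>
  <tr><td>Line</td><td>Commodity Description</td><td>Amount</td></tr>
  <tr><td>1</td><td>WATERLINE REPLACEMENT</td><td>$100.00</td></tr>
  <tr><td>2</td></tr>
  <tr><td>3</td><td>PIPELINE CONSTRUCTION</td><td>$250.00</td></tr>
</table>
"""


def get_column(html, header_text, index, skip_rows=2):
    """Return the text of the cell at `index` in every data row.

    `get_column` and `skip_rows` are illustrative names, not part of bs4.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Locate the table through a cell whose text is known.
    header_cell = soup.find("td", string=header_text)
    if header_cell is None:
        return []
    table = header_cell.find_parent("table")

    values = []
    for row in table.find_all("tr")[skip_rows:]:  # skip the header rows
        cells = row.find_all("td")
        # Malformed rows may have too few cells; skip them instead of
        # raising IndexError.
        if len(cells) > index:
            values.append(cells[index].get_text(strip=True))
    return values


column = get_column(SAMPLE_HTML, "Commodity Description", 1)
print(column)
```

Collecting the values in a list rather than printing inside the loop makes the column easy to reuse later; the row containing a single cell is simply skipped.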
