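I am trying to parse tabular content from some HTML elements and arrange it in a custom way, so that I can later write it to a CSV file accordingly.
The table looks almost exactly like this.
The HTML elements look like this (truncated):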
<tr>
<td align="center" colspan="4" class="header">ATLANTIC</td>
</tr>
<tr>
<td class="black10bold">Facility</td>
<td class="black10bold">Type</td>
<td class="black10bold">Funding</td>
</tr>
<tr>
<td style="width: 55%">
<a href="fsFacilityDetails.aspx?item=NJ60104"> Complete Care at Linwood, LLC </a>
</td>
</tr>
<tr>
<td style="width: 55%">
<a href="fsFacilityDetails.aspx?item=NJ60102">The Health Center At Galloway</a>
</td>
</tr>
<tr>
<td align="center" colspan="4" class="header">BERGEN</td>
</tr>
<tr>
<td class="black10bold">Facility</td>
<td class="black10bold">Type</td>
<td class="black10bold">Funding</td>
</tr>
<tr>
<td style="width: 55%">
<a href="fsFacilityDetails.aspx?item=30201">The Actors Fund Homes</a>
</td>
</tr>
<tr>
<td style="width: 55%">
<a href="fsFacilityDetails.aspx?item=NJAL02007"> Actors Fund Home, The </a>
</td>
</tr>
This is what I have tried so far:
for item in soup.select("tr"):
    try:
        header = item.select_one("td.header").text
    except AttributeError:
        header = ""
    try:
        item_name = item.select_one("td > a").text
    except AttributeError:
        item_name = ""
    print(item_name, header)
The output it produces:
ATLANTIC
Complete Care at Linwood, LLC
The Health Center At Galloway
BERGEN
The Actors' Fund Homes
Actors Fund Home, The
The output I want:
Complete Care at Linwood, LLC ATLANTIC
The Health Center At Galloway ATLANTIC
The Actors' Fund Homes BERGEN
Actors Fund Home, The BERGEN
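One way to get there, sketched under the assumption that each header row applies to every facility row until the next header row appears: remember the last header seen and attach it to each facility name. The `html` sample and the `pair_with_header` helper are illustrative names, not part of the original post, and the sample only mirrors the truncated markup shown in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample mirroring the truncated markup from the question.
html = """
<table>
<tr><td align="center" colspan="4" class="header">ATLANTIC</td></tr>
<tr><td class="black10bold">Facility</td><td class="black10bold">Type</td></tr>
<tr><td style="width: 55%"><a href="fsFacilityDetails.aspx?item=NJ60104"> Complete Care at Linwood, LLC </a></td></tr>
<tr><td align="center" colspan="4" class="header">BERGEN</td></tr>
<tr><td style="width: 55%"><a href="fsFacilityDetails.aspx?item=30201">The Actors Fund Homes</a></td></tr>
</table>
"""


def pair_with_header(soup):
    """Return (facility_name, header) tuples, carrying the last header forward."""
    rows = []
    header = ""  # most recent header seen; persists across loop iterations
    for item in soup.select("tr"):
        header_cell = item.select_one("td.header")
        if header_cell is not None:
            # A header row: remember it and move on to the next row.
            header = header_cell.get_text(strip=True)
            continue
        link = item.select_one("td > a")
        if link is not None:
            # A facility row: pair its name with the current header.
            rows.append((link.get_text(strip=True), header))
    return rows


soup = BeautifulSoup(html, "html.parser")
rows = pair_with_header(soup)
for name, header in rows:
    print(name, header)
```

Because `header` is only reassigned when a header row is found, it is not reset to an empty string on the facility rows, which is what the attempt above does. The resulting list of tuples could then be passed to `csv.writer(...).writerows(rows)` to produce the CSV file.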
This produces the output in the format you want.
Hope it helps.