Cannot chain find and find_all in BeautifulSoup

<tr> <td>Dresser !**<a href="/wiki/Louise_Dresser" title="Louise Dresser">Louise Dresser</a>**</td> <td>Ship !<a href="/wiki/A_Ship_Comes_In" title="A Ship Comes In">A Ship Comes In</a></td> <td>Pleznik !Mrs. Pleznik</td> </tr> <tr> <td>Swanson !<a href="/wiki/Gloria_Swanson" title="Gloria Swanson">Gloria Swanson</a></td> <td><a href="/wiki/Sadie_Thompson" title="Sadie Thompson">Sadie Thompson</a></td> <td>Thompson !Sadie Thompson</td> </tr> <tr> <th scope="row" rowspan="6" style="text-align:center"><a href="/wiki/1928_in_film" title="1928 in film">1928</a>/<a href="/wiki/1929_in_film" title="1929 in film">29</a> <a href="/wiki/2nd_Academy_Awards" title="2nd Academy Awards">(2nd)</a></th> <td style="background:#FAEB86">Pickford !**<a href="/wiki/Mary_Pickford" title="Mary Pickford">Mary Pickford</a>** <img alt="Award winner" src="//upload.wikimedia.org/wikipedia/commons/f/f9/Double-dagger-14-plain.png" width="9" height="14" data-file-width="9" data-file-height="14" /></td>

1 answer

User

#1 · Posted 2024-10-01 09:36:15

import requests
from bs4 import BeautifulSoup

def getActresses(URL):
    res = requests.get(URL)
    res.raise_for_status()  # stop early on an HTTP error

    soup = BeautifulSoup(res.content, "lxml")
    # find() returns a single Tag, or None when nothing matches
    table = soup.find("table", {"class": "wikitable sortable"})
    if table is None:
        print("Error creating/navigating soup object")
        return

    # find_all() returns a list-like ResultSet, so loop over each level
    for tr in table.find_all("tr"):
        for td in tr.find_all("td"):
            for a in td.find_all("a"):
                print(a.text)

getActresses("https://en.wikipedia.org/wiki/Academy_Award_for_Best_Actress")

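I used .text rather than get_text(); on a tag the two return the same string. Apologies, I also used the requests module to fetch the page for this demonstration.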

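find_all always returns a list (a ResultSet), never a single tag, so you cannot call find or find_all directly on its result. You have to loop over the list and call the method on each element. find, by contrast, returns a single tag (or None), so find followed by find_all can be chained.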
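To make the difference concrete, here is a minimal sketch that runs without network access. The inline HTML is a hypothetical cut-down table modeled on the rows in the question, the helper name link_texts is made up for illustration, and the built-in "html.parser" is swapped in for "lxml" so that nothing extra needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature table, modeled on the rows shown in the question.
HTML = """
<table class="wikitable sortable">
  <tr><td><a href="/wiki/Louise_Dresser">Louise Dresser</a></td>
      <td><a href="/wiki/A_Ship_Comes_In">A Ship Comes In</a></td></tr>
  <tr><td><a href="/wiki/Gloria_Swanson">Gloria Swanson</a></td>
      <td><a href="/wiki/Sadie_Thompson">Sadie Thompson</a></td></tr>
</table>
"""

def link_texts(html):
    """Return the text of every link inside the first wikitable (illustrative helper)."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="wikitable")  # a single Tag, or None
    if table is None:
        return []
    # find() gave us one Tag, so calling find_all() on it is fine.
    # Each find_all() gives a list, so we iterate instead of chaining.
    return [a.get_text()
            for tr in table.find_all("tr")
            for a in tr.find_all("a")]

names = link_texts(HTML)
print(names)
```

If the nesting does not matter, table.find_all("a") on the single table tag would collect the same links in one call, since find_all searches all descendants.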
