How do I find the nearest preceding sibling <li> for each <li class=""><a>?

<li><h4>A0: Pronouns</h4></li>
<li class=""> <a>bb</a> <a>cc</a> </li>
<li class=""> <a>dd</a> <a>ee</a> </li>
<li><h4>A0: Verbs Tenses & Conjugation</h4></li>
<li class=""> <a>ff</a> <a>gg</a> </li>
<li class=""> <a>hh</a> <a>kk</a> </li>
<li class=""> <a>jj</a> <a>ii</a> </li>

import requests
from bs4 import BeautifulSoup

session = requests.Session()

headers = {
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'en-US,en;q=0.8',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
}

link = 'https://french.kwiziq.com/revision/grammar'

r = session.get(link, headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')

for d in soup.select('.callout-body > ul li > a:nth-of-type(1)'):
    print(d)

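2 answers

User

Answer 1 · edited 2024-10-16 17:24:34

You can use .find_previous('h4'), which returns the closest h4 that appears before the link in document order: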

import requests
from bs4 import BeautifulSoup


url = "https://french.kwiziq.com/revision/grammar"

soup = BeautifulSoup(requests.get(url).content, "html.parser")
for a in soup.select(".callout  li > a:nth-of-type(1)"):
    print(
        "{:<70} {}".format(
            a.get_text(strip=True), a.find_previous("h4").get_text(strip=True)
        )
    )

Prints:

Saying your name: Je m'appelle, Tu t'appelles, Vous vous appelez       A0: Pronouns
Tu and vous are used for three types of you                            A0: Pronouns
Je becomes j' with verbs beginning with a vowel (elision)              A0: Verbs Tenses & Conjugation
J'habite à [city] = I live in [city]                                   A0: Idioms, Idiomatic Usage, and Structures
Je viens de + [city] = I'm from + [city]                               A0: Idioms, Idiomatic Usage, and Structures
Conjugate être (je suis, tu es, vous êtes) in Le Présent (present tense) A0: Verbs Tenses & Conjugation
Make most adjectives feminine by adding -e                             A0: Adjectives & Adverbs
Nationalities differ depending on whether you're a man or a woman (adjectives) A0: Adjectives & Adverbs
Conjugate avoir (j'ai, tu as, vous avez) in Le Présent (present tense) A0: Verbs Tenses & Conjugation
Using un, une to say "a" (indefinite articles)                         A0: Nouns & Articles

...

French vocabulary and grammar lists by theme                           C1: Idioms, Idiomatic Usage, and Structures
French Fill-in-the-Blanks Tests                                        C1: Idioms, Idiomatic Usage, and Structures
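
To try the same idea without hitting the live site, here is a minimal offline sketch that runs .find_previous("h4") against the sample markup from the question. The wrapping <ul>, the HTML constant and the helper name first_links_with_heading are assumptions added for illustration; they are not part of the original answer.

```python
from bs4 import BeautifulSoup

# Sample markup from the question, wrapped in a <ul> (assumed) so it is
# a complete fragment; "&" is written as "&amp;" to keep the HTML valid.
HTML = """
<ul>
  <li><h4>A0: Pronouns</h4></li>
  <li class=""><a>bb</a> <a>cc</a></li>
  <li class=""><a>dd</a> <a>ee</a></li>
  <li><h4>A0: Verbs Tenses &amp; Conjugation</h4></li>
  <li class=""><a>ff</a> <a>gg</a></li>
  <li class=""><a>hh</a> <a>kk</a></li>
  <li class=""><a>jj</a> <a>ii</a></li>
</ul>
"""


def first_links_with_heading(html):
    """Pair the first <a> of each <li> with the closest <h4> before it."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for a in soup.select("li > a:nth-of-type(1)"):
        # find_previous walks backwards through the document and
        # returns None when no earlier <h4> exists.
        heading = a.find_previous("h4")
        pairs.append(
            (a.get_text(strip=True),
             heading.get_text(strip=True) if heading else None)
        )
    return pairs


pairs = first_links_with_heading(HTML)
for text, heading in pairs:
    print(f"{text:<5} {heading}")
```

Note the None check: a link that appears before any h4 has no heading to report, so the sketch returns None for it instead of raising AttributeError.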

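User

Answer 2 · edited 2024-10-16 17:24:34

You can use :is in the CSS selector to pick up the headings and the links in a single pass, in document order: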

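from bs4 import BeautifulSoup as soup
from collections import defaultdict

# `html` is assumed to hold the markup shown in the question
soup1 = soup(html, 'html.parser')

# d maps each h4 heading to the first <a> text of every li below it;
# l tracks the most recent heading seen
d, l = defaultdict(list), None
for i in soup1.select('li > :is(a, h4):nth-of-type(1)'):
    if i.name == 'h4':
        l = i.get_text(strip=True)
    else:
        d[l].append(i.get_text(strip=True))

print(dict(d))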

Output:

{'A0: Pronouns': ['bb', 'dd'], 'A0: Verbs Tenses & Conjugation': ['ff', 'hh', 'jj']}

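The output stores the first a of every li that belongs to each grammar section. If you only want a one-to-one mapping from each section to its first entry, you can use a dictionary comprehension: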

new_d = {a:b for a, (b, *_) in d.items()}

Output:

{'A0: Pronouns': 'bb', 'A0: Verbs Tenses & Conjugation': 'ff'}
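
Neither answer walks the sibling axis literally, which is what the question title asks for. As a hedged alternative (a different technique from both answers above), the sketch below uses find_previous_siblings("li") and stops at the first earlier sibling that contains an h4. The wrapping <ul>, the HTML constant and the helper name heading_for_each_item are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup from the question, wrapped in a <ul> (assumed).
HTML = """
<ul>
  <li><h4>A0: Pronouns</h4></li>
  <li class=""><a>bb</a> <a>cc</a></li>
  <li class=""><a>dd</a> <a>ee</a></li>
  <li><h4>A0: Verbs Tenses &amp; Conjugation</h4></li>
  <li class=""><a>ff</a> <a>gg</a></li>
  <li class=""><a>hh</a> <a>kk</a></li>
  <li class=""><a>jj</a> <a>ii</a></li>
</ul>
"""


def heading_for_each_item(html):
    """Pair each link <li>'s first <a> with its nearest preceding heading <li>."""
    soup = BeautifulSoup(html, "html.parser")
    result = []
    for li in soup.find_all("li"):
        first_a = li.find("a", recursive=False)
        if first_a is None:
            continue  # heading rows have no direct <a> child
        heading = None
        # find_previous_siblings returns earlier siblings, nearest first.
        for prev in li.find_previous_siblings("li"):
            h4 = prev.find("h4")
            if h4 is not None:
                heading = h4.get_text(strip=True)
                break
        result.append((first_a.get_text(strip=True), heading))
    return result


items = heading_for_each_item(HTML)
for text, heading in items:
    print(f"{text:<5} {heading}")
```

Unlike .find_previous("h4"), this search never leaves the parent list, so a link in a list with no heading row gets None rather than a heading borrowed from an earlier list. Whether that matters depends on how the real page nests its lists.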
