Unable to produce a customized output after scraping data from a webpage

Posted 2024-09-24 02:19:42

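I'm trying to append data to a dictionary while scraping it from a webpage, but my current output is not arranged the way I want. The webpage is the one whose URL appears in the code below.

I've tried: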

import requests
from bs4 import BeautifulSoup
from pprint import pprint

url = 'https://elllo.org/english/grammar/L1-01-AimeeTodd-Intros-BeVerb.htm'
data = []

r = requests.get(url)
soup = BeautifulSoup(r.text,"lxml")
for item in soup.select("#transcript p"):
    d = {}

    if "Aimee:" in item.text:
        d['Aimee'] = item.text.replace("Aimee:","").strip()

    elif "Todd:" in item.text:
        d['Todd'] = item.text.replace("Todd:","").strip()

    data.append(d)

pprint(data)

The result I get:

[{'Aimee': 'So Todd, where are you from?'},
 {'Todd': "I am from the U.S., I am from San Francisco. It's on the west "
          'coast.'},
 {'Aimee': 'And what do you do?'},
 {'Todd': "I'm an English teacher. Also, I create Elllo. I work on Elllo a "
          'lot.'}

Expected output:

[{'Aimee': 'So Todd, where are you from?','Todd': "I am from the U.S., I am from San Francisco. It's on the west "
          'coast.'},

 {'Aimee': 'And what do you do?','Todd': "I'm an English teacher. Also, I create Elllo. I work on Elllo a "
          'lot.'},

How can I produce the second output?

1 answer

Answer 1 · Posted 2024-09-24 02:19:42

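import requests
from bs4 import BeautifulSoup
from pprint import pprint

url = 'https://elllo.org/english/grammar/L1-01-AimeeTodd-Intros-BeVerb.htm'
data = []

r = requests.get(url)
soup = BeautifulSoup(r.text,"lxml")
d = {}  # create the dict outside the loop so it holds one full exchange
for item in soup.select("#transcript p"):

    if "Aimee:" in item.text:
        d['Aimee'] = item.text.replace("Aimee:","").strip()

    elif "Todd:" in item.text:
        d['Todd'] = item.text.replace("Todd:","").strip()
        # Todd's reply completes the pair: store it, then start a fresh dict
        data.append(d)
        d = {}

pprint(data)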
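
The key change from the question's code is that the dictionary is created once before the loop and only appended (and reset) when Todd's line is seen, so each dictionary ends up holding both speakers. The following is a minimal offline sketch of that same pairing logic. It assumes the transcript paragraphs have already been extracted as plain strings; the helper name `pair_turns` and the `sample` list are hypothetical and exist only for illustration.

```python
def pair_turns(lines, first="Aimee", second="Todd"):
    """Group alternating 'Speaker: text' lines into one dict per exchange.

    Hypothetical helper: mirrors the loop in the answer, but works on a
    list of strings instead of BeautifulSoup elements.
    """
    data = []
    d = {}
    for line in lines:
        if f"{first}:" in line:
            d[first] = line.replace(f"{first}:", "").strip()
        elif f"{second}:" in line:
            d[second] = line.replace(f"{second}:", "").strip()
            # The second speaker's reply closes the exchange:
            # store the pair and start a new dict.
            data.append(d)
            d = {}
    return data


# Sample lines taken from the output shown in the question.
sample = [
    "Aimee: So Todd, where are you from?",
    "Todd: I am from the U.S., I am from San Francisco. It's on the west coast.",
    "Aimee: And what do you do?",
    "Todd: I'm an English teacher. Also, I create Elllo. I work on Elllo a lot.",
]

result = pair_turns(sample)
for pair in result:
    print(pair)
```

Note that with this approach a trailing Aimee line that Todd never answers is dropped, because a dictionary is appended only on Todd's turn. If such lines matter, one option is to append any non-empty `d` after the loop finishes.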
