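I am trying to extract some text from HTML (more specifically, from an MDX file, a kind of dictionary file) using BeautifulSoup in Python. The code runs fine, but I don't get the output I need. My code is as follows: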
from bs4 import BeautifulSoup
from lxml import etree
html = '''
abandon <link href="LM5style_vanilla.css" rel="stylesheet" type="text/css" /><link href="LM5style.css" rel="stylesheet" type="text/css" /><link href="LM5style_switch.css" rel="stylesheet" type="text/css" /><link href="LM5style_show.css" rel="stylesheet" type="text/css" /><script src="jquery-3.2.1.min.js" charset="utf-8" type="text/javascript" language="javascript"></script><script src="LM5Switch.js" charset="utf-8" type="text/javascript" language="javascript"></script><span class="lm5ppbody"><div class="entry_content"><h1 class="pagetitle" pagetype="0">abandon</h1><div class="dictionary"><div class="wordfams"><span class="LDOCE5pp_sensefold foldsign_fold"><span class="asset_intro">Word family</span><span class="foldsign"><span class="foldblank"> </span><span class="foldsignbar1"></span><span class="foldsignbar2"></span></span></span><span class="LDOCE_word_family" style="display:none;"> <span class="pos">noun</span> <span class="w" title="abandonment">abandonment</span> <span class="pos">adjective</span> <a class="crossRef w" href="bword://abandoned" title="abandoned">abandoned</a> <span class="pos">verb</span> <span class="w" title="abandon">abandon</span> </span></div><!-- End of DIV wordfams--><span class="dictentry"><span class="dictionary_intro span"><span class="lm5ppMenu"><span id="lm5ppMenu_logo"> </span><span class="lm5ppMenu_title"><span class="en_title">Longman Dictionary of Contemporary English 5++</span><span class="cn_title"><span class="cn_txt_menu">朗文当代英语 5++</span></span></span><span class="lm5ppMenu_title mini"><span class="en_title">LDOCE 5++</span><span class="cn_title"><span class="cn_txt_menu">朗文 5++</span></span></span></span></span><span class="dictlink"><a name="abandon__entry_0__a"></a><span class="ldoceEntry Entry" id="abandon__entry_0"><span class="frequent Head"><span class="HWD">a<span class="HYP"><span class="HYP">·</span></span>ban<span class="HYP"><span class="HYP">·</span></span>don</span><span class="HOMNUM">1</span><a 
class="PronCodes" href="sound://media/english/ameProns/abandon1.mp3"><span class="neutral span"> /</span><span class="PRON">əˈbændən</span><span class="neutral span">/</span></a> <span class="tooltip LEVEL" title="Core vocabulary: Medium-frequency"> ●●○</span> <span class="FREQ" title="Top 3000 written words">W3</span> <span class="AC" title="Academic Word list">AWL</span><span class="lm5pp_POS"> verb</span><span class="GRAM"><span class="neutral span"> [</span>transitive<span class="neutral span">]</span></span><a class="speaker brefile fa fa-volume-up" data-src-mp3="/media/english/breProns/abandon_v0205.mp3" href="sound://media/english/breProns/abandon_v0205.mp3" title="Play British pronunciation of abandon"> </a><a class="speaker amefile fa fa-volume-up" data-src-mp3="/media/english/ameProns/abandon1.mp3" href="sound://media/english/ameProns/abandon1.mp3" title="Play American pronunciation of abandon"> </a></span><a name="abandon__1__a"></a><span class="newline Sense" id="abandon__1"><span class="LDOCE5pp_sensefold"><span class="sensenum span">1</span><span class="foldsign"><span class="foldblank"> </span><span class="foldsignbar1"></span><span class="foldsignbar2"></span></span></span> <span class="ACTIV">LEAVE A RELATIONSHIP</span><span class="DEF LDOCE_switch_lang switch_siblings">to leave someone, especially someone you are <a class="defRef" href="bword://responsible" title="responsible">responsible</a> for</span><span class="DEF LDOCE_switch_lang switch_siblings"> <span class="cn_txt"> 抛弃,遗弃〔某人〕</span></span><span class="RELATEDWD"><span class="neutral span"> → </span><a href="bword://abandoned"> abandoned</a></span><span class="EXAMPLE"><a class="speaker exafile fa fa-volume-up" href="sound://media/english/exaProns/p008-000963493.mp3" title="Play Example"> </a><span class="english LDOCE_switch_lang switch_children">How could she abandon her own child?<span class="cn_txt"> 她怎么能抛弃自己的孩子呢?</span></span></span></span><a name="abandon__2__a"></a><span 
class="newline Sense" id="abandon__2"><span class="LDOCE5pp_sensefold"><span class="sensenum span">2</span><span class="foldsign"><span class="foldblank"> </span><span class="foldsignbar1"></span><span class="foldsignbar2"></span></span></span> <span class="ACTIV">LEAVE A PLACE</span><span class="DEF LDOCE_switch_lang switch_siblings">to go away from a place, <a class="defRef" href="bword://vehicle" title="vehicle">vehicle</a> etc permanently, especially because the situation makes it <a class="defRef" href="bword://impossible" title="impossible">impossible</a> for you to stay</span><span class="DEF LDOCE_switch_lang switch_siblings"> <span class="cn_txt"> 离弃,逃离〔某地方、交通工具等〕</span></span><span class="SYN"> <span class="synopp span">SYN</span><a href="bword://leave"> leave</a></span><span class="RELATEDWD"><span class="neutral span">, → </span><a href="bword://abandoned"> abandoned</a></span><span class="EXAMPLE"><a class="speaker exafile fa fa-volume-up" href="sound://media/english/exaProns/p008-000963497.mp3" title="Play Example"> </a><span class="english LDOCE_switch_lang switch_children">We had to abandon the car and walk the rest of the way.<span class="cn_txt"> 我们只好弃车,步行走完剩下的路。</span></span></span><span class="EXAMPLE"><a class="speaker exafile fa fa-volume-up" href="sound://media/english/exaProns/p008-000963498.mp3" title="Play Example"> </a><span class="english LDOCE_switch_lang switch_children">Fearing further attacks, most of the population had abandoned the city.<span class="cn_txt"> 因为害怕还要受到袭击,大多数市民已逃离该市。</span></span></span></span><a name="abandon__3__a"></a><span class="newline Sense" id="abandon__3"><span class="LDOCE5pp_sensefold"><span class="sensenum span">3</span><span class="foldsign"><span class="foldblank"> </span><span class="foldsignbar1"></span><span class="foldsignbar2"></span></span></span> <span class="ACTIV">STOP DOING something</span><span class="DEF LDOCE_switch_lang switch_siblings">to stop doing something because there are too many 
problems and it is impossible to continue</span><span class="DEF LDOCE_switch_lang switch_siblings"> <span class="cn_txt"> 放弃,中止</span></span><span class="EXAMPLE"><a class="speaker exafile fa fa-volume-up" href="sound://media/english/exaProns/p008-000963502.mp3" title="Play Example"> </a><span class="english LDOCE_switch_lang switch_children">The game had to be abandoned due to bad weather.<span class="cn_txt"> 由于天气不好,比赛不得不中止。</span></span></span><span class="EXAMPLE"><a class="speaker exafile fa fa-volume-up" href="sound://media/english/exaProns/p008-001732862.mp3" title="Play Example"> </a><span class="english LDOCE_switch_lang switch_children">They <span class="COLLOINEXA">abandoned</span> their <span class="COLLOINEXA">attempt</span> to recapture the castle.<span class="cn_txt"> 他们放弃了夺回城堡的努力。</span></span></span><span class="EXAMPLE"><a class="speaker exafile fa fa-volume-up" href="sound://media/english/exaProns/p008-001776706.mp3" title="Play Example"> </a><span class="english LDOCE_switch_lang switch_children">Because of the fog they <span class="COLLOINEXA">abandoned</span> their <span class="COLLOINEXA"<span>someone, </span><span>you </span></div></div>\n</span>\n
'''
soup = BeautifulSoup(html, 'lxml')
context = soup.find_all(class_="english LDOCE_switch_lang switch_children")
print(context)
# This is what it prints: [<span class="english LDOCE_switch_lang switch_children">How could she abandon her own child?<span class="cn_txt"> 她怎么能抛弃自己的孩子呢?</span></span>, <span class="english LDOCE_switch_lang switch_children">We had to abandon the car and walk the rest of the way.<span class="cn_txt"> 我们只好弃车,步行走完剩下的路。</span></span>, <span class="english LDOCE_switch_lang switch_children">Fearing further attacks, most of the population had abandoned the city.<span class="cn_txt"> 因为害怕还要受到袭击,大多数市民已逃离该市。</span></span>,
What I need is every English example sentence with its Chinese translation, like this:
How could she abandon her own child?
她怎么能抛弃自己的孩子呢?
I have been trying for several days. Please help me. Thank you very much!
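I hope I have understood your question correctly. If you want to extract the English phrases and their Chinese counterparts, you can use this example (I don't know any Chinese, so I can't verify that the output is correct):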
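One way this could look, as a minimal sketch rather than a definitive solution: each example `<span>` holds the English sentence as its own text and the Chinese translation in a nested `cn_txt` span, so the translation can be read first and then detached, leaving only the English. The names `sample_html` and `extract_pairs` are hypothetical, the markup is a shortened stand-in for the entry in the question, and `html.parser` is used instead of `lxml` only to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical stand-in for the dictionary entry in the question.
sample_html = '''
<span class="EXAMPLE"><span class="english LDOCE_switch_lang switch_children">How could she abandon her own child?<span class="cn_txt"> 她怎么能抛弃自己的孩子呢?</span></span></span>
<span class="EXAMPLE"><span class="english LDOCE_switch_lang switch_children">They <span class="COLLOINEXA">abandoned</span> their <span class="COLLOINEXA">attempt</span> to recapture the castle.<span class="cn_txt"> 他们放弃了夺回城堡的努力。</span></span></span>
'''


def extract_pairs(html):
    """Return a list of (english, chinese) tuples, one per example sentence."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    # class_="english" matches any tag whose class list contains "english".
    for example in soup.find_all("span", class_="english"):
        cn = example.find("span", class_="cn_txt")
        chinese = ""
        if cn is not None:
            chinese = cn.get_text().strip()
            cn.extract()  # detach the translation so only English text remains
        # get_text() without strip=True keeps the spaces around nested spans.
        english = example.get_text().strip()
        pairs.append((english, chinese))
    return pairs


for english, chinese in extract_pairs(sample_html):
    print(english)
    print(chinese)
```

Calling `get_text().strip()` instead of `get_text(strip=True)` is deliberate: the latter strips every text fragment before joining, which would glue together words around nested tags such as `COLLOINEXA`.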
Prints:
Just add a loop like this:
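A possible shape for such a loop, sketched under assumptions: it iterates over the `context` list from the question's `find_all` call and, for each tag, separates the text nodes inside the `cn_txt` span from the rest without modifying the tree. The one-entry `sample_html` and the `lines` list are hypothetical, and `html.parser` stands in for `lxml` to keep the sketch dependency-free.

```python
from bs4 import BeautifulSoup

# Hypothetical one-entry sample using the same class names as the question.
sample_html = (
    '<span class="english LDOCE_switch_lang switch_children">'
    'We had to abandon the car and walk the rest of the way.'
    '<span class="cn_txt"> 我们只好弃车,步行走完剩下的路。</span></span>'
)

soup = BeautifulSoup(sample_html, "html.parser")
context = soup.find_all(class_="english LDOCE_switch_lang switch_children")

lines = []
for tag in context:
    cn = tag.find(class_="cn_txt")
    # Keep only the text nodes that are not inside the cn_txt span.
    english = "".join(
        s for s in tag.find_all(string=True)
        if cn is None or not any(p is cn for p in s.parents)
    ).strip()
    lines.append(english)
    if cn is not None:
        lines.append(cn.get_text().strip())

# One English line followed by its Chinese line, as asked for in the question.
print("\n".join(lines))
```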
Result: