Python regex is not returning what I'm looking for

Posted 2024-09-27 00:21:48

I'm scraping a website and want to get the content inside a specific tag. The tag I want is: <pre class="js-tab-content"></pre>

Here is my code:

import re
import urllib.request

request = urllib.request.Request(url=url)
response = urllib.request.urlopen(request)
content = response.read().decode()

tab = re.findall(r'<pre class="js-tab-content">(.*?)</pre>', content)

print(tab)

When I print tab, I get an empty list: []

Here is the content I'm searching:

.... <pre class="js-tab-content"><i></i><span>Em</span>              <span>D</span>              <span>Em</span>             <span>D</span>

Lift Mac Cahir Og your face, brooding o'er the old disgrace

     <span>Em</span>                  <span>D</span>                       <span>G</span>-<span>D</span>-<span>Em</span>     

That black Fitzwilliam stormed your place and drove you to the Fern.

<span>Em</span>              <span>D</span>           <span>Em</span>                         <span>D</span>

Gray said victory was sure, soon the firebrand he'd secure

<span>Em</span>                <span>D</span>          <span>G</span>-<span>D</span>-<span>Em</span>

Until he met at Glenmalure, Feach Mac Hugh O'Byrne 



Chorus:

<span>G</span>                                <span>D</span>

Curse and swear, Lord Kildare, Feach will do what Feach will dare

<span>G</span>                               <span>G</span>-<span>D</span>-<span>Em</span>

Now Fitzwilliam have a care, fallen is your star low

<span>G</span>                                       <span>D</span> 

Up with halbert, out with sword, on we go for by the Lord

<span>G</span>                               <span>G</span>-<span>D</span>-<span>Em</span>

Feach Mac Hugh has given his word: Follow me up to Carlow 



From Tassagart ____to Clonmore flows a stream of Saxon Gore

Great is Rory Og O'More at sending loons to Hades.

White is sick and Lane is fled, now for black Fitzwilliams head

We'll send it over, dripping red, to Liza and her ladies



See the swords of Glen Imayle flashing o'er the English Pale

See all the children of the Gael, beneath O'Byrne's banners

Rooster of the fighting stock, would you let an Saxon cock

Crow out upon an Irish rock, fly up and teach him manners

</pre> ....

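I don't understand why an empty list is returned instead of a list containing the content as a string.

I searched online for about half an hour and couldn't find anything that helped.

Sorry if I look stupid here and the answer is obvious!

Anyway, thanks in advance!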


2 Answers
tab = re.findall(r'<pre class="js-tab-content">(.*?)</pre>', content, re.S)

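The . needs re.S (also spelled re.DOTALL) to match newline characters. Without that flag, . matches any character except a newline, so the pattern can never reach </pre> when the content spans several lines.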
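
To see the difference on a small input, here is a minimal sketch. The sample string is made up for illustration and only stands in for the scraped page.

```python
import re

# Hypothetical sample: a <pre> block whose body spans several lines.
sample = '<pre class="js-tab-content"><span>Em</span>\nLift Mac Cahir Og your face\n</pre>'

pattern = r'<pre class="js-tab-content">(.*?)</pre>'

# Without a flag, "." stops at "\n", so the pattern cannot reach </pre>.
without_flag = re.findall(pattern, sample)

# With re.S (alias of re.DOTALL), "." also matches "\n".
with_flag = re.findall(pattern, sample, re.S)

print(without_flag)  # []
print(with_flag)
```

The first call returns an empty list, which is the behaviour described in the question. The second returns one string containing everything between the opening and closing pre tags, inner span tags included.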

OK, to add to the comments, here is how to extract the pre text in this case using the BeautifulSoup HTML parser:

from bs4 import BeautifulSoup

soup = BeautifulSoup(content, "html.parser")
print(soup.find("pre", class_="js-tab-content").get_text())
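
If installing bs4 is not an option, roughly the same idea can be sketched with the standard library's html.parser. This is a swapped-in alternative for illustration, not the answerer's method. The class name PreTextExtractor and the sample string are assumptions made up for this example.

```python
from html.parser import HTMLParser


class PreTextExtractor(HTMLParser):
    """Collect the text inside <pre class="js-tab-content">, dropping inner tags."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True while we are between <pre ...> and </pre>
        self.parts = []      # text fragments found inside the target <pre>

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "pre" and "js-tab-content" in classes:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "pre":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.parts.append(data)


# Hypothetical sample standing in for the downloaded page.
sample = (
    '<div><pre class="js-tab-content"><span>Em</span>  <span>D</span>\n'
    'Lift Mac Cahir Og your face\n</pre></div>'
)

extractor = PreTextExtractor()
extractor.feed(sample)
extractor.close()
text = "".join(extractor.parts)
print(text)
```

Like get_text(), this keeps the chord names and lyrics but discards the span tags around them. It assumes the page has no nested pre elements.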
