Python beaututiful soup“ResultSet”对象没有属性“get”

2024-10-01 15:36:57 发布

您现在位置:Python中文网/ 问答频道 /正文

我试图抓取一些链接的网站,并把他们写到一个文件后,他们已经清理干净。网站上的链接如下所示:

<a href="javascript:changeChannel('http://dr01-lh.akamaihd.net/i/dr01_0@147054/index_1700_av-b.m3u8', 20);">DR1</a><br>
<a href="javascript:changeChannel('http://dr02-lh.akamaihd.net/i/dr02_0@147055/index_1700_av-b.m3u8', 21);">DR2</a><br>
<a href="javascript:changeChannel('http://dr03-lh.akamaihd.net/i/dr03_0@147056/index_1700_av-b.m3u8', 701);">DR3</a><br>
<a href="javascript:changeChannel('http://dr06-lh.akamaihd.net/i/dr06_0@147059/index_1700_av-b.m3u8', 31);">DR Ultra</a><br>
<a href="javascript:changeChannel('http://dr04-lh.akamaihd.net/i/dr04_0@147057/index_1700_av-b.m3u8', 38);">DR K</a><br>
<a href="javascript:changeChannel('http://dr05-lh.akamaihd.net/i/dr05_0@147058/index_1700_av-b.m3u8', 50);">DR Ramasjang</a><br>

我可以用这个抓住它们:

^{pr2}$

给我这个输出:

[<a href="javascript:changeChannel('http://dr01-lh.akamaihd.net/i/dr01_0@147054/index_1700_av-b.m3u8', 20);">DR1</a>, <a href="javascript:changeChannel('http://dr02-lh.akamaihd.net/i/dr02_0@147055/index_1700_av-b.m3u8', 21);">DR2</a>, <a href="javascript:changeChannel('http://dr03-lh.akamaihd.net/i/dr03_0@147056/index_1700_av-b.m3u8', 701);">DR3</a>, <a href="javascript:changeChannel('http://dr06-lh.akamaihd.net/i/dr06_0@147059/index_1700_av-b.m3u8', 31);">DR Ultra</a>, <a href="javascript:changeChannel('http://dr04-lh.akamaihd.net/i/dr04_0@147057/index_1700_av-b.m3u8', 38);">DR K</a>, <a href="javascript:changeChannel('http://dr05-lh.akamaihd.net/i/dr05_0@147058/index_1700_av-b.m3u8', 50);">DR Ramasjang</a>]

现在我想清理一下,这样我只得到了“”之间的http://部分,而这正是问题的症结所在。在

我试过了

fullink = links.get('href')

我得到的错误:

'ResultSet' object has no attribute 'get'

那么我如何从中获取链接呢?在


Tags: brhttpindexnetjavascripthrefdrm3u8
1条回答
网友
1楼 · 发布于 2024-10-01 15:36:57

Beautiful Soup documentation上写着:

AttributeError: 'ResultSet' object has no attribute 'foo' - This usually happens because you expected find_all() to return a single tag or string. But find_all() returns a list of tags and strings–a ResultSet object. You need to iterate over the list and look at the .foo of each one. Or, if you really only want one result, you need to use find() instead of find_all().

所以你可能想要full_links = [x.get("href") for x in links]。在

相关问题 更多 >

    热门问题