Finding a name in an HTML document with Beautiful Soup

Posted 2024-09-26 22:55:16


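Hey, I've been trying for a while and can't figure out how to use the soup.find function. The item I'm looking for is identified by "name": how do I find it when it sits inside something like this? The text continues above and below this excerpt.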

,"100002078216989":{"watermark":1488952059387,"action":1488954831234},"100002219436413":{"watermark":1488717577383,"action":1488717619845},"100003348640283":{"watermark":1489154862229,"action":1489158262774},"100004986371453":{"watermark":1489154862229,"action":1489154866065}}],[]],["MDynaTemplate","registerTemplates",[],[{"URLg3i":["MMessageSourceTextTemplate","\u003Cspan class=\"source mfss fcg\">[[text]]\u003C/span>"],"DHGslp":["MMessageSourceTextWithLinkTemplate","\u003Cspan class=\"mfss fcg\">\u003Ca href=\"[[\u0025UNESCAPED]][[download_href]]\">[[text]]\u003C/a>\u003C/span>"],"vSvEYy":["MReadReceiptTextTemplate","\u003Cspan class=\"mfss fcg\">[[text]]\u003C/span>"]}],[]],["MShortProfiles","set",[],["Value",{"id":"Value","name":"Value","firstName":"Value","vanity":"Value","thumbSrc":null


1 answer
User
#1 · Posted 2024-09-26 22:55:16

Here is my solution:

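from bs4 import BeautifulSoup

def get_name(self, file):
    # Parse the file; the "lxml" parser requires the lxml package.
    with open(file) as f:
        s = BeautifulSoup(f, "lxml")
    # Walk the text nodes inside the first <p> element.
    for item in s.find("p").strings:
        print("The base item: \n" + item + "\n")
        # Keep everything after the last 'name":"' marker.
        item = item.split("name\":\"")
        print("1st split: \n" + item[-1] + "\n")
        # Cut at the next '","' separator to isolate the value.
        item = item[-1].split("\",\"")
        print("2nd split: \n" + item[0] + "\n")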

Output:

The base item: 
"100002078216989":{"watermark":1488952059387,"action":1488954831234},"100002219436413":{"watermark":1488717577383,"action":1488717619845},"100003348640283":{"watermark":1489154862229,"action":1489158262774},"100004986371453":{"watermark":1489154862229,"action":1489154866065}}],[]],["MDynaTemplate","registerTemplates",[],[{"URLg3i":["MMessageSourceTextTemplate","\u003Cspan class=\"source mfss fcg\">[[text]]\u003C/span>"],"DHGslp":["MMessageSourceTextWithLinkTemplate","\u003Cspan class=\"mfss fcg\">\u003Ca href=\"[[\u0025UNESCAPED]][[download_href]]\">[[text]]\u003C/a>\u003C/span>"],"vSvEYy":["MReadReceiptTextTemplate","\u003Cspan class=\"mfss fcg\">[[text]]\u003C/span>"]}],[]],["MShortProfiles","set",[],["Value",{"id":"Value","name":"Value","firstName":"Value","vanity":"Value","thumbSrc":null

1st split: 
Value","firstName":"Value","vanity":"Value","thumbSrc":null

2nd split: 
Value

Your HTML file is not actually well-formed, so this is the best approach I could find. It should still do what you need.
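Because the data is really JSON-like text embedded in the page rather than HTML elements, a regular expression is another way to pull out the value. The following is only a minimal sketch, not part of the original answer: the helper name find_names, the pattern, and the sample string are assumptions made for illustration, and it matches only the exact key "name" (so "firstName" is skipped).

```python
import re

# Hypothetical helper (not from the original answer): collect every value
# stored under the JSON-style key "name" in a raw text blob.
# The value may contain backslash-escaped characters such as \" .
NAME_PATTERN = re.compile(r'"name":"((?:[^"\\]|\\.)*)"')


def find_names(text):
    """Return all values that follow a literal "name": key, in order."""
    return NAME_PATTERN.findall(text)


# Shortened sample modeled on the tail of the excerpt in the question.
sample = '["Value",{"id":"Value","name":"Value","firstName":"Value","thumbSrc":null'
print(find_names(sample))
```

With Beautiful Soup, the text passed to find_names could come from the parsed document's get_text() method, assuming the blob survives parsing as plain text the way it does in the answer above.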
