Regular expression not working in bs4

import re
import urllib2
from bs4 import BeautifulSoup

def gethtml(link):
    req = urllib2.Request(link, headers={'User-Agent': "Magic Browser"})
    con = urllib2.urlopen(req)
    html = con.read()
    return html

def findLatest():
    url = "https://watchseriesfree.to/serie/Madam-Secretary"
    head = "https://watchseriesfree.to"
    soup = BeautifulSoup(gethtml(url), 'html.parser')
    latep = soup.find("a", title=re.compile('Latest Episode'))
    soup = BeautifulSoup(gethtml(head + latep['href']), 'html.parser')
    firstVod = soup.findAll("tr",text=re.compile('rapidvideo'))
    return firstVod

print(findLatest())

1 Answer

User

Reply 1 · Posted 2024-06-24 12:49:06

The problem is this line:

firstVod = soup.findAll("tr",text=re.compile('rapidvideo'))

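When BeautifulSoup applies your text regex pattern, it tests the pattern against the .string attribute value of each matched tr element. Now, .string has an important caveat: when an element has more than one child, .string is None: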

If a tag contains more than one thing, then it’s not clear what .string should refer to, so .string is defined to be None.

Hence, you get no results.
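To see the caveat in isolation, here is a small sketch on made-up markup. The table below is hypothetical and not taken from the real page, and `string=` is the name that Beautiful Soup 4.4.0 and later use for the older `text=` argument:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only (not the real page).
html = (
    "<table>"
    "<tr><td>1</td><td><a href='/open/1'>rapidvideo.com</a></td></tr>"
    "<tr><td>rapidvideo.com</td></tr>"
    "</table>"
)
soup = BeautifulSoup(html, "html.parser")
multi_child_row, single_child_row = soup.find_all("tr")

# The first row has two <td> children, so .string is None.
multi_child_string = multi_child_row.string

# The second row has a single chain of children, so .string resolves to the text.
single_child_string = single_child_row.string

# The regex is tested against .string, so only the single-child row can match.
matches = soup.find_all("tr", string=re.compile("rapidvideo"))
```

Only the second row ends up in `matches`, even though both rows visibly contain "rapidvideo".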

What you can do instead is check the actual text of the tr elements by using a searching function and calling .get_text():

[code snippet missing from the source]
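A minimal sketch of what such a search might look like, assuming a function is passed to find_all() as the filter. The sample markup and the helper name `is_rapidvideo_row` are made up for illustration and are not taken from the original answer:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only (not the real page).
html = (
    "<table>"
    "<tr><td>1</td><td><a href='/open/1'>rapidvideo.com</a></td></tr>"
    "<tr><td>2</td><td><a href='/open/2'>openload.co</a></td></tr>"
    "</table>"
)
soup = BeautifulSoup(html, "html.parser")


def is_rapidvideo_row(tag):
    # Hypothetical helper: keep <tr> tags whose combined text mentions rapidvideo.
    # get_text() joins the text of all descendants, so nested children are fine.
    return tag.name == "tr" and "rapidvideo" in tag.get_text()


rows = soup.find_all(is_rapidvideo_row)

# Pull the first link out of each matching row, skipping rows without one.
links = [row.a["href"] for row in rows if row.a is not None]
```

Because the function looks at get_text() rather than .string, rows with several cells are matched as well.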
