BeautifulSoup does not find a div by class

Posted 2024-10-03 04:34:47


I am trying to process this page:

https://play.google.com/store/movies/details?id=3B6EBBD94D13B4DCMV

I am using the following code to read the HTML:

# BeautifulSoup 3 import (Python 2 only); with bs4 this would be
# `from bs4 import BeautifulSoup as BS`.
from BeautifulSoup import BeautifulSoup as BS
import requests

def read_html(url):
    try:
        res = requests.get(url)
        if res.status_code == 200:
            html_content = res.content
            soup = BS(html_content)
            return _get_type(soup)
        else:
            print(res.status_code)
    except ValueError as e:
        print(e)


def _get_type(soup):
    """Classify the movie page by the marker elements it contains."""

    mydivs = soup.findAll("span", {"class": "DBzzzb"})
    if mydivs:
        return 'AVAILABLE'

    # NOTE: same tag and class as the check above, so this branch
    # can never be reached as written.
    mydivs = soup.findAll("span", {"class": "DBzzzb"})
    if mydivs:
        return 'PREORDER'

    mydivs = soup.findAll("div", {"class": "Wc4pU"})
    if mydivs:
        return 'NOT_AVAILABLE'

    return 'INVALID'

My last condition never matches: soup.findAll("div", {"class": "Wc4pU"}) returns nothing, even though the page's HTML does contain this element:

<div class="Wc4pU">We'll notify you on your wishlist when movies become available</div>

Source HTML:

view-source:https://play.google.com/store/movies/details?id=3B6EBBD94D13B4DCMV
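
One way to narrow this down is to check the raw response independently of BeautifulSoup. The sketch below is not part of the original code: it assumes Python 3, uses only the standard library's html.parser, and the names ClassCounter and count_tags_with_class are hypothetical helpers introduced for illustration. It counts the tags of a given name that carry a given class.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags of one name that carry a given CSS class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # The class attribute may hold several space-separated names,
        # and its value is None when the attribute has no value.
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_tags_with_class(html_text, tag, css_class):
    """Return how many <tag> elements in html_text have css_class."""
    counter = ClassCounter(tag, css_class)
    counter.feed(html_text)
    counter.close()
    return counter.count


# Sample input: the element quoted in the question.
sample = ('<div class="Wc4pU">We\'ll notify you on your wishlist '
          'when movies become available</div>')
print(count_tags_with_class(sample, "div", "Wc4pU"))
```

With the code from the question, the decoded body (res.text) could be passed in place of sample. A count of 0 would suggest that the element is not in the response that requests received, which can differ from what the browser shows under view-source. A non-zero count would point at the parsing step instead. Either outcome is only a hint, not a confirmed diagnosis.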
