Printing an article with a web crawler in Python

Posted 2024-09-23 04:27:15

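I'm new to Python, and I'm trying to build a web crawler that prints only the article (for example, from this page: http://techcrunch.com/2014/09/15/microsoft-has-acquired-minecraft/) and nothing else on the site. I tried the following, but it didn't work: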

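import requests
from bs4 import BeautifulSoup

# Download the page and parse its HTML
source_code = requests.get('http://techcrunch.com/2014/09/15/microsoft-has-acquired-minecraft/')
plain_text = source_code.text
soup = BeautifulSoup(plain_text, 'html.parser')

# Try to print the contents of the article container
for link in soup.findAll('div', {'class': 'article-entry text'}):
    title = link.string
    print(title)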

The code above prints: None. Thanks.

1 answer

Anonymous user

#1 · Posted 2024-09-23 04:27:15

If you only want the article, replace this for loop:

for link in soup.findAll('div', {'class': 'article-entry text'}):
  title = link.string
  print(title)

with:

[replacement code missing from the source page]

You will then get only the title and the article.

The BeautifulSoup documentation may also help.
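
The replacement code is not shown in this copy of the answer, so what follows is only a minimal sketch of one common approach, not necessarily what the answerer wrote. In BeautifulSoup, `.string` returns None when a tag has more than one child, which is the likely reason the question's loop printed None; `get_text()` collects the text of all descendants instead. `SAMPLE_HTML` and `extract_article_text` are hypothetical names used for illustration, and the real page's markup may differ from the sample.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<html><body>
  <div class="sidebar">Unrelated links</div>
  <div class="article-entry text">
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </div>
</body></html>
"""


def extract_article_text(html):
    """Return the text of each 'article-entry text' div, one string per div."""
    soup = BeautifulSoup(html, "html.parser")
    texts = []
    for div in soup.find_all("div", {"class": "article-entry text"}):
        # div.string is None here because the div has several children;
        # get_text() joins the text of all descendants instead.
        texts.append(div.get_text(separator="\n", strip=True))
    return texts


if __name__ == "__main__":
    for text in extract_article_text(SAMPLE_HTML):
        print(text)
```

To run this against the live page, pass `requests.get(url).text` to `extract_article_text` in place of `SAMPLE_HTML`, as in the question's code.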
