Simple web scraper output formatting: how do I fix this?

import requests
from bs4 import BeautifulSoup


def posts_spider():
    url = 'http://www.reddit.com/r/nosleep/new/'
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text)
    for link in soup.findAll('a', {'class': 'title'}):
        href = "http://www.reddit.com" + link.get('href')
        title = link.string
        print(title)
        print(href)
        print("\n")


def get_single_item_data():
    item_url = 'http://www.reddit.com/r/nosleep/new/'
    source_code = requests.get(item_url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text)
    for rating in soup.findAll('div', {'class': 'score unvoted'}):
        print(rating.string)


posts_spider()
get_single_item_data()

My light.. I'm seeing and feeling things.. what's happening?
http://www.reddit.com/r/nosleep/comments/2kw0nu/my_light_im_seeing_and_feeling_things_whats/

Why being the first to move in a new Subdivision is not the most brilliant idea...
http://www.reddit.com/r/nosleep/comments/2kw010/why_being_the_first_to_move_in_a_new_subdivision/

I Am Falling.
http://www.reddit.com/r/nosleep/comments/2kvxvt/i_am_falling/

Heidi
http://www.reddit.com/r/nosleep/comments/2kvrnf/heidi/

I remember everything
http://www.reddit.com/r/nosleep/comments/2kvrjs/i_remember_everything/

To Lieutenant Griffin Stone
http://www.reddit.com/r/nosleep/comments/2kvm9p/to_lieutenant_griffin_stone/

The woman in my room
http://www.reddit.com/r/nosleep/comments/2kvir0/the_woman_in_my_room/

Dr. Margin's Guide to New Monsters: The Guest, or, An Update
http://www.reddit.com/r/nosleep/comments/2kvhe5/dr_margins_guide_to_new_monsters_the_guest_or_an/

The Evil Woman (part 5)
http://www.reddit.com/r/nosleep/comments/2kva73/the_evil_woman_part_5/

Blood for the blood god, The first of many.
http://www.reddit.com/r/nosleep/comments/2kv9gx/blood_for_the_blood_god_the_first_of_many/

An introduction to the beginning of my journey
http://www.reddit.com/r/nosleep/comments/2kv8s0/an_introduction_to_the_beginning_of_my_journey/

A hunter..of sorts.
http://www.reddit.com/r/nosleep/comments/2kv8oz/a_hunterof_sorts/

Void Trigger
http://www.reddit.com/r/nosleep/comments/2kv84s/void_trigger/

What really happened to Amelia Earhart
http://www.reddit.com/r/nosleep/comments/2kv80r/what_really_happened_to_amelia_earhart/

I Used To Be Fine Being Alone
http://www.reddit.com/r/nosleep/comments/2kv2ks/i_used_to_be_fine_being_alone/

The Green One
http://www.reddit.com/r/nosleep/comments/2kuzre/the_green_one/

Elevator
http://www.reddit.com/r/nosleep/comments/2kuwxu/elevator/

Scary story told by my 4 year old niece- The Guy With Really Big Scary Claws
http://www.reddit.com/r/nosleep/comments/2kuwjz/scary_story_told_by_my_4_year_old_niece_the_guy/

Cranial Nerve Zero
http://www.reddit.com/r/nosleep/comments/2kuw7c/cranial_nerve_zero/

Mom's Story About a Ghost Uncle
http://www.reddit.com/r/nosleep/comments/2kuvhs/moms_story_about_a_ghost_uncle/

It snowed.
http://www.reddit.com/r/nosleep/comments/2kutp6/it_snowed/

The pocket watch I found at a store
http://www.reddit.com/r/nosleep/comments/2kusru/the_pocket_watch_i_found_at_a_store/

You’re Going To Die When You Are 23
http://www.reddit.com/r/nosleep/comments/2kur3m/youre_going_to_die_when_you_are_23/

The Customer: Part Two
http://www.reddit.com/r/nosleep/comments/2kumac/the_customer_part_two/

Dimenhydrinate
http://www.reddit.com/r/nosleep/comments/2kul8e/dimenhydrinate/

• • • • • 12 12 76 4 2 4 6 4 18 2 6 13 5 16 2 2 14 48 1 13

1 answer

User

#1 · Posted 2024-05-19 23:02:49

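You can do this in a single pass by iterating over the div elements with class="thing" (think of it as iterating over the posts). For each div, get the link and the rating: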

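# Python 3: urljoin lives in urllib.parse (it was the urlparse module in Python 2)
from urllib.parse import urljoin

from bs4 import BeautifulSoup
import requests

def posts_spider():
    url = 'http://www.reddit.com/r/nosleep/new/'
    # Name the parser explicitly so bs4 does not have to guess one
    soup = BeautifulSoup(requests.get(url).content, 'html.parser')
    # Each div.thing wraps one post, so the title and score stay paired
    for thing in soup.select('div.thing'):
        link = thing.find('a', {'class': 'title'})
        rating = thing.find('div', {'class': 'score'})
        # Resolve the relative href against the site root
        href = urljoin("http://www.reddit.com", link.get('href'))

        print(link.string, href, rating.string)

posts_spider()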

FYI, div.thing is a CSS selector that matches all div elements with class="thing".
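To try the per-post iteration without hitting the network, here is a minimal sketch that runs the same `div.thing` selection against a small inline HTML string. The markup in `SAMPLE_HTML`, the `parse_posts` helper, and the post titles and IDs are all hypothetical and only imitate the structure described above; the live page's markup may differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical, hand-written markup that only imitates the listing structure.
SAMPLE_HTML = """
<div class="thing">
  <div class="score unvoted">12</div>
  <a class="title" href="/r/nosleep/comments/abc123/first_post/">First post</a>
</div>
<div class="thing">
  <div class="score unvoted">76</div>
  <a class="title" href="/r/nosleep/comments/def456/second_post/">Second post</a>
</div>
<div class="sidebar">
  <a class="title" href="/ignored/">Not a post</a>
</div>
"""


def parse_posts(html, base_url="http://www.reddit.com"):
    """Return one (title, href, score) tuple per div.thing in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    posts = []
    # The CSS selector picks only divs whose class list contains "thing".
    for thing in soup.select("div.thing"):
        link = thing.find("a", {"class": "title"})
        rating = thing.find("div", {"class": "score"})
        if link is None:
            continue  # skip containers that hold no title link
        posts.append((
            link.get_text(strip=True),
            urljoin(base_url, link.get("href")),
            rating.get_text(strip=True) if rating is not None else None,
        ))
    return posts


for title, href, score in parse_posts(SAMPLE_HTML):
    print(title, href, score)
```

Because the link and the score are looked up inside the same `thing`, they cannot drift out of step the way two separate `findAll` passes can. The link in the `sidebar` div is ignored since it is not inside a `div.thing`.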
