How do I get post descriptions on Instagram?

Posted 2024-09-29 21:26:11

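I am trying to get the post description for every picture on an Instagram profile, but I only get a small part of each description. Can anyone help me get the full post description?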

import requests
from bs4 import BeautifulSoup
from selenium import webdriver

# ---------------- getting hrefs in posts ------------------ #
# Step 1
driver = webdriver.Chrome('/Users/jjcauton/Documents/python/chromedriver')
driver.get('https://www.instagram.com/addict_for_sneakers/')


hrefs = driver.find_elements_by_tag_name('a')
print(hrefs)
hrefs_elem = [elem.get_attribute('href') for elem in hrefs]
hrefs_elem = [href for href in hrefs_elem if '/p/' in href]
print(hrefs_elem)

for href in hrefs_elem:
    driver.get(href)
    page = requests.get(href)
    soup = BeautifulSoup(page.content, 'lxml')
    page_contents = soup.title
    contents = page_contents.get_text()
    print(contents)

Here is the result:

Boricua Adicto A Tenis on Instagram: “🔥 Giveaway 🔥  Win a FREE pair of Adidas Yeezy 350 v2 "Yeshaya" (Winner Picks His or Her Size) by following the simple steps below.  Here’s…”

Boricua Adicto A Tenis on Instagram: “1,2,3,4,5,6,7,8,9 or 10?
#Tecatodetenis”

Boricua Adicto A Tenis on Instagram: “The the future of sneakers trading is here 👀  Make money by buying shares, then selling them for more than what you paid 💵  Start with only…”

Boricua Adicto A Tenis on Instagram: “What’s your favorite AJ11?”

Boricua Adicto A Tenis on Instagram: “🔥 Giveaway 🔥  Win a FREE pair of Retro 1 Fearless  by following the simple steps below.  Here’s how you can win🏆: 1️⃣ Follow:…”

Boricua Adicto A Tenis on Instagram: “1,2,3,4,5,6,7,8,9 or 10?
#Tecatodetenis”

Boricua Adicto A Tenis on Instagram: “Choose One!”

Boricua Adicto A Tenis on Instagram: “🔥FREE👟GIVEAWAY🔥  Win the 🅱️red 1️⃣1️⃣ for FREE by following these steps:  Step 1️⃣. Follow them👇: @_jsole_ @wallkicksofficial @pr_sneaks23…”

Boricua Adicto A Tenis on Instagram: “What’s your favorite retro 4?”

Boricua Adicto A Tenis on Instagram: “🔥 Giveaway 🔥  Win a FREE pair of Retro 1 Turbo Green by following the simple steps below.  Here’s how you can win🏆: 1️⃣ Follow:…”

Boricua Adicto A Tenis on Instagram: “1,2,3,4,5,6,7,8,9 or 10?
#Tecatodetenis”

Boricua Adicto A Tenis on Instagram: “✨LAST CHANCE✨ ☁️CHOOSE YOUR FAVORITE SHOE☁️ ⠀ To Enter Simply: 1️⃣: Like This Picture 2️⃣: Follow  @Luisanglcordova @Hypedseason…”

As you can see, it only gives a small part of each post description. I need the full description. Thanks, everyone!


1 answer
User
#1 · Posted 2024-09-29 21:26:11

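You are looking at the wrong tags. Instagram includes the full text of a post only inside a <script> tag, so collecting all the <a> tags will not help you, and the <title> you are printing holds only a truncated preview. You need to find the <script> tag that contains "edge_media_to_caption". That script tag is quite long, but it includes the following (taken from the Instagram account /katyperry/):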

"edge_media_to_caption": {
                             "edges": [{
                                 "node": {
                                     "text": "Many people wonder how the pyramids were actually built... but me, I am in constant awe and wonder of how such a loving/kind/compassionate/supportive/talented/deeply spiritual/did I mention incredibly good looking/James Bond of a human being can actually exist in the flesh!\n\nThere\u2019s a reason why all animals and children run straight into his arms... It\u2019s his heart, so pure. I love you Orlando Jonathan Blanchard Copeland Bloom. Happiest 43rd year. \u2665\ufe0f\ud83c\udf82\u2660\ufe0f"
                                 }
                             }]
                         },

With this, you can extract the data using string[index1:index2], where the indexes can be found with string.find("some value").
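
As a minimal sketch of that idea (not a definitive implementation), the snippet below uses str.find to locate "edge_media_to_caption" and the "text" key that follows it, then lets the json module read the quoted value so that escapes such as \n and \u2019 are decoded. The function name extract_caption and the sample_html string are made up for illustration, and the sketch assumes the page source still embeds the caption in the form shown above.

```python
import json


def extract_caption(html):
    """Return the caption found after "edge_media_to_caption", or None."""
    marker = '"edge_media_to_caption"'
    start = html.find(marker)
    if start == -1:
        return None  # marker not present in this page source

    key = '"text":'
    key_pos = html.find(key, start)
    if key_pos == -1:
        return None

    # Opening quote of the JSON string that holds the caption.
    # Assumption: the first "text" key after the marker belongs to the caption.
    value_start = html.find('"', key_pos + len(key))
    if value_start == -1:
        return None

    # raw_decode parses exactly one JSON value starting at value_start,
    # so escaped quotes inside the caption do not cut the text short.
    caption, _end = json.JSONDecoder().raw_decode(html, value_start)
    return caption


# Hypothetical page fragment that mimics the structure shown above.
sample_html = (
    '<script type="text/javascript">window._sharedData = '
    '{"edge_media_to_caption": {"edges": [{"node": {"text": '
    '"Win a FREE pair\\n\\nHere\\u2019s how: say \\"hi\\""}}]}};</script>'
)

caption = extract_caption(sample_html)
print(caption)
```

In the loop from the question, the string passed in would be driver.page_source (or page.text from requests) in place of sample_html, provided the key really is present in the source that Instagram returns.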
