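I'm trying to get the post description for each image on Instagram, but I only get a small part of each description. Can someone help me get the full description of an image post?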
import requests
from bs4 import BeautifulSoup
from selenium import webdriver

# ---------------- getting hrefs in posts ------------------ #
# Step 1: open the profile page and collect every link on it
driver = webdriver.Chrome('/Users/jjcauton/Documents/python/chromedriver')
driver.get('https://www.instagram.com/addict_for_sneakers/')
hrefs = driver.find_elements_by_tag_name('a')
print(hrefs)

# Keep only links to individual posts; skip <a> tags that have no href
hrefs_elem = [elem.get_attribute('href') for elem in hrefs]
hrefs_elem = [href for href in hrefs_elem if href and '/p/' in href]
print(hrefs_elem)

# Fetch each post and print its <title>, which holds only a truncated caption
for href in hrefs_elem:
    driver.get(href)
    page = requests.get(href)
    soup = BeautifulSoup(page.content, 'lxml')
    page_contents = soup.title
    contents = page_contents.get_text()
    print(contents)
The result is as follows:
Boricua Adicto A Tenis on Instagram: “🔥 Giveaway 🔥 Win a FREE pair of Adidas Yeezy 350 v2 "Yeshaya" (Winner Picks His or Her Size) by following the simple steps below. Here’s…”
Boricua Adicto A Tenis on Instagram: “1,2,3,4,5,6,7,8,9 or 10?
#Tecatodetenis”
Boricua Adicto A Tenis on Instagram: “The the future of sneakers trading is here 👀 Make money by buying shares, then selling them for more than what you paid 💵 Start with only…”
Boricua Adicto A Tenis on Instagram: “What’s your favorite AJ11?”
Boricua Adicto A Tenis on Instagram: “🔥 Giveaway 🔥 Win a FREE pair of Retro 1 Fearless by following the simple steps below. Here’s how you can win🏆: 1️⃣ Follow:…”
Boricua Adicto A Tenis on Instagram: “1,2,3,4,5,6,7,8,9 or 10?
#Tecatodetenis”
Boricua Adicto A Tenis on Instagram: “Choose One!”
Boricua Adicto A Tenis on Instagram: “🔥FREE👟GIVEAWAY🔥 Win the 🅱️red 1️⃣1️⃣ for FREE by following these steps: Step 1️⃣. Follow them👇: @_jsole_ @wallkicksofficial @pr_sneaks23…”
Boricua Adicto A Tenis on Instagram: “What’s your favorite retro 4?”
Boricua Adicto A Tenis on Instagram: “🔥 Giveaway 🔥 Win a FREE pair of Retro 1 Turbo Green by following the simple steps below. Here’s how you can win🏆: 1️⃣ Follow:…”
Boricua Adicto A Tenis on Instagram: “1,2,3,4,5,6,7,8,9 or 10?
#Tecatodetenis”
Boricua Adicto A Tenis on Instagram: “✨LAST CHANCE✨ ☁️CHOOSE YOUR FAVORITE SHOE☁️ ⠀ To Enter Simply: 1️⃣: Like This Picture 2️⃣: Follow @Luisanglcordova @Hypedseason…”
As you can see, it only gives a small part of each image post's description. I need the full description. Thanks, everyone!
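You're looking at the wrong tag. Instagram only includes the full text of the post inside a <script> tag, so collecting all the <a> tags won't help you. You need to find the <script> tag that contains "edge_media_to_caption". That script tag is quite long, but the caption text is in it (the example was taken from the Instagram account /katyperry/). With it, you can extract the data using string[index1:index2], where the indices can be found with string.find("some value").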
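The find-and-slice approach described above can be sketched roughly as follows. This is only a minimal illustration: the names extract_caption, MARKER, TEXT_KEY and sample are hypothetical, and the sample string is a hand-written fragment, not real Instagram output. It assumes the caption is stored as a JSON string under a "text" key somewhere after "edge_media_to_caption"; the real page layout may differ or may have changed.

```python
import json

# Assumed markers; the exact spacing and key names in the real page may differ.
MARKER = '"edge_media_to_caption"'
TEXT_KEY = '"text":"'


def extract_caption(script_text):
    """Return the caption found after MARKER in script_text, or None."""
    marker_pos = script_text.find(MARKER)
    if marker_pos == -1:
        return None
    key_pos = script_text.find(TEXT_KEY, marker_pos)
    if key_pos == -1:
        return None
    start = key_pos + len(TEXT_KEY)
    # Walk forward to the closing quote, skipping backslash-escaped characters.
    end = start
    while end < len(script_text) and script_text[end] != '"':
        end += 2 if script_text[end] == "\\" else 1
    if end >= len(script_text):
        return None
    raw = script_text[start:end]
    # The slice is the body of a JSON string, so let json decode its escapes.
    return json.loads('"' + raw + '"')


# Hypothetical, hand-written fragment shaped like the data described above.
sample = (
    'window._sharedData = {"edge_media_to_caption":{"edges":[{"node":'
    '{"text":"Full caption, line one.\\nLine two with \\"quotes\\"."}}]}};'
)
print(extract_caption(sample))
```

In the question's loop, the text of each script tag could be obtained with something like soup.find_all('script') and passed to this function one tag at a time. Decoding the slice with json.loads avoids handling escape sequences by hand, though parsing the whole embedded JSON object would be more robust than slicing if its boundaries can be located reliably.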