How do I change this code to scrape a table?

Posted 2024-09-28 12:11:17

I am trying to scrape data from this website:

https://wix-visual-data.appspot.com/app/widget?pageId=cu7nt&compId=comp-kesofw00&viewerCompId=comp-kesofw00&siteRevision=947&viewMode=site&deviceType=desktop&locale=en&tz=Europe%2FLondon&width=980&height=890&instance=k983l1LiiUeOz5_3Pd_CLXbjfadc08q1fEu54xfh9aA.eyJpbnN0YW5jZUlkIjoiYjQ0MWIxMGUtNTRmNy00YzdhLTgwY2QtNmU0ZjkwYzljMzA3IiwiYXBwRGVmSWQiOiIxMzQxMzlmMy1mMmEwLTJjMmMtNjkzYy1lZDIyMTY1Y2ZkODQiLCJtZXRhU2l0ZUlkIjoiM2M3ZmE5OWItY2I3Yy00MTg0LTk1OTEtNWY0MDhmYWYwZmRhIiwic2lnbkRhdGUiOiIyMDIxLTAxLTMwVDAxOjIzOjAyLjU1MVoiLCJ1aWQiOiIzYWMyNDI3YS04NGVhLTQ0ZGUtYjYxMS02MTNiZTVlOWJiZGQiLCJkZW1vTW9kZSI6ZmFsc2UsImFpZCI6IjczYWE3ZWNjLTQyODUtNDY2My1iNjMxLTMzMjE0MWJiZDhhMiIsImJpVG9rZW4iOiI4ODNlMTg5NS05ZjhiLTBkZmUtMTU1Yy0zMTBmMWY2NmNjZGQiLCJzaXRlT3duZXJJZCI6ImVhYWU1MDEzLTMxZjgtNDQzNC04MDFhLTE3NDQ2N2EwZjE5YSIsImV4cGlyYXRpb25EYXRlIjoiMjAyMS0wMS0zMFQwNToyMzowMi41NTFaIiwiaGFzVXNlclJvbGUiOmZhbHNlfQ&currency=GBP&currentCurrency=GBP&vsi=795183b4-8f30-4854-bd85-77678dbe4cf8&consent-policy=%7B%22func%22%3A0%2C%22anl%22%3A0%2C%22adv%22%3A0%2C%22dt3%22%3A1%2C%22ess%22%3A1%7D&commonConfig=%7B%22brand%22%3A%22wix%22%2C%22bsi%22%3Anull%2C%22BSI%22%3Anull%7D

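This URL contains a table, but for some reason I cannot scrape it into an Excel file. Below is the Python code I have written so far, which is what I have tried. Any help would be much appreciated, thank you.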

from bs4 import BeautifulSoup
import requests
import pandas as pd

url = "https://wix-visual-data.appspot.com/app/widget?pageId=cu7nt&compId=comp-kesofw00&viewerCompId=comp-kesofw00&siteRevision=947&viewMode=site&deviceType=desktop&locale=en&tz=Europe%2FLondon&width=980&height=890&instance=dxGyx3zK9ULK0A8UtGOrLw-__FTD9EBEfzQojJ7Bz00.eyJpbnN0YW5jZUlkIjoiYjQ0MWIxMGUtNTRmNy00YzdhLTgwY2QtNmU0ZjkwYzljMzA3IiwiYXBwRGVmSWQiOiIxMzQxMzlmMy1mMmEwLTJjMmMtNjkzYy1lZDIyMTY1Y2ZkODQiLCJtZXRhU2l0ZUlkIjoiM2M3ZmE5OWItY2I3Yy00MTg0LTk1OTEtNWY0MDhmYWYwZmRhIiwic2lnbkRhdGUiOiIyMDIxLTAxLTI5VDE4OjM0OjQwLjgwM1oiLCJ1aWQiOiIzYWMyNDI3YS04NGVhLTQ0ZGUtYjYxMS02MTNiZTVlOWJiZGQiLCJkZW1vTW9kZSI6ZmFsc2UsImFpZCI6IjczYWE3ZWNjLTQyODUtNDY2My1iNjMxLTMzMjE0MWJiZDhhMiIsImJpVG9rZW4iOiI4ODNlMTg5NS05ZjhiLTBkZmUtMTU1Yy0zMTBmMWY2NmNjZGQiLCJzaXRlT3duZXJJZCI6ImVhYWU1MDEzLTMxZjgtNDQzNC04MDFhLTE3NDQ2N2EwZjE5YSIsImV4cGlyYXRpb25EYXRlIjoiMjAyMS0wMS0yOVQyMjozNDo0MC44MDNaIiwiaGFzVXNlclJvbGUiOmZhbHNlfQ&currency=GBP&currentCurrency=GBP&vsi=57130cda-8191-488e-8089-f472928266e3&consent-policy=%7B%22func%22%3A0%2C%22anl%22%3A0%2C%22adv%22%3A0%2C%22dt3%22%3A1%2C%22ess%22%3A1%7D&commonConfig=%7B%22brand%22%3A%22wix%22%2C%22bsi%22%3Anull%2C%22BSI%22%3Anull%7D"
table_id = "theTable"

response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

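# Find the table whose id matches table_id.
table = soup.find('table', attrs={"id": table_id})
# read_html returns a list of DataFrames; take the first one.
df = pd.read_html(str(table))[0]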
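
One thing worth checking is whether the table is present at all in the HTML that requests downloads. If soup.find returns None, for instance because the page fills the table in with JavaScript after loading, the later steps cannot work. The following is only a minimal sketch of that check, not a confirmed fix for this particular page: the helper name table_to_dataframe and the sample markup are hypothetical, and it assumes the first row of the table holds the column names and that every row has the same number of cells.

```python
from bs4 import BeautifulSoup
import pandas as pd


def table_to_dataframe(html, table_id):
    """Return the table with the given id as a DataFrame.

    Raises ValueError if no such table exists in the HTML, for example
    when the table is only built by JavaScript after the page loads.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"id": table_id})
    if table is None:
        raise ValueError(f"no <table id={table_id!r}> in the downloaded HTML")

    # Collect the text of every header and data cell, row by row.
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]
    rows = [row for row in rows if row]  # skip rows without cells
    if not rows:
        raise ValueError("the table contains no rows")

    # Assumption: the first row holds the column names.
    return pd.DataFrame(rows[1:], columns=rows[0])


# Hypothetical sample markup standing in for response.text.
sample_html = """
<table id="theTable">
  <tr><th>Name</th><th>Score</th></tr>
  <tr><td>Alice</td><td>10</td></tr>
  <tr><td>Bob</td><td>7</td></tr>
</table>
"""

df = table_to_dataframe(sample_html, "theTable")
print(df)
```

With the real page, one would pass response.text and table_id instead of the sample string. If the call succeeds, df.to_excel("table.xlsx", index=False) writes the result to an Excel file, provided the openpyxl package is installed. If it raises ValueError, the table is not in the static HTML and a different data source would be needed.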
