Iterating over a list of strings: error message

Posted 2024-10-01 17:40:22


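I have a CSV file with one column containing URLs. I want to read it, convert that column to a list, and iterate over each URL, but I get an error saying the URL is not a string.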

import requests
from lxml import html, etree
from selenium.webdriver import Chrome, ChromeOptions
import random
import time
from datetime import datetime, timedelta
import csv
import pandas as pd

restaurant_urls_li = pd.read_csv(r"list_saved.csv")
restaurant_urls_li = restaurant_urls_li.iloc[:,-1:]
restaurant_urls_li = restaurant_urls_li.values.tolist()
print(restaurant_urls_li)

[['https://www.tripadvisor.com/Restaurant_Review-g187849-d13392251-Reviews-Kisen_Moscova-Milan_Lombardy.html'], ['https://www.tripadvisor.com/Restaurant_Review-g187849-d17805000-Reviews-Mabuhay_Restaurant-Milan_Lombardy.html'],['https://www.tripadvisor.com/Restaurant_Review-g187849-d13392251-Reviews-Kisen_Moscova-Milan_Lombardy.html'], ['https://www.tripadvisor.com/Restaurant_Review-g187849-d17805000-Reviews-Mabuhay_Restaurant-Milan_Lombardy.html'],['https://www.tripadvisor.com/Restaurant_Review-g187849-d13392251-Reviews-Kisen_Moscova-Milan_Lombardy.html'], ['https://www.tripadvisor.com/Restaurant_Review-g187849-d17805000-Reviews-Mabuhay_Restaurant-Milan_Lombardy.html']]

for restaurant_url in restaurant_urls_li[0:20]:
    print(restaurant_url)
    wd.get(restaurant_url)

Error message:

InvalidArgumentException                  Traceback (most recent call last)
<ipython-input-34-29d9690b9e31> in <module>()
      1 for restaurant_url in restaurant_urls_li[start:end]:
----> 2     wd.get(restaurant_url)
      3     tree = html.fromstring(wd.page_source)
      4     restaurant = tree.xpath('//div[contains(@id,"taplc_top_info")]')[0]
      5     try:

2 frames
/usr/local/lib/python3.7/dist-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
    240                 alert_text = value['alert'].get('text')
    241             raise exception_class(message, screen, stacktrace, alert_text)
--> 242         raise exception_class(message, screen, stacktrace)
    243 
    244     def _value_or_default(self, obj, key, default):

InvalidArgumentException: Message: invalid argument: 'url' must be a string

I can't see what is wrong.


1 answer
User
#1 · Posted 2024-10-01 17:40:22

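Each restaurant_url is itself a list containing a single string, not a string. That is why wd.get() reports that 'url' must be a string. Pass the first element of that inner list instead: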

for restaurant_url in restaurant_urls_li[0:20]:
    wd.get(restaurant_url[0])
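
An alternative is to flatten the nested list once, before the loop, so that every element is already a plain string. The following is only a minimal sketch: the example.com URLs are hypothetical placeholders standing in for the values read from the CSV, and the isinstance check is an optional guard that is not part of the original code.

```python
# The nested shape produced by DataFrame.values.tolist() on a one-column frame.
# These URLs are hypothetical placeholders, not the real data.
nested_urls = [
    ["https://example.com/restaurant-a.html"],
    ["https://example.com/restaurant-b.html"],
]

# Flatten once: take the single string out of each one-item inner list.
flat_urls = [row[0] for row in nested_urls]

# Optional guard (an assumption, not in the original): fail early with a
# readable message if any element is still not a string.
for url in flat_urls:
    if not isinstance(url, str):
        raise TypeError(f"expected str, got {type(url).__name__}: {url!r}")

print(flat_urls)
```

With the flat list, the loop body can call wd.get(restaurant_url) directly. In pandas, selecting the column with iloc[:, -1] (an integer position rather than the slice -1:) returns a Series, and calling .tolist() on it should give the same flat list without the extra step.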
