Python中的正则表达式不起作用

2024-05-03 09:46:42 发布

您现在位置:Python中文网/ 问答频道 /正文

我正在编写一个Python脚本,可以从Flickr和其他网站下载图像。我使用flickrapi提取我要下载的图像的各种大小,并确定原始大小的URL。嗯,这就是我想做的。这是我目前的密码。。。你知道吗

URL = {a Flickr link}

flickr = re.match(r".*flickr\.com\/photos\/([^\/]+)\/([0-9^\/]+)\/", URL)
URL = "https://api.flickr.com/services/rest/?method=flickr.photos.getSizes&api_key=6002c84e96ff95c1a861eafafa4284ba&photo_id=" + flickr.group(2) + "&format=json&nojsoncallback=1"

request = requests.get(URL)
result = request.text

parsed = re.match(r".\"Original\".*\"source\"\: \"([^\"]+)", result)
URL = parsed.group(1)

在我的代码中使用print()语句,我知道第一个正则表达式(解析原始Flickr URL以标识照片ID)工作正常,API请求工作正常,返回以下结果(使用示例URL https://www.flickr.com/photos/matbellphotography/33413612735/sizes/h/)。。。你知道吗

{ "sizes": { "canblog": 0, "canprint": 0, "candownload": 1, 
"size": [
  { "label": "Square", "width": 75, "height": 75, "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_s.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/sq\/", "media": "photo" },
  { "label": "Large Square", "width": "150", "height": "150", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_q.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/q\/", "media": "photo" },
  { "label": "Thumbnail", "width": 100, "height": 67, "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_t.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/t\/", "media": "photo" },
  { "label": "Small", "width": "240", "height": "160", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_m.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/s\/", "media": "photo" },
  { "label": "Small 320", "width": "320", "height": "213", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_n.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/n\/", "media": "photo" },
  { "label": "Medium", "width": "500", "height": "333", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/m\/", "media": "photo" },
  { "label": "Medium 640", "width": "640", "height": "427", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_z.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/z\/", "media": "photo" },
  { "label": "Medium 800", "width": "800", "height": "534", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_c.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/c\/", "media": "photo" },
  { "label": "Large", "width": "1024", "height": "683", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_b.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/l\/", "media": "photo" },
  { "label": "Large 1600", "width": "1600", "height": "1067", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_4d92e2f70d_h.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/h\/", "media": "photo" },
  { "label": "Large 2048", "width": "2048", "height": "1365", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_81441ed1da_k.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/k\/", "media": "photo" },
  { "label": "Original", "width": "5760", "height": "3840", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_34cbc172c1_o.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/o\/", "media": "photo" }
] }, "stat": "ok" }

我的密码显然在那之后坏了。第二个正则表达式用于标识原始文件大小的图像的下载URL,显然找不到任何匹配项。根据另一个print()语句。。。你知道吗

parsed.group(1) = none

我使用RegExr设置表达式,它准确地确定了我需要从JSON结果中得到什么。我做错了什么?你知道吗


Tags: httpscomsourcewwwwidthlabelflickrjpg
1条回答
网友
1楼 · 发布于 2024-05-03 09:46:42

也许你的requests.Response对象有一个json属性,你可以直接访问它。如果没有,只需import json,解析你的request.content,并使用返回的字典。示例:

>>> import json
>>> json_response = """
... { "sizes": { "canblog": 0, "canprint": 0, "candownload": 1, 
... "size": [
...   { "label": "Square", "width": 75, "height": 75, "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_s.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/sq\/", "media": "photo" },
...   { "label": "Large Square", "width": "150", "height": "150", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_q.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/q\/", "media": "photo" },
...   { "label": "Thumbnail", "width": 100, "height": 67, "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_t.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/t\/", "media": "photo" },
...   { "label": "Small", "width": "240", "height": "160", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_m.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/s\/", "media": "photo" },
...   { "label": "Small 320", "width": "320", "height": "213", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_n.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/n\/", "media": "photo" },
...   { "label": "Medium", "width": "500", "height": "333", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/m\/", "media": "photo" },
...   { "label": "Medium 640", "width": "640", "height": "427", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_z.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/z\/", "media": "photo" },
...   { "label": "Medium 800", "width": "800", "height": "534", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_c.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/c\/", "media": "photo" },
...   { "label": "Large", "width": "1024", "height": "683", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_b.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/l\/", "media": "photo" },
...   { "label": "Large 1600", "width": "1600", "height": "1067", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_4d92e2f70d_h.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/h\/", "media": "photo" },
...   { "label": "Large 2048", "width": "2048", "height": "1365", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_81441ed1da_k.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/k\/", "media": "photo" },
...   { "label": "Original", "width": "5760", "height": "3840", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_34cbc172c1_o.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/o\/", "media": "photo" }
... ] }, "stat": "ok" }"""
>>> 
>>> json_parsed = json.loads(json_response)
>>> for img in json_parsed["sizes"]["size"]:
...     print img.get("source")
... 
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_s.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_q.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_t.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_m.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_n.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_z.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_c.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_b.jpg
https://farm3.staticflickr.com/2855/33413612735_4d92e2f70d_h.jpg
https://farm3.staticflickr.com/2855/33413612735_81441ed1da_k.jpg
https://farm3.staticflickr.com/2855/33413612735_34cbc172c1_o.jpg
>>> 

相关问题 更多 >