How do I clean up the image list format in a DataFrame column?

Posted 2024-10-03 09:19:58


I fetched some images and stored them in a DataFrame column, df['images'].

Currently, the images come back in the following format:

['image1.jpg','image2.jpg','','','']

Now I need to remove the brackets and the empty '' entries, so that the result looks like this:

image1.jpg,image2.jpg

I tried the following function, but it does not work:

def clean_images(imagearray):
    for ch in ['[', ']',', ''',', ,']:
        if ch in imagearray:
            imagearray = string.replace(ch, "")
            print(imagearray)
    return imagearray

My DataFrame looks like this: (image not available)

Can anyone tell me the right way to achieve this?

Here is the output of df.head().to_dict():

{'Available': {0: 33, 1: 22, 2: 12, 3: 12, 4: 11}, 'Images': {0: ['https://example.com/e1e619ab5f11ffe311db03eefad5a2f4.jpg', 'https://example.com/7edc2e3cda8b63591bfacda9e254ad08.jpg', 'https://example.com/7ed2b44335f73cabe0411819820e4d0b.jpg', 'https://example.com/82fed0e56c531cde2fcf5b98f7418a6a.jpg', 'https://example.com/f536c423a97d0c9ab8c488a453818780.jpg', '', '', ''], 1: ['https://example.com/7d63597ae7a75b8481d9d4318951d6c1.jpg', '', '', '', '', '', '', ''], 2: ['https://example.com/7476c30281056d6810787c617fb4f30e.jpg', 'https://example.com/d59266704fa3f9750c02ea79956acf1e.jpg', '', '', '', '', '', ''], 3: ['https://example.com/7476c30281056d6810787c617fb4f30e.jpg', 'https://example.com/af285804c936cd3278cb2982b6f7a089.jpg', '', '', '', '', '', ''], 4: ['https://example.com/e4b6927a6bf8ad48394534c657ea0994.jpg', 'https://example.com/e630996c631e35013be0fbe0c0113fc5.jpg', '', '', '', '', '', '']}, 'SellerSku': {0: 'SCF285/01', 1: 'Munchkin Multi Forks and Spoons set', 2: 'TR0324-GB01-Fairy', 3: 'TR0323-GB01-Police Car', 4: 'DKLAN 24 -Off White'}, 'ShopSku': {0: '235588426_SGAMZ-361374143', 1: '234623934_SGAMZ-359543733', 2: '235653608_SGAMZ-361464759', 3: '235653608_SGAMZ-361464758', 4: '234907012_SGAMZ-359972591'}, 'SkuId': {0: 361374143, 1: 359543733, 2: 361464759, 3: 361464758, 4: 359972591}, 'Status': {0: 'active', 1: 'active', 2: 'active', 3: 'active', 4: 'active'}, 'Url': {0: 'https://example.com/-i235588426-s361374143.html', 1: 'https://example.com/-i234623934-s359543733.html', 2: 'https://example.com/-i235653608-s361464759.html', 3: 'https://example.com/-i235653608-s361464758.html', 4: 'https://example.com/-i234907012-s359972591.html'}, '_compatible_variation_': {0: 'SCF285/01', 1: 'Multicolor', 2: 'Fairy', 3: 'Police Car', 4: 'Off White'}, 'color_family': {0: 'SCF285/01', 1: 'Multicolor', 2: 'Fairy', 3: 'Police Car', 4: 'Off White'}, 'color_thumbnail': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}, 'package_content': {0: 'Avent 3 in 1 electric steam sterilizer x1', 1: 'Multi Forks and Spoons x1', 2: 'Trunki kid suitcase luggage x1', 3: 'Trunki kid suitcase luggage x1', 4: 'DKLAN 24 Bicycle x1'}, 'package_height': {0: '1', 1: '1', 2: '13', 3: '13', 4: '1'}, 'package_length': {0: '1', 1: '1', 2: '12', 3: '12', 4: '1'}, 'package_weight': {0: '1', 1: '999', 2: '1', 3: '1', 4: '1000'}, 'package_width': {0: '11', 1: '1', 2: '11', 3: '11', 4: '1'}, 'price': {0: 109.0, 1: 8.9, 2: 80.91, 3: 80.91, 4: 178.0}, 'quantity': {0: 33, 1: 22, 2: 12, 3: 12, 4: 11}, 'special_from_date': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}, 'special_from_time': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}, 'special_price': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}, 'special_time_format': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}, 'special_to_date': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}, 'special_to_time': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}}

1 answer
User
#1 · Posted 2024-10-03 09:19:58

This can be done fairly cleanly with ast.literal_eval and os.path.basename:

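import os
from ast import literal_eval

def formatter(x):
    # Reduce each URL to its file name, drop the empty strings, join with commas.
    return ','.join(list(filter(None, map(os.path.basename, x))))

# s is the example Series of list-like strings defined further down.
# literal_eval turns each string into a real list before formatting.
res = s.apply(literal_eval).apply(formatter)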

print(res)

0    img1.jpg,img2.jpg
1    img3.jpg,img4.jpg
2    img5.jpg,img6.jpg
dtype: object

Setup

import pandas as pd

s = pd.Series(["['http://www.test.com/img1.jpg','http://www.test.com/img2.jpg','','','']",
               "['http://www.test.com/img3.jpg','http://www.test.com/img4.jpg','','','','']",
               "['http://www.test.com/img5.jpg','http://www.test.com/img6.jpg','','','','','']"])
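
To see what formatter does to a single cell, the same pipeline can be unrolled one step at a time. This is only an illustrative sketch that uses the first string from the setup; the variable names raw, parsed, names, kept and result are made up for the walkthrough and do not come from the answer.

```python
import os
from ast import literal_eval

# One cell from the setup Series: a list stored as a string.
raw = "['http://www.test.com/img1.jpg','http://www.test.com/img2.jpg','','','']"

parsed = literal_eval(raw)                     # str -> real Python list of 5 items
names = [os.path.basename(u) for u in parsed]  # keep only the part after the last '/'
kept = [n for n in names if n]                 # drop the empty strings
result = ','.join(kept)                        # join what is left with commas

print(result)  # img1.jpg,img2.jpg
```

The list comprehension with "if n" plays the same role as filter(None, ...) in formatter: both discard falsy items such as ''.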

Updated example

import os
import pandas as pd

d = {'Available': {0: 33, 1: 22, 2: 12, 3: 12, 4: 11}, 'Images': {0: ['https://example.com/e1e619ab5f11ffe311db03eefad5a2f4.jpg', 'https://example.com/7edc2e3cda8b63591bfacda9e254ad08.jpg', 'https://example.com/7ed2b44335f73cabe0411819820e4d0b.jpg', 'https://example.com/82fed0e56c531cde2fcf5b98f7418a6a.jpg', 'https://example.com/f536c423a97d0c9ab8c488a453818780.jpg', '', '', ''], 1: ['https://example.com/7d63597ae7a75b8481d9d4318951d6c1.jpg', '', '', '', '', '', '', ''], 2: ['https://example.com/7476c30281056d6810787c617fb4f30e.jpg', 'https://example.com/d59266704fa3f9750c02ea79956acf1e.jpg', '', '', '', '', '', ''], 3: ['https://example.com/7476c30281056d6810787c617fb4f30e.jpg', 'https://example.com/af285804c936cd3278cb2982b6f7a089.jpg', '', '', '', '', '', ''], 4: ['https://example.com/e4b6927a6bf8ad48394534c657ea0994.jpg', 'https://example.com/e630996c631e35013be0fbe0c0113fc5.jpg', '', '', '', '', '', '']}, 'SellerSku': {0: 'SCF285/01', 1: 'Munchkin Multi Forks and Spoons set', 2: 'TR0324-GB01-Fairy', 3: 'TR0323-GB01-Police Car', 4: 'DKLAN 24 -Off White'}, 'ShopSku': {0: '235588426_SGAMZ-361374143', 1: '234623934_SGAMZ-359543733', 2: '235653608_SGAMZ-361464759', 3: '235653608_SGAMZ-361464758', 4: '234907012_SGAMZ-359972591'}, 'SkuId': {0: 361374143, 1: 359543733, 2: 361464759, 3: 361464758, 4: 359972591}, 'Status': {0: 'active', 1: 'active', 2: 'active', 3: 'active', 4: 'active'}, 'Url': {0: 'https://example.com/-i235588426-s361374143.html', 1: 'https://example.com/-i234623934-s359543733.html', 2: 'https://example.com/-i235653608-s361464759.html', 3: 'https://example.com/-i235653608-s361464758.html', 4: 'https://example.com/-i234907012-s359972591.html'}, '_compatible_variation_': {0: 'SCF285/01', 1: 'Multicolor', 2: 'Fairy', 3: 'Police Car', 4: 'Off White'}, 'color_family': {0: 'SCF285/01', 1: 'Multicolor', 2: 'Fairy', 3: 'Police Car', 4: 'Off White'}, 'package_content': {0: 'Avent 3 in 1 electric steam sterilizer x1', 1: 'Multi Forks and Spoons x1', 2: 'Trunki kid suitcase luggage x1', 3: 'Trunki kid suitcase luggage x1', 4: 'DKLAN 24 Bicycle x1'}, 'package_height': {0: '1', 1: '1', 2: '13', 3: '13', 4: '1'}, 'package_length': {0: '1', 1: '1', 2: '12', 3: '12', 4: '1'}, 'package_weight': {0: '1', 1: '999', 2: '1', 3: '1', 4: '1000'}, 'package_width': {0: '11', 1: '1', 2: '11', 3: '11', 4: '1'}, 'price': {0: 109.0, 1: 8.9, 2: 80.91, 3: 80.91, 4: 178.0}, 'quantity': {0: 33, 1: 22, 2: 12, 3: 12, 4: 11}, 'special_price': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}}

df = pd.DataFrame.from_dict(d)

def formatter(x):
    return ','.join(list(filter(None, map(os.path.basename, x))))

df['Images'] = df['Images'].apply(formatter)

print(df['Images'])

0    e1e619ab5f11ffe311db03eefad5a2f4.jpg,7edc2e3cd...
1                 7d63597ae7a75b8481d9d4318951d6c1.jpg
2    7476c30281056d6810787c617fb4f30e.jpg,d59266704...
3    7476c30281056d6810787c617fb4f30e.jpg,af285804c...
4    e4b6927a6bf8ad48394534c657ea0994.jpg,e630996c6...
Name: Images, dtype: object
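
The two examples above differ in one respect: in the first, each cell is a string that looks like a list, while in the updated example each cell is already a real list. If a column might contain either form, a more defensive helper is one option. The following is a sketch under stated assumptions and is not part of the answer: clean_cell is a hypothetical name, non-list cells such as NaN are assumed to mean "no images", and urllib.parse.urlsplit with posixpath.basename is swapped in for os.path.basename so that a query string after the file name is ignored.

```python
import posixpath
from ast import literal_eval
from urllib.parse import urlsplit


def clean_cell(cell):
    """Return comma-separated file names for one cell (hypothetical helper)."""
    if isinstance(cell, str):
        # The cell holds the text of a list, e.g. "['a.jpg', '']"; parse it first.
        cell = literal_eval(cell)
    if not isinstance(cell, (list, tuple)):
        # Assumption: anything else (for example NaN) means there are no images.
        return ''
    # Take the file name from the URL path only, then skip empty results.
    names = (posixpath.basename(urlsplit(u).path) for u in cell)
    return ','.join(n for n in names if n)


cells = [
    "['http://www.test.com/img1.jpg','','']",          # list stored as a string
    ['https://example.com/a.jpg?size=large', '', ''],  # real list, URL with a query
    float('nan'),                                       # missing value
]
results = [clean_cell(c) for c in cells]
print(results)  # ['img1.jpg', 'a.jpg', '']
```

With the DataFrame from the updated example, the helper would be applied in the same way as formatter: df['Images'] = df['Images'].apply(clean_cell).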
