In pandas, how do I parse a column that holds many attribute/value pairs into new columns and get their values?

df['SourceTechAttributes'][0]
'DropFrame: True, Duration: 4874.1359333333333333333333333, FieldDominance: Upper Field First, FrameRate: 29.97, Height: 1080, MediaFormat: 912, NumberOfAudioChannels: 8, NumberOfAudioTracks: 8, ScanType: Interlaced, StartSmpte: 00:59:59;26, ViewportDisplayFormat: Anamorphic, Width: 1920'

0    DropFrame: True, Duration: 4874.13593333333333...
1    ActionType: CG, DropFrame: True, Duration: 129...
2    DropFrame: True, Duration: 4874.13593333333333...
3    DropFrame: True, Duration: 4874.13593333333333...
4    ActionType: CG, DropFrame: True, Duration: 129...
5    ActionType: CG, DropFrame: True, Duration: 129...
Name: SourceTechAttributes, dtype: object

2 answers

User

Answer 1 · edited 2024-09-30 06:30:05

This takes the following 3 steps:

# 1. create a list of "key: value" strings in each row
df['SourceTechAttributes'] = (df['SourceTechAttributes']
                              .apply(lambda x: str(x).split(", ")))

# 2. create a dictionary in each row; split on the first ": " only, so
#    values that contain colons (e.g. StartSmpte: 00:59:59;26) stay intact
df['SourceTechAttributes'] = (df['SourceTechAttributes']
                              .apply(lambda x: dict(p.split(": ", 1)
                                                    for p in x if ": " in p)))

# 3. create new columns (raises KeyError if a row lacks the attribute)
df['srcMediaFormat'] = (df['SourceTechAttributes']
                        .apply(lambda x: x['MediaFormat']))

I created only one new column, srcMediaFormat, as an example.
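The same three steps can be extended to several columns at once. The following is a minimal sketch, not the answerer's code: the toy rows, the helper name to_dict, and the column names srcFrameRate and srcStartSmpte are assumptions made up for illustration. It splits each pair on the first ": " only and uses dict.get so that rows missing an attribute get None instead of raising KeyError.

```python
import pandas as pd

# Toy data modelled on the question; the second row is invented for illustration.
df = pd.DataFrame({
    "SourceTechAttributes": [
        "DropFrame: True, FrameRate: 29.97, MediaFormat: 912, StartSmpte: 00:59:59;26",
        "ActionType: CG, DropFrame: True, FrameRate: 29.97",
    ]
})


def to_dict(text):
    """Turn 'key: value, key: value' into a dict, splitting on the first ': ' only."""
    pairs = (piece.split(": ", 1) for piece in str(text).split(", "))
    # Pieces without a ': ' separator (e.g. the string 'nan') are skipped.
    return {pair[0].strip(): pair[1].strip() for pair in pairs if len(pair) == 2}


attrs = df["SourceTechAttributes"].apply(to_dict)

# Hypothetical mapping: new column name -> attribute key.
wanted = {
    "srcMediaFormat": "MediaFormat",
    "srcFrameRate": "FrameRate",
    "srcStartSmpte": "StartSmpte",
}
for column, key in wanted.items():
    # dict.get returns None when a row does not carry the attribute.
    df[column] = attrs.apply(lambda d, key=key: d.get(key))

print(df[list(wanted)])
```

All extracted values are still strings at this point; convert them afterwards if numeric types are needed.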

User

Answer 2 · edited 2024-09-30 06:30:05

First, you need a function that takes a string, splits it on commas and colons, and converts it into a pandas Series by way of a dictionary:

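import pandas as pd

def str2series(s):
    # split into "key: value" pieces, then split each piece on the first ': ' only
    pieces = [x.split(': ', 1) for x in s.split(',')]
    return pd.Series({k.strip(): v.strip() for k, v in pieces})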

Next, apply the function to the column:

new_df = df['SourceTechAttributes'].apply(str2series)

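The result is the DataFrame you are looking for. If needed, you can join it with the original DataFrame, since the two share the same index: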

df = df.join(new_df)
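For reference, here is a small end-to-end sketch of this approach on toy data. The sample rows and the list of numeric columns are assumptions for illustration, not taken from the real dataset. It also shows one optional extra step that the answer does not cover: every parsed value comes back as a string, so columns known to be numeric can be converted with pd.to_numeric.

```python
import pandas as pd

# Toy data modelled on the question; the second row is invented for illustration.
df = pd.DataFrame({
    "SourceTechAttributes": [
        "DropFrame: True, FrameRate: 29.97, Height: 1080, StartSmpte: 00:59:59;26, Width: 1920",
        "ActionType: CG, DropFrame: True, FrameRate: 29.97",
    ]
})


def str2series(s):
    # Split into "key: value" pieces, then split each piece on the first ': ' only.
    pieces = [x.split(": ", 1) for x in s.split(",")]
    return pd.Series({k.strip(): v.strip() for k, v in pieces})


# One column per attribute; attributes missing from a row become NaN.
new_df = df["SourceTechAttributes"].apply(str2series)

# Assumed list of numeric attributes; anything unparsable becomes NaN.
for col in ["FrameRate", "Height", "Width"]:
    new_df[col] = pd.to_numeric(new_df[col], errors="coerce")

# Both frames share the same index, so join lines the rows up.
result = df.join(new_df)
print(result.drop(columns="SourceTechAttributes"))
```

Rows that lack an attribute (here ActionType in the first row, Height and Width in the second) end up with NaN in that column, which is usually what you want before filtering or aggregating.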
