How do I convert a list into a DataFrame with specific columns in Python?

Posted 2024-09-28 18:14:03


I have a list of nested data that I want to convert into a DataFrame with specific columns. Please help me get the correct output.

Example. The data I have:

[{'data': [{'interval': '2021-09-22T09:13:57.000Z/2021-09-29T09:13:57.000Z',
           'metrics': [{'metric': 'nOffered',
                        'qualifier': None,
                        'stats': {'count': 17,
                                  'count_negative': None,
                                  'count_positive': None,
                                  'current': None,
                                  'denominator': None,
                                  'max': None,
                                  'min': None,
                                  'numerator': None,
                                  'ratio': None,
                                  'sum': None,
                                  'target': None}},
                       {'metric': 'tAnswered',
                        'qualifier': None,
                        'stats': {'count': 17,
                                  'count_negative': None,
                                  'count_positive': None,
                                  'current': None,
                                  'denominator': None,
                                  'max': 17327.0,
                                  'min': 4569.0,
                                  'numerator': None,
                                  'ratio': None,
                                  'sum': 156929.0,
                                  'target': None}},
                       {'metric': 'tTalk',
                        'qualifier': None,
                        'stats': {'count': 29,
                                  'count_negative': None,
                                  'count_positive': None,
                                  'current': None,
                                  'denominator': None,
                                  'max': 2650757.0,
                                  'min': 2124.0,
                                  'numerator': None,
                                  'ratio': None,
                                  'sum': 8402252.0,
                                  'target': None}}],
           'views': None}],
 'group': {'mediaType': 'voice',
           'queueId': 'a72dba75-0bc6-4a65-b120-8803364f8dc3'}}]

I need to convert it to the following format:

nOffered_count  nOffered_sum   tAnswered_count  tAnswered_sum   tTalk_count   tTalk_sum
17              None           17               156929.0        29            8402252.0

Tags: none, stats, count, current, min, metric, max, sum
2 Answers

The data is two dicts nested inside three lists, so unwrap it one level at a time:

import pandas as pd

lst1 = data               # the list from the question; holds 1 dict (dict1)
dict1 = lst1[0]           # dict_keys(['data', 'group'])
lst2 = dict1['data']      # second list; holds 1 dict (dict2)
dict2 = lst2[0]           # dict_keys(['interval', 'metrics', 'views'])
lst3 = dict2['metrics']   # third list; holds 3 "metrics" dicts
                          # (nOffered, tAnswered, tTalk)

# Each metric dict becomes a small frame indexed by the stat names
df = pd.concat([pd.DataFrame(k) for k in lst3])

# Keep only the 'count' and 'sum' rows; replace NaN with the string 'None'
df2 = df.loc[['count', 'sum']].fillna('None')

# Build labels such as 'nOffered_count' from the metric name and the stat name
df2['metric'] = df2['metric'] + "_" + df2.index
df2 = df2.set_index('metric')           # use the combined label as the index
df2 = df2.drop(columns=['qualifier'])   # drop the unused column
print(df2.T)                            # transpose so the labels become columns
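
As a possible alternative to unwrapping the lists by hand, pandas.json_normalize can follow the data -> metrics path directly. The following is only a sketch under some assumptions: it uses a trimmed copy of the question's data (same nesting, but only the 'count' and 'sum' stats are kept), and the names flat, row, and result are made up for this example. A stat that is None in the input ends up as a missing value in the result.

```python
import pandas as pd

# Trimmed copy of the question's data: same nesting, fewer 'stats' keys.
data = [{'data': [{'interval': '2021-09-22T09:13:57.000Z/2021-09-29T09:13:57.000Z',
                   'metrics': [
                       {'metric': 'nOffered', 'qualifier': None,
                        'stats': {'count': 17, 'sum': None}},
                       {'metric': 'tAnswered', 'qualifier': None,
                        'stats': {'count': 17, 'sum': 156929.0}},
                       {'metric': 'tTalk', 'qualifier': None,
                        'stats': {'count': 29, 'sum': 8402252.0}}],
                   'views': None}],
         'group': {'mediaType': 'voice',
                   'queueId': 'a72dba75-0bc6-4a65-b120-8803364f8dc3'}}]

# One row per metric; nested 'stats' keys become 'stats.count', 'stats.sum'.
flat = pd.json_normalize(data, record_path=['data', 'metrics'])

# Collect '<metric>_<stat>' -> value pairs for the two stats of interest.
row = {}
for _, rec in flat.iterrows():
    for stat in ('count', 'sum'):
        row[f"{rec['metric']}_{stat}"] = rec[f'stats.{stat}']

# A single dict yields a single-row frame with one column per label.
result = pd.DataFrame([row])
print(result)
```

Because the columns are collected metric by metric, they come out in the order asked for in the question (nOffered_count, nOffered_sum, tAnswered_count, ...), rather than grouped by stat.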

This approach helps if you have multiple data entries:

lst = [{'data': [{'interval': '2021-09-22T09:13:57.000Z/2021-09-29T09:13:57.000Z',
           'metrics': [{'metric': 'nOffered',
                        'qualifier': None,
                        'stats': {'count': 17,
                                  'count_negative': None,
                                  'count_positive': None,
                                  'current': None,
                                  'denominator': None,
                                  'max': None,
                                  'min': None,
                                  'numerator': None,
                                  'ratio': None,
                                  'sum': None,
                                  'target': None}},
                       {'metric': 'tAnswered',
                        'qualifier': None,
                        'stats': {'count': 17,
                                  'count_negative': None,
                                  'count_positive': None,
                                  'current': None,
                                  'denominator': None,
                                  'max': 17327.0,
                                  'min': 4569.0,
                                  'numerator': None,
                                  'ratio': None,
                                  'sum': 156929.0,
                                  'target': None}},
                       {'metric': 'tTalk',
                        'qualifier': None,
                        'stats': {'count': 29,
                                  'count_negative': None,
                                  'count_positive': None,
                                  'current': None,
                                  'denominator': None,
                                  'max': 2650757.0,
                                  'min': 2124.0,
                                  'numerator': None,
                                  'ratio': None,
                                  'sum': 8402252.0,
                                  'target': None}}],
           'views': None}],
 'group': {'mediaType': 'voice',
           'queueId': 'a72dba75-0bc6-4a65-b120-8803364f8dc3'}}]

import pandas as pd

final_lst = []
for item in lst:
    # One dict per top-level entry, so the column names are not
    # repeated when lst holds more than one entry
    row = {}
    for data in item['data']:
        for metric in data['metrics']:
            metric_name = metric['metric']
            row[metric_name + '_count'] = metric['stats']['count']
            row[metric_name + '_sum'] = metric['stats']['sum']
    final_lst.append(row)

# The dict keys become the column names
df = pd.DataFrame(final_lst)
print(df)

   nOffered_count nOffered_sum  tAnswered_count  tAnswered_sum  tTalk_count  
0              17         None               17       156929.0           29   
   tTalk_sum  
0  8402252.0 
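
To see how this behaves with more than one entry, here is a small sketch. Everything in it is an assumption made for illustration: the helper name metrics_to_row, the two made-up entries (trimmed to the keys the loop reads), and the placeholder queue IDs 'queue-a' and 'queue-b'. It also keys each row by queueId so the rows can be told apart, and it assumes each entry carries a single interval, since repeated metric names within one entry would overwrite each other.

```python
import pandas as pd

def metrics_to_row(item, stats=('count', 'sum')):
    """Flatten one top-level entry into a {column: value} dict."""
    row = {'queueId': item['group']['queueId']}
    for block in item['data']:
        for metric in block['metrics']:
            for stat in stats:
                # .get() returns None if a stat key is absent
                row[f"{metric['metric']}_{stat}"] = metric['stats'].get(stat)
    return row

# Two made-up entries with the same shape as the question's data
entries = [
    {'data': [{'metrics': [
        {'metric': 'nOffered', 'stats': {'count': 17, 'sum': None}},
        {'metric': 'tTalk', 'stats': {'count': 29, 'sum': 8402252.0}}]}],
     'group': {'queueId': 'queue-a'}},
    {'data': [{'metrics': [
        {'metric': 'nOffered', 'stats': {'count': 5, 'sum': None}}]}],
     'group': {'queueId': 'queue-b'}},
]

# One row per entry; metrics missing from an entry are left as missing values
df = pd.DataFrame([metrics_to_row(e) for e in entries]).set_index('queueId')
print(df)
```

Building each row as a dict lets pandas line the columns up by name, so entries that report different sets of metrics still land in one frame.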
