How do I turn dictionary values into columns of a DataFrame?

Posted 2024-07-04 08:06:42


I currently have a dictionary, dicts, that looks like this (code snippet):

{'Axa':          w      x     y     z
0     9.597307   8.533429  43.4  Axa
6     0.000000   4.631714  32.0  Axa
17    0.662168   6.271585  37.7  Axa
..         ...        ...   ...        ...
171   4.023485   9.104185  28.2  Axa
172   0.846931   5.703871  38.8  Axa
174  20.063263   6.436114  27.7  Axa

[66 rows x 4 columns]}
{'Bxa':         w      x    y         z
1     0.454497   5.443401  43.6  Bxa
3     0.086371   4.869583  42.3  Bxa
4     2.264084   7.330367  36.6  Bxa
5     7.312782  12.418908  38.0  Bxa
8    10.935617   1.474324  43.5  Bxa

[29 rows x 4 columns]} 

It is a dictionary with keys = {Axa, Bxa, Cxa, Dxa}, and the values would be the columns w through z.

type(dicts) gives:

<class 'dict'>
<class 'dict'>
<class 'dict'>
<class 'dict'>
<class 'dict'>
<class 'dict'>

I would like to get a DataFrame that looks like this:

| w          |    x      |   y     |    z | 
| --------   | ----------| --------| -----|
0  9.597307  | 8.533429  |   43.4  |   Axa|
1  0.000000  |  4.631714 |   32.0  |  Axa |
2  0.662168  |  6.271585 |   37.7  |  Axa |
...
63  4.023485 |  9.104185 |  28.2   |   Axa|
64   0.846931|   5.703871|  38.8   |  Axa |
65  20.063263|   6.436114| 27.7    | Axa  |
67  0.454497 |   5.443401|   43.6  |   Bxa|
68  0.086371 |  4.869583 | 42.3    | Bxa  |
69  2.264084 |   7.330367|  36.6   | Bxa  |

I tried this:

df = pd.DataFrame(list(dicts.values()), columns = ['w', 'x', 'y', 'z'])

But I get:

ValueError: Must pass 2-d input. shape=(1, 66, 4)

Normally we use the keys as column names and the values as column values, but in this case I want my values to serve as both. How can I do that?

Here is my full code:

for ph in data.model.unique():
    
    dicts = {}

    "This loop aims to extract outliers from the dataset using Gaussian mixture models for each phone model and create new df"
    data = data[data.model==ph]
    data = data[['r_var',  'b_var', 'SPAD', 'model']]
    data = data[['r_var', 'b_var']].values

    probs = gmm.score_samples(data)

    probs_mean, probs_sd = mean(probs), std(probs)
    cut_off = probs_sd * 2
    lower, upper = probs_mean - cut_off, probs_mean + cut_off
    not_outliers = data[probs > lower]

    # append to dicts
    dicts[ph] = not_outliers
    df = pd.concat(dicts).reset_index(drop=True)
    print(df)

Tags: columns df data model var mean ph dict
2 answers

If the values of the dict are DataFrames, you can use concat:

df = pd.concat(list(dicts.values()),ignore_index=True)
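As a minimal sketch of the call above: the small `dicts` below is a hypothetical stand-in built from a few rows shown in the question, not the asker's real data. It assumes every value is a DataFrame with the same four columns.

```python
import pandas as pd

# Hypothetical stand-in for the question's `dicts`: each value is a DataFrame
# that still carries its original (non-contiguous) row labels.
dicts = {
    "Axa": pd.DataFrame(
        {"w": [9.597307, 0.000000], "x": [8.533429, 4.631714],
         "y": [43.4, 32.0], "z": ["Axa", "Axa"]},
        index=[0, 6],
    ),
    "Bxa": pd.DataFrame(
        {"w": [0.454497, 0.086371], "x": [5.443401, 4.869583],
         "y": [43.6, 42.3], "z": ["Bxa", "Bxa"]},
        index=[1, 3],
    ),
}

# Stack the frames row-wise; ignore_index=True discards the old row labels
# and numbers the result 0..n-1.
df = pd.concat(list(dicts.values()), ignore_index=True)
print(df)
```

The dict keys are dropped here, which is fine in this case because the z column already records which group each row came from.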

Use concat together with DataFrame.reset_index:

df = pd.concat(dicts).reset_index(drop=True)
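To illustrate what passing the dict itself does, here is a small sketch using a hypothetical two-key dictionary named `frames` (not from the question). The dict keys become an outer index level, which `reset_index(drop=True)` then throws away.

```python
import pandas as pd

# Hypothetical example data; the names and values are for illustration only.
frames = {
    "Axa": pd.DataFrame({"w": [9.6, 0.0], "z": ["Axa", "Axa"]}, index=[0, 6]),
    "Bxa": pd.DataFrame({"w": [0.45], "z": ["Bxa"]}, index=[1]),
}

# Passing a dict to concat uses its keys as the outer level of a MultiIndex.
stacked = pd.concat(frames)
print(stacked.index.nlevels)

# drop=True removes both index levels instead of turning them into columns,
# leaving a plain 0..n-1 index.
flat = stacked.reset_index(drop=True)
print(flat)
```

If the keys were not already stored in a column, `reset_index()` without `drop=True` would be one way to keep them as regular columns instead.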

Edit:

Your solution needs dicts = {} and the concat call moved outside the loop:

# Assumes pd, data, gmm, mean and std are already defined as in the question.
dicts = {}
for ph in data.model.unique():

    # Extract non-outliers for each phone model using the Gaussian mixture model.
    # Use a separate name (dataf) so the full `data` frame is not overwritten.
    dataf = data[data.model==ph]
    dataf = dataf[['r_var',  'b_var', 'SPAD', 'model']]
    X = dataf[['r_var', 'b_var']].values

    probs = gmm.score_samples(X)

    probs_mean, probs_sd = mean(probs), std(probs)
    cut_off = probs_sd * 2
    lower, upper = probs_mean - cut_off, probs_mean + cut_off
    # Filter the DataFrame (not the NumPy array) so concat receives DataFrames
    not_outliers = dataf[probs > lower]

    # append to dicts
    dicts[ph] = not_outliers
    
df = pd.concat(dicts).reset_index(drop=True)
print(len(df))
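The pattern in the loop above can be tried out on synthetic data. The sketch below is only an illustration under stated assumptions: the random `data` frame is made up, and `score_samples` is a hypothetical stand-in for `gmm.score_samples` (it is not a Gaussian mixture model, just a score that is higher for points nearer the group mean).

```python
import numpy as np
import pandas as pd

# Made-up data with the column names used in the question.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "r_var": rng.normal(size=40),
    "b_var": rng.normal(size=40),
    "model": ["A"] * 20 + ["B"] * 20,
})

def score_samples(X):
    # Hypothetical stand-in for gmm.score_samples: higher means more typical.
    return -((X - X.mean(axis=0)) ** 2).sum(axis=1)

kept = {}                                   # created once, before the loop
for ph in data["model"].unique():
    group = data[data["model"] == ph]       # new name, so `data` stays intact
    scores = score_samples(group[["r_var", "b_var"]].to_numpy())
    lower = scores.mean() - 2 * scores.std()
    kept[ph] = group[scores > lower]        # mask the DataFrame, not the array

# A single concat after the loop combines every group's surviving rows.
df = pd.concat(kept).reset_index(drop=True)
print(len(df))
```

Masking `group` rather than the NumPy array keeps each dict value a DataFrame, which is what `pd.concat` expects.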
