Stack/unstack does not preserve data order in Python

Posted 2024-10-04 01:34:39

I have the JSON data below:

[[{"value":"ZZ","formattedValue":"ZZ"},{"value":"In","formattedValue":"In"},{"value":"Amount1","formattedValue":"Amount1"},{"value":"100","formattedValue":"100"}],[{"value":"ZZ","formattedValue":"ZZ"},{"value":"In","formattedValue":"In"},{"value":"Amount2","formattedValue":"Amount2"},{"value":"200","formattedValue":"200"}],[{"value":"ZZ","formattedValue":"ZZ"},{"value":"Out","formattedValue":"Out"},{"value":"Amount1","formattedValue":"Amount1"},{"value":"30","formattedValue":"30"}],[{"value":"ZZ","formattedValue":"ZZ"},{"value":"Out","formattedValue":"Out"},{"value":"Amount2","formattedValue":"Amount2"},{"value":"4","formattedValue":"40"}],[{"value":"CC","formattedValue":"CC"},{"value":"In","formattedValue":"In"},{"value":"Amount1","formattedValue":"Amount1"},{"value":"100","formattedValue":"100"}],[{"value":"CC","formattedValue":"CC"},{"value":"In","formattedValue":"In"},{"value":"Amount2","formattedValue":"Amount2"},{"value":"200","formattedValue":"200"}],[{"value":"CC","formattedValue":"CC"},{"value":"Out","formattedValue":"Out"},{"value":"Amount1","formattedValue":"Amount1"},{"value":"30","formattedValue":"30"}],[{"value":"CC","formattedValue":"CC"},{"value":"Out","formattedValue":"Out"},{"value":"Amount2","formattedValue":"Amount2"},{"value":"4","formattedValue":"40"}]]

In tabular form, it should look like this:

Dimension  Type  Amount1  Amount2
ZZ         In    100      200
ZZ         Out   30       40
CC         In    100      200
CC         Out   30       40

However, when I convert the JSON with the Python code below, the order of the incoming data is not preserved:

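import json
import pandas as pd

data = 'jsondata'  # placeholder for the JSON string shown above
data = json.loads(data)
# Keep only formattedValue from every cell
df = pd.DataFrame(data).stack().map(lambda x: x.get('formattedValue')).unstack()
df.columns = ['Type', 'InOut', 'MeasureName', 'MeasureValue']
# Pivot the measure names into columns (pivot_table sorts the index keys by default)
df = df.pivot_table(index=['Type', 'InOut'], columns=['MeasureName'], values='MeasureValue', aggfunc='sum').reset_index()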

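The output is shown below. As you can see, the dimensions come out sorted, which I do not want. I need to keep the rows in the order in which the data arrived. How can I do this? Thanks.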

Type  InOut  Amount1  Amount2
CC    In     100      200
CC    Out    30       40
ZZ    In     100      200
ZZ    Out    30       40

3 Answers

Here is another solution: clean the data before loading it into pandas:

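import json
import pandas as pd
data = json.loads(s)  # s holds the JSON string from the question
data = [[el.get('formattedValue') for el in row] for row in data]
# Rows are still in long format (one measure per row) and keep their input order
df = pd.DataFrame(data, columns=['Type', 'InOut', 'MeasureName', 'MeasureValue'])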
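The cleaned rows are still in long format, so a pivot is still needed to reach the table in the question. One possible way to do that without losing the input order is to record the key pairs in order of first appearance and reindex after pivoting. This is only a sketch: the `rows` list is typed in by hand from the question's `formattedValue` fields, the names `order` and `wide` are made up for illustration, and it assumes each (Type, InOut, MeasureName) combination occurs exactly once and that pandas 1.1 or later is available (needed for a list-valued `index` in `pivot`).

```python
import pandas as pd

# Long-format rows, hand-copied from the question's formattedValue fields
rows = [
    ["ZZ", "In", "Amount1", "100"], ["ZZ", "In", "Amount2", "200"],
    ["ZZ", "Out", "Amount1", "30"], ["ZZ", "Out", "Amount2", "40"],
    ["CC", "In", "Amount1", "100"], ["CC", "In", "Amount2", "200"],
    ["CC", "Out", "Amount1", "30"], ["CC", "Out", "Amount2", "40"],
]
df = pd.DataFrame(rows, columns=["Type", "InOut", "MeasureName", "MeasureValue"])

# Remember the (Type, InOut) pairs in order of first appearance
order = pd.MultiIndex.from_frame(df[["Type", "InOut"]].drop_duplicates())

# pivot may reorder the rows; reindex puts them back in input order
wide = (
    df.pivot(index=["Type", "InOut"], columns="MeasureName", values="MeasureValue")
    .reindex(order)
    .reset_index()
)
print(wide)
```

Because the order is restored explicitly, the result does not depend on how the pivot itself orders its index.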

Another solution:

df = pd.DataFrame(
    [
        {
            "Dimension": subl1[0]["formattedValue"],
            "Type": subl1[1]["formattedValue"],
            "Amount1": subl1[-1]["formattedValue"],
            "Amount2": subl2[-1]["formattedValue"],
        }
        for subl1, subl2 in zip(data[::2], data[1::2])
    ]
)
print(df)

Prints:

  Dimension Type Amount1 Amount2
0        ZZ   In     100     200
1        ZZ  Out      30      40
2        CC   In     100     200
3        CC  Out      30      40
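The `zip(data[::2], data[1::2])` version relies on every Amount1 row being immediately followed by its Amount2 row. If that cannot be guaranteed, a variant that groups rows by key in a plain dict might be safer, since dicts keep insertion order in Python 3.7+. The sketch below uses a small hypothetical `raw` sample in the same shape as the question's JSON (with one pair deliberately out of order); `records` and `rec` are illustrative names.

```python
import pandas as pd

# Hypothetical stand-in for the parsed JSON, same cell shape as the question's data
raw = [("ZZ", "In", "Amount1", "100"), ("ZZ", "In", "Amount2", "200"),
       ("CC", "Out", "Amount2", "40"), ("CC", "Out", "Amount1", "30")]
data = [[{"value": v, "formattedValue": v} for v in row] for row in raw]

records = {}  # keyed by (dimension, in/out); insertion order is preserved
for row in data:
    dim, in_out, name, value = (cell["formattedValue"] for cell in row)
    # Create the output row the first time a key pair is seen
    rec = records.setdefault((dim, in_out), {"Dimension": dim, "Type": in_out})
    rec[name] = value  # the measure name becomes the column

df = pd.DataFrame(list(records.values()))
print(df)
```

Rows come out in the order their key pair first appeared, regardless of where the individual measures sit in the input.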

IIUC:

You can try this.

Before calling pivot_table, create a variable:

uni=df['Type'].unique()

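After pivoting, use (a stable sort keeps rows of the same Type in their existing relative order):

df = df.loc[df['Type'].map(dict(zip(uni, range(len(uni))))).sort_values(kind='stable').index]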

Output of df:

MeasureName Type InOut
2             ZZ    In
3             ZZ   Out
0             CC    In
1             CC   Out