Aligning pandas DataFrames into a Panel

Posted 2024-09-30 03:23:12


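I have 12 dataframes of the same shape for 12 years of data collection. I need to use this as a panel to plot the various column values across the time series axis (years). So I figured I should align these frames as a panel.

  1. Can someone help me align the dataframes?
  2. Is this the right way to prepare the data for plotting along a third dimension?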


Some sample data:

# for 2015
Grave Crimes    Cases Recorded  Mistake of Law fact
Abduction       725             3
Kidnapping      246             6
Arson           466             1
Mischief        436             1
House Breaking  12707           21
Grievous Hurt   1299            3

# for 2016
Grave Crimes    Cases Recorded  Mistake of Law fact
Abduction       738             4
Kidnapping      297             9
Arson           486             4
Mischief        394             1
House Breaking  10287           14
Grievous Hurt   1205            0

# for 2017
Grave Crimes    Cases Recorded  Mistake of Law fact
Abduction       647             2
Kidnapping      251             10
Arson           418             3
Mischief        424             0
House Breaking  8913            12
Grievous Hurt   1075            1

2 Answers

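Assuming your DataFrames are named something like df15, df16, df17, you can build a Panel from them like this (pd.Panel was deprecated in pandas 0.20 and removed in 0.25, so this requires an older pandas version):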

pnl = pd.Panel({2015: df15, 2016: df16, 2017: df17})
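
On pandas versions that no longer ship `pd.Panel`, a similar year-keyed structure can be approximated with `pd.concat` and a dict of frames. The following is only a minimal sketch under assumptions: the frames are rebuilt here from two rows of the question's sample data, and the names `frames`, `stacked` and `table_2016` are illustrative, not part of the original answer.

```python
import pandas as pd

# Two rows per year, copied from the sample data in the question
df15 = pd.DataFrame({"Grave Crimes": ["Abduction", "Kidnapping"],
                     "Cases Recorded": [725, 246],
                     "Mistake of Law fact": [3, 6]})
df16 = pd.DataFrame({"Grave Crimes": ["Abduction", "Kidnapping"],
                     "Cases Recorded": [738, 297],
                     "Mistake of Law fact": [4, 9]})
df17 = pd.DataFrame({"Grave Crimes": ["Abduction", "Kidnapping"],
                     "Cases Recorded": [647, 251],
                     "Mistake of Law fact": [2, 10]})

# Hypothetical container: one frame per year, as with the Panel's dict argument
frames = {2015: df15, 2016: df16, 2017: df17}

# The dict keys become the outer index level, playing the role of the Panel's item axis
stacked = pd.concat(
    {year: frame.set_index("Grave Crimes") for year, frame in frames.items()},
    names=["year", "Grave Crimes"],
)

# Rough counterpart of pnl[2016]: the full table for a single year
table_2016 = stacked.loc[2016]
```

With 12 yearly frames, the same dict can be filled in a loop, so the approach does not depend on the number of years.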

After that, you can produce the 3D plot you mention in the question like this:

^{pr2}$

example of a 3D-plot of your data

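However, if I may offer a tip from my own experience about readable data visualization, one that I think many professionals would share:

Even when a dataset is three-dimensional or higher, a well-designed 2D plot is usually a good choice. 3D tends to be eye-catching, but for informing a target audience and showing specific properties of the data, 2D is almost always enough. With that in mind, Ami Tavory's approach is the better one, because the data structure is easier to work with: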

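import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Tag each yearly frame, then stack them under a (crime, year) MultiIndex
df15['year'] = 2015
df16['year'] = 2016
df17['year'] = 2017
df = pd.concat([df15, df16, df17]).set_index(['Grave Crimes', 'year'])

f, ax = plt.subplots(1)
for i, y in enumerate(range(2015, 2018)):
    # One bar per crime for this year, shifted right by one bar width per year
    data = df.groupby('year').get_group(y)['Cases Recorded']
    ax.bar(np.arange(len(data)) + .2*i, data.values, width=.2, label=str(y))
ax.legend()
# Put one tick under the middle bar of each group and label it with the crime name
ax.set_xticks(np.arange(len(data)) + .2)
ax.set_xticklabels(data.index.get_level_values('Grave Crimes'), rotation=15)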

example for 2D-plot of your data
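
Since the question asks for column values along the year axis, it may also help to reshape the MultiIndexed frame into a wide table with one column per year. This is a hedged sketch, not part of the original answer: it rebuilds a small `df` from the question's sample values for two crimes, and `wide` and `by_year` are illustrative names.

```python
import pandas as pd

# 'Cases Recorded' for two crimes, copied from the question's sample data
df = pd.DataFrame({
    "Grave Crimes": ["Abduction", "Arson"] * 3,
    "year": [2015, 2015, 2016, 2016, 2017, 2017],
    "Cases Recorded": [725, 466, 738, 486, 647, 418],
}).set_index(["Grave Crimes", "year"])

# Move the 'year' level into the columns: one row per crime, one column per year
wide = df["Cases Recorded"].unstack("year")

# Transposed, each crime becomes a column indexed by year,
# which is the shape a per-crime time-series line plot expects
by_year = wide.T
```

Calling `by_year.plot()` on the result (with matplotlib installed) should draw one line per crime over the years.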

While a Panel does let you add a dimension, hierarchical indexing is the more common alternative. E.g., from the Python Data Science Handbook:

While Pandas does provide Panel and Panel4D objects that natively handle three-dimensional and four-dimensional data (see Aside: Panel Data), a far more common pattern in practice is to make use of hierarchical indexing (also known as multi-indexing) to incorporate multiple index levels within a single index. In this way, higher-dimensional data can be compactly represented within the familiar one-dimensional Series and two-dimensional DataFrame objects.

In your case:

I have 12 dataframes of the same shape for 12 years of data collection. I need to use this as a panel to plot the various column values across the time series axis (years).

Suppose your DataFrames are in df_2015, df_2016, and so on, one per year. You can then do the following:

df_2015['year'] = 2015
df_2016['year'] = 2016
df_2017['year'] = 2017
df = pd.concat([df_2015, df_2016, df_2017]).set_index(['Grave Crimes', 'year'])

Now, to get the data for 'Abduction' across all years, for example, use

^{pr2}$
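
One way to make this selection, not necessarily the one the answer had in mind, is `DataFrame.xs` on the 'Grave Crimes' index level. The sketch below assumes a `df` built as above; here it is reconstructed from two crimes of the question's sample data so that it runs on its own, and `abduction` is an illustrative name.

```python
import pandas as pd

# Two crimes over three years, copied from the question's sample data
df = pd.DataFrame({
    "Grave Crimes": ["Abduction", "Kidnapping"] * 3,
    "year": [2015, 2015, 2016, 2016, 2017, 2017],
    "Cases Recorded": [725, 246, 738, 297, 647, 251],
    "Mistake of Law fact": [3, 6, 4, 9, 2, 10],
}).set_index(["Grave Crimes", "year"])

# Cross-section on the crime level: the result is indexed by year only
abduction = df.xs("Abduction", level="Grave Crimes")
```

The result has one row per year, so a single column such as `abduction["Cases Recorded"]` is already a series over the time axis.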
