Resampling a Pandas Panel4D in Python

Posted 2024-09-26 22:50:05


I have some four-dimensional hydrological data at 12-hour intervals. I want to compute its daily mean with the following code:

>>> from datetime import datetime
>>> import pandas
>>> from netCDF4 import Dataset  # assumed source of Dataset; the post does not show its imports
>>> InNcFile = Dataset(InputFile, 'r')
>>> Time = InNcFile.variables['time'][:]
>>> Latitude = InNcFile.variables['lat'][:]
>>> Longitude = InNcFile.variables['lon'][:]
>>> ZLevel = InNcFile.variables['lvl'][:]
>>> SM = InNcFile.variables['sm'][:,:,:,:]
>>> DateTime = map(lambda x: datetime.strptime(x, '%Y%m%d%H%M'), Time)
>>> df_SMoist = pandas.Panel4D(SM, labels=DateTime, items=ZLevel, major_axis=Latitude, minor_axis=Longitude)
>>> SM.shape
(21, 4, 769, 1024)
>>> df_SMoist.shape
(21, 4, 769, 1024)
>>> df_MeanSM = df_SMoist.resample('D', how='mean', axis=0)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/projects/access/apps/pythonlib/pandas/0.12.0/pandas-0.12.0-py2.7-linux-x86_64.egg/pandas/core/generic.py", line 290, in resample
    return sampler.resample(self)
  File "/projects/access/apps/pythonlib/pandas/0.12.0/pandas-0.12.0-py2.7-linux-x86_64.egg/pandas/tseries/resample.py", line 83, in resample
    rs = self._resample_timestamps(obj)
  File "/projects/access/apps/pythonlib/pandas/0.12.0/pandas-0.12.0-py2.7-linux-x86_64.egg/pandas/tseries/resample.py", line 209, in _resample_timestamps
    grouped = obj.groupby(grouper, axis=self.axis)
  File "/projects/access/apps/pythonlib/pandas/0.12.0/pandas-0.12.0-py2.7-linux-x86_64.egg/pandas/core/panelnd.py", line 111, in func
    raise NotImplementedError
NotImplementedError

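Now, if I make the SM array three-dimensional with only one ZLevel (that is, if I use a Panel instead of a Panel4D), it works fine. Can you help me figure out what I am doing wrong?

Thanks.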


1 answer
Answer #1 · posted 2024-09-26 22:50:05

Panel4D does not (yet?) have an API as feature-rich as DataFrame's. You can work around this by loading your four-dimensional data into a two-dimensional DataFrame with a MultiIndex.

For example, if your SM, dates, zlevel, latitude, and longitude look like this:

import numpy as np
import pandas as pd

shape = (5,2,3,4)
SM = np.arange(np.prod(shape)).reshape(shape)
dates = pd.date_range('2000-1-1', periods=shape[0], freq='12H')
zlevel = np.arange(shape[1])
lat = np.arange(shape[2])
lng = np.arange(shape[3])

Then you can build a DataFrame with a MultiIndex like this:

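# One index entry per (dates, zlevel, lat, long) combination, in the same order as SM.ravel()
index = pd.MultiIndex.from_product([dates, zlevel, lat, lng], names=['dates', 'zlevel', 'lat', 'long'])
df = pd.DataFrame(SM.ravel(), index=index)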

To resample by date, the index must be a DatetimeIndex, TimedeltaIndex, or PeriodIndex, not a MultiIndex. So we need to move the zlevel, lat, and long index levels into the columns:

df = df.unstack(['zlevel', 'lat', 'long'])
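
As a quick check of this requirement, the sketch below rebuilds the toy data and confirms that resampling the MultiIndexed frame is rejected, while the unstacked frame has a plain DatetimeIndex. It assumes a reasonably recent pandas release; the names `long_df`, `wide`, and `multiindex_rejected` are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

# Same toy data as above: 5 time steps, 2 levels, 3 latitudes, 4 longitudes.
shape = (5, 2, 3, 4)
SM = np.arange(np.prod(shape)).reshape(shape)
dates = pd.date_range('2000-01-01', periods=shape[0],
                      freq=pd.Timedelta(hours=12))
index = pd.MultiIndex.from_product(
    [dates, np.arange(shape[1]), np.arange(shape[2]), np.arange(shape[3])],
    names=['dates', 'zlevel', 'lat', 'long'])
long_df = pd.DataFrame(SM.ravel(), index=index)

# Resampling needs a datetime-like index, so the MultiIndexed frame fails.
try:
    long_df.resample('D').mean()
    multiindex_rejected = False
except TypeError:
    multiindex_rejected = True

# Moving the non-time levels into the columns leaves only 'dates' in the index.
wide = long_df.unstack(['zlevel', 'lat', 'long'])
print(multiindex_rejected, type(wide.index).__name__, wide.shape)
```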

Now df looks like this:

In [87]: df
Out[87]: 
                      0                                           ...        \
zlevel                0                                           ...     1   
lat                   0                1                   2      ...     0   
long                  0   1   2   3    0    1    2    3    0    1 ...     2   
dates                                                             ...         
2000-01-01 00:00:00   0   1   2   3    4    5    6    7    8    9 ...    14   
2000-01-01 12:00:00  24  25  26  27   28   29   30   31   32   33 ...    38   
2000-01-02 00:00:00  48  49  50  51   52   53   54   55   56   57 ...    62   
2000-01-02 12:00:00  72  73  74  75   76   77   78   79   80   81 ...    86   
2000-01-03 00:00:00  96  97  98  99  100  101  102  103  104  105 ...   110   


zlevel                                                            
lat                         1                   2                 
long                   3    0    1    2    3    0    1    2    3  
dates                                                             
2000-01-01 00:00:00   15   16   17   18   19   20   21   22   23  
2000-01-01 12:00:00   39   40   41   42   43   44   45   46   47  
2000-01-02 00:00:00   63   64   65   66   67   68   69   70   71  
2000-01-02 12:00:00   87   88   89   90   91   92   93   94   95  
2000-01-03 00:00:00  111  112  113  114  115  116  117  118  119  

[5 rows x 24 columns]

Now we can resample by date:

In [88]: df.resample('D', how='mean', axis=0)
Out[88]: 
             0                                           ...                  \
zlevel       0                                           ...     1             
lat          0                1                   2      ...     0         1   
long         0   1   2   3    0    1    2    3    0    1 ...     2    3    0   
dates                                                    ...                   
2000-01-01  12  13  14  15   16   17   18   19   20   21 ...    26   27   28   
2000-01-02  60  61  62  63   64   65   66   67   68   69 ...    74   75   76   
2000-01-03  96  97  98  99  100  101  102  103  104  105 ...   110  111  112   


zlevel                                         
lat                          2                 
long          1    2    3    0    1    2    3  
dates                                          
2000-01-01   29   30   31   32   33   34   35  
2000-01-02   77   78   79   80   81   82   83  
2000-01-03  113  114  115  116  117  118  119  

[3 rows x 24 columns]
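
The `how=` keyword in the call above belongs to the older pandas API used in this thread; later pandas releases dropped it in favor of calling an aggregation method on the resampler. The following is a minimal sketch of the same workflow under that newer API, which also reshapes the daily means back into a 4-D NumPy array. The names `wide`, `daily`, and `daily_4d` are illustrative, and the final reshape assumes the unstacked columns are in sorted (zlevel, lat, long) order, which holds here because each coordinate array is already sorted.

```python
import numpy as np
import pandas as pd

# Toy data with the same layout as the answer: (time, zlevel, lat, long).
shape = (5, 2, 3, 4)
SM = np.arange(np.prod(shape)).reshape(shape)
dates = pd.date_range('2000-01-01', periods=shape[0],
                      freq=pd.Timedelta(hours=12))
zlevel, lat, lng = (np.arange(n) for n in shape[1:])

# Long format: one row per (dates, zlevel, lat, long) combination.
index = pd.MultiIndex.from_product([dates, zlevel, lat, lng],
                                   names=['dates', 'zlevel', 'lat', 'long'])
wide = pd.DataFrame(SM.ravel(), index=index).unstack(['zlevel', 'lat', 'long'])

# Newer resample API: pick the rule, then call the aggregation method.
daily = wide.resample('D').mean()

# Back to a 4-D array: (day, zlevel, lat, long).
daily_4d = daily.to_numpy().reshape((len(daily),) + shape[1:])
print(daily_4d.shape)
```

For an array the size of the one in the question, (21, 4, 769, 1024), the unstacked frame would have 4 × 769 × 1024 = 3,149,824 columns, so memory use is worth checking before applying this to the full data.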
