Fitting a polynomial to chunks of a large CMIP6 NetCDF xarray dataset?

Posted 2024-10-01 00:20:52


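My goal is to compute a quadratic fit along the time dimension of the "zos" (sea level) data from the piControl runs of various CMIP6 models, and to subtract that fit from other runs. In this particular example, the data looks like this: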

Out[108]: 
<xarray.Dataset>
Dimensions:    (bnds: 2, ncells: 830305, time: 3, vertices: 16)
Coordinates:
  * time       (time) object 2401-01-16 12:00:00 ... 2401-03-16 12:00:00
    lon        (ncells) float64 dask.array<chunksize=(5000,), meta=np.ndarray>
    lat        (ncells) float64 dask.array<chunksize=(5000,), meta=np.ndarray>
Dimensions without coordinates: bnds, ncells, vertices
Data variables:
    time_bnds  (time, bnds) object dask.array<chunksize=(3, 2), meta=np.ndarray>
    lon_bnds   (ncells, vertices) float64 dask.array<chunksize=(5000, 16), meta=np.ndarray>
    lat_bnds   (ncells, vertices) float64 dask.array<chunksize=(5000, 16), meta=np.ndarray>
    zos        (time, ncells) float32 dask.array<chunksize=(3, 5000), meta=np.ndarray>
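To make the goal concrete, here is a minimal NumPy sketch of the intended operation on a single time series. Everything in it is synthetic and assumed for illustration (the drift coefficients, the noise level, the names `control` and `other_run`); it is not CMIP6 data and not necessarily how the final workflow should look.

```python
import numpy as np

# Synthetic stand-ins (assumptions, not CMIP6 data): a control run with a slow
# quadratic drift, and a second run with the same drift plus a seasonal signal.
time = np.arange(120, dtype=float)               # e.g. months since start
drift = 0.5 + 0.01 * time + 1e-4 * time**2
rng = np.random.default_rng(0)
control = drift + 0.001 * rng.standard_normal(time.size)
signal = 0.2 * np.sin(2 * np.pi * time / 12)
other_run = drift + signal

# Fit a degree-2 polynomial to the control run along time ...
coeffs = np.polyfit(time, control, 2)            # coefficients, highest power first
fitted_drift = np.polyval(coeffs, time)

# ... and subtract the fitted drift from the other run.
dedrifted = other_run - fitted_drift
```

The real problem is the same computation repeated for each of the 830305 cells, which is where chunking comes in.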

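I tried xarray.Dataset.polyfit(), but that raises AttributeError: 'Dataset' object has no attribute 'polyfit'. This may be because CMIP6 models use irregular calendars and model years outside the supported range (e.g. "Unable to decode time axis into full numpy.datetime64 objects, continuing using cftime.datetime objects instead, reason: dates out of range"). That is also why I use decode_times=False in the code snippet below.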
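For reference, a hedged sketch of what the built-in route looks like when it is available. As far as I know, `polyfit` and `xr.polyval` were added in xarray 0.16.0, so the AttributeError may simply point to an older installation rather than to the calendar; that is an assumption worth checking with `xr.__version__`. The array below is synthetic, with a numeric time axis like the one `decode_times=False` produces.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for zos with a numeric time axis (shape and values are
# assumptions for illustration, not the real dataset).
time = np.arange(24, dtype=float)
truth = 1.0 + 0.1 * time + 0.01 * time**2
zos = xr.DataArray(
    np.tile(truth[:, None], (1, 5)),
    dims=("time", "ncells"),
    coords={"time": time},
    name="zos",
)

# For a DataArray, polyfit returns a Dataset whose coefficients are stored in
# "polyfit_coefficients" along a "degree" dimension.
fit = zos.polyfit(dim="time", deg=2)
coeffs = fit.polyfit_coefficients

# Evaluate the fitted polynomial on the original time axis and subtract it.
fitted = xr.polyval(zos.time, coeffs)
residual = zos - fitted
```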

Alternatively, I tried this solution:

from numpy.polynomial.polynomial import polyval,polyfit
import pandas as pd
import xarray as xr
import dask.array as da
import numpy as np

xr.set_options(display_style="html")  # fancy HTML repr

mycmip6 = (
    xr.open_dataset('/Volumes/Naamloos/PhD_Data/CMIP6/raw_mergetime/zos/AWI-CM-1-1-MR/zos_Omon_AWI-CM-1-1-MR_piControl_r1i1p1f1_gn_240101-290012.nc',decode_times=False,chunks={'ncells':5000})
    .isel(time=slice(3))  # using a small slice for example
)
mycmip6

def fit_quadratic(data, time):
    pfit = np.polyfit(time, data, 2)
    return np.transpose(np.polyval(pfit, time))

pfit = xr.apply_ufunc(
    fit_quadratic,  # first the function
    mycmip6.zos,  # sea level data
    mycmip6.time,  # time
    input_core_dims=[["time"], ["time"]],  # list with one entry per arg
    output_core_dims=[["time"]],  # returned data has one dimension
    vectorize=True,  # loop over non-core dims
    dask="parallelized",
    output_dtypes=[mycmip6.zos.dtype],
)

This runs fine, but as soon as I try to operate on pfit or save it to NetCDF, for example with
pfit.compute()

I get the following error:

TypeError: wrong number of positional arguments: expected 2, got 3

When I don't use chunking at all, I can work with pfit, but for the largest datasets I run into memory problems.

What am I doing wrong?
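One direction that might be worth trying, sketched under assumptions: fit a whole block of cells in a single np.polyfit call instead of looping per cell with vectorize=True. This relies on np.polyfit accepting a 2-D y (one column per cell) and on the time dimension sitting in a single chunk. The helper name `fit_quadratic_block` and the synthetic array are hypothetical, and this does not diagnose the TypeError above; it only avoids that code path.

```python
import numpy as np
import xarray as xr

def fit_quadratic_block(data, time):
    """Quadratic fit along the last axis for every row of one block."""
    # apply_ufunc moves the core dim ("time") to the last axis, so a block
    # arrives as (..., ntime); flatten the leading dims to (ncells, ntime).
    flat = data.reshape(-1, data.shape[-1])
    # np.polyfit accepts y of shape (ntime, ncells) and fits every column.
    coeffs = np.polyfit(time, flat.T, 2)          # (3, ncells), highest power first
    # Evaluate with a Vandermonde matrix that uses the same power ordering.
    fitted = np.vander(time, 3) @ coeffs          # (ntime, ncells)
    return fitted.T.reshape(data.shape).astype(data.dtype)

# Synthetic, dask-chunked stand-in for mycmip6.zos (an assumption).
time = np.arange(12, dtype=float)
ncells = 20
offsets = np.linspace(0.0, 1.0, ncells)
values = offsets[None, :] + 0.1 * time[:, None] + 0.01 * time[:, None] ** 2
zos = xr.DataArray(
    values, dims=("time", "ncells"), coords={"time": time}
).chunk({"ncells": 5})

pfit = xr.apply_ufunc(
    fit_quadratic_block,
    zos,
    zos.time,
    input_core_dims=[["time"], ["time"]],
    output_core_dims=[["time"]],   # core dims come back as the last axis
    dask="parallelized",           # no vectorize: each block is fitted at once
    output_dtypes=[zos.dtype],
).transpose("time", "ncells")

result = pfit.compute()
```

Because each dask block is fitted independently, peak memory should scale with the chunk size along ncells rather than with the full grid, provided the result is written out or reduced block by block rather than loaded whole.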

