Dynamically creating an xarray Dataset from multiple files

Posted 2024-10-01 09:30:44


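I want to create an xarray Dataset from several input files. Each file holds one timestamp, one level, and one source. An example with 2 timestamps, 2 levels, and 3 sources: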

import numpy as np
import pandas as pd
import xarray as xr

d1 = np.arange(0,9).reshape((3,3))   # 01:00, 3m, SG1 -> 00001-101.con
d2 = np.arange(10,19).reshape((3,3)) # 01:00, 3m, SG2 -> 00001-102.con
d3 = np.arange(20,29).reshape((3,3)) # 01:00, 3m, SG3 -> 00001-103.con
d4 = np.arange(60,69).reshape((3,3)) # 01:00, 10m, SG1 -> 00001-201.con
d5 = np.arange(70,79).reshape((3,3)) # 01:00, 10m, SG2 -> 00001-202.con
d6 = np.arange(80,89).reshape((3,3)) # 01:00, 10m, SG3 -> 00001-203.con

e1 = np.arange(100,109).reshape((3,3)) # 02:00, 3m, SG1 -> 00002-101.con
e2 = np.arange(110,119).reshape((3,3)) # 02:00, 3m, SG2 -> 00002-102.con
e3 = np.arange(120,129).reshape((3,3)) # 02:00, 3m, SG3 -> 00002-103.con
e4 = np.arange(160,169).reshape((3,3)) # 02:00, 10m, SG1 -> 00002-201.con
e5 = np.arange(170,179).reshape((3,3)) # 02:00, 10m, SG2 -> 00002-202.con
e6 = np.arange(180,189).reshape((3,3)) # 02:00, 10m, SG3 -> 00002-203.con


dstk = np.stack((d1,d2,d3,d4,d5,d6)) # 01:00, both levels all sgs -> 00001-101.con - 00001-203.con
estk = np.stack((e1,e2,e3,e4,e5,e6)) # 02:00, both levels all sgs -> 00002-101.con - 00002-203.con

I managed to build the dataset by hand in the form I need, like this:

xx = [100,200,300]
yy = [600,700,800]

dds1 = xr.Dataset(data_vars={"SG1":(("x","y"),dstk[0]),"SG2":(("x","y"),dstk[1]),"SG3":(("x","y"),dstk[2])},coords={"x":xx,"y":yy,"lvl":"3m","t":pd.Timestamp("2010-01-01 01:00:00")})
dds2 = xr.Dataset(data_vars={"SG1":(("x","y"),dstk[3]),"SG2":(("x","y"),dstk[4]),"SG3":(("x","y"),dstk[5])},coords={"x":xx,"y":yy,"lvl":"10m","t":pd.Timestamp("2010-01-01 01:00:00")})

eds1 = xr.Dataset(data_vars={"SG1":(("x","y"),estk[0]),"SG2":(("x","y"),estk[1]),"SG3":(("x","y"),estk[2])},coords={"x":xx,"y":yy,"lvl":"3m","t":pd.Timestamp("2010-01-01 02:00:00")})
eds2 = xr.Dataset(data_vars={"SG1":(("x","y"),estk[3]),"SG2":(("x","y"),estk[4]),"SG3":(("x","y"),estk[5])},coords={"x":xx,"y":yy,"lvl":"10m","t":pd.Timestamp("2010-01-01 02:00:00")})

td1 = xr.concat([dds1,dds2],dim="lvl")
td2 = xr.concat([eds1,eds2],dim="lvl")

td_final = xr.concat([td1,td2],dim="t")

This gives me:

<xarray.Dataset>
Dimensions:  (lvl: 2, t: 2, x: 3, y: 3)
Coordinates:
  * x        (x) int32 100 200 300
  * y        (y) int32 600 700 800
  * lvl      (lvl) <U3 '3m' '10m'
  * t        (t) datetime64[ns] 2010-01-01T01:00:00 2010-01-01T02:00:00
Data variables:
    SG1      (t, lvl, x, y) int32 0 1 2 3 4 5 6 7 8 60 61 62 63 64 65 66 67 ...
    SG2      (t, lvl, x, y) int32 10 11 12 13 14 15 16 17 18 70 71 72 73 74 ...
    SG3      (t, lvl, x, y) int32 20 21 22 23 24 25 26 27 28 80 81 82 83 84 ...

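However, this seems far too complicated, and I would like to create the dataset dynamically, e.g. by looping over lists of timestamps, source groups, and levels.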
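For reference, here is a minimal sketch of what such a loop might look like. It is only one possible approach, not a definitive answer: the real .con files are not read here, and `load_array` is a hypothetical stand-in that reproduces the synthetic arrays above (d1..d6, e1..e6) from zero-based time, level, and source indices. The name `td_dynamic` is likewise made up for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

xx = [100, 200, 300]
yy = [600, 700, 800]
times = pd.to_datetime(["2010-01-01 01:00:00", "2010-01-01 02:00:00"])
levels = ["3m", "10m"]
sources = ["SG1", "SG2", "SG3"]


def load_array(t_idx, lvl_idx, src_idx):
    """Hypothetical placeholder for reading one file such as 00001-101.con.

    Returns the same synthetic 3x3 arrays used in the question.
    """
    start = 100 * t_idx + 60 * lvl_idx + 10 * src_idx
    return np.arange(start, start + 9).reshape((3, 3))


data_vars = {}
for s, src in enumerate(sources):
    # Stack levels for each timestamp, then stack timestamps:
    # the result has shape (t, lvl, x, y).
    cube = np.stack([
        np.stack([load_array(t, l, s) for l in range(len(levels))])
        for t in range(len(times))
    ])
    data_vars[src] = (("t", "lvl", "x", "y"), cube)

td_dynamic = xr.Dataset(
    data_vars,
    coords={"t": times, "lvl": levels, "x": xx, "y": yy},
)
print(td_dynamic)
```

Building one 4-D array per source and passing everything to a single `xr.Dataset` call avoids the nested `xr.concat` steps. If the files are too large to hold in memory at once, concatenating per-file datasets may still be the better fit.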

