"Bracket nesting level exceeded maximum" when working with large datasets in PyMC3

Posted 2024-09-24 22:20:07


I have a recursive model in PyMC3 that I am training on about 600 time steps. I get this error:

Exception: ('Compilation failed (return status=1): /Users/tiwalayoaina/.theano/compiledir_macOS-10.14.6-x86_64-i386-64bit-i386-3.8.8-64/tmp1dgcmqm3/mod.cpp:30412:32: fatal error: bracket nesting level exceeded maximum of 256.         if (!PyErr_Occurred()) {.                                ^. /Users/actinidia/.theano/compiledir_macOS-10.14.6-x86_64-i386-64bit-i386-3.8.8-64/tmp1dgcmqm3/mod.cpp:30412:32: note: use -fbracket-depth=N to increase maximum nesting level. 1 error generated.. ', "FunctionGraph(MakeVector{dtype='float64'}(V0, <TensorType(float64, scalar)>, <TensorType(float64, scalar)>,..., <TensorType(float64, scalar)>, <TensorType(float64, scalar)>))")

The model works when the data has only 10 time steps, so I think the large dataset is causing the problem. Here is my model:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

daily = ...  # insert time series here; length 600
mu = pm.Normal("mu", mu=0.01, sigma=0.07)
mu_y = pm.Normal("mu_y", mu=0, sigma=1)
sig_y = pm.HalfNormal("sig_y", sigma=1)
lamb = pm.Beta("lambda", alpha=2, beta=40)
sig_v = pm.TruncatedNormal("sig_v", mu=0.5, sigma=0.2, lower=0)
rho_j = pm.TruncatedNormal("rho_j", mu=0, sigma=0.3, lower=-1, upper=1)
mu_v = pm.HalfNormal("mu_v", sigma=1)
rho = pm.TruncatedNormal("rho", mu=0, sigma=0.3, lower=-1, upper=1)

epsSigma = tt.stack([1.0, rho, rho, 1.0]).reshape((2, 2))
eps = [pm.MvNormal("eps"+str(i), mu=np.zeros(2), cov=epsSigma, shape=2) for i in range(len(daily))]
alphabeta = pm.MvNormal("alphabeta", mu=np.zeros(2), cov=np.eye(2), shape=2)

Zv = pm.Exponential("Zv", lam=mu_v, dims="date")

Zy_mu = mu_y + rho_j * Zv
Zy = pm.Normal('Zy', mu=Zy_mu, sigma=sig_y, dims='date')

J = pm.Bernoulli("J", p=lamb, dims="date")


# this is where the problems start



V = [i for i in range(len(daily))]
V[0] = pm.TruncatedNormal("V0", mu=alphabeta[0], sigma=1, lower=0)
for t in range(1, len(V)):
    V[t] = alphabeta[0] + alphabeta[1] * V[t-1]
    V[t] = V[t] + sig_v * eps[t][1]
    V[t] = V[t] * pm.math.sqrt(10**-8 + pm.math.maximum(V[t-1], 0)) 
    V[t] = V[t] + J[t] * Zv[t]
V = pm.Normal("V", mu=tt.stack(V), sigma=0.05, dims="date")

Y = [i for i in range(len(daily))]
Y[0] = pm.Normal("YO", mu=mu, sigma=0.1)
for t in range(1, len(Y)):
    Y[t] = mu + pm.math.sqrt(10**-8 + pm.math.maximum(V[t-1], 0)) * eps[t][1] + J[t] * Zy[t]
Y_obs = pm.Normal("Y_obs", mu=tt.stack(Y), sigma=0.05, dims="date", observed=daily_obs)

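Judging from this post, the problem seems to be the for loops used to define the length-600 vectors V and Y, but given the nature of the recurrence relations, it seems impossible to define them without loops (they are typeset in LaTeX for readability):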

[Image: the recurrence relations for V and Y, typeset in LaTeX]

Is there a better way to define these variables?


1 Answer
User
#1 · Posted 2024-09-24 22:20:07

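I no longer got the exception after making the following changes:

  1. Instead of building eps as a list of 1-D arrays, define it as a single 2-D array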
eps = pm.MvNormal("eps", mu=np.zeros(2), cov=epsSigma, shape=(len(daily),2))
  2. In the computation of V, replace the for loop with theano.scan
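import theano

def stepping(e, j, z, v0):
    # scan passes the current slice of each sequence first, then the previous output
    v1 = alphabeta[0] + alphabeta[1] * v0
    v1 = v1 + sig_v * e[1]
    v1 = v1 * pm.math.sqrt(1e-8 + pm.math.maximum(v0, 0))
    v1 = v1 + j * z
    return v1

V0 = pm.TruncatedNormal("V0", mu=alphabeta[0], sigma=1, lower=0)
result, updates = theano.scan(fn=stepping,
                sequences=[
                    dict(input=eps, taps=[0]),
                    dict(input=J, taps=[0]),
                    dict(input=Zv, taps=[0])  # Zv, matching the loop in the question (was Zy)
                ],
                outputs_info=V0)
# result holds one value per step; the initial state V0 is not included
V = result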
  3. Vectorize the computation of Y
Y0 = pm.Normal("YO", mu=mu, sigma=0.1)
Y = mu + pm.math.sqrt(1e-8 + pm.math.maximum(V, 0)[:-1]) * eps[1:,1] + J[1:] * Zy[1:]

Y0_obs = pm.Normal("Y0_obs", mu=Y0, sigma=0.05, dims="date", observed=daily_obs[0])
Y_obs = pm.Normal("Y_obs", mu=Y, sigma=0.05, dims="date", observed=daily_obs[1:])
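
To make steps 2 and 3 easier to check by hand, here is a minimal plain-Python sketch of the same pattern. It does not use Theano: `simple_scan` is a hypothetical helper that only mimics the argument order of the step function (sequence slices first, then the previous output), and all the numbers are made up to stand in for the model's random variables. Zv is used in the recurrence, as in the question's loop.

```python
import math


def simple_scan(fn, sequences, initial):
    """Plain-Python stand-in for the scan pattern (hypothetical helper,
    not Theano's API). Calls fn(*slices_at_step, previous_output) once per
    step and returns the list of outputs; the initial state is not included."""
    outputs = []
    prev = initial
    for items in zip(*sequences):
        prev = fn(*items, prev)
        outputs.append(prev)
    return outputs


# Made-up constants standing in for the random variables of the model.
alpha, beta, sig_v, mu = 0.1, 0.9, 0.5, 0.01
eps1 = [0.2, -0.1, 0.3, 0.0, -0.2]     # second component of eps at each step
J = [0, 1, 0, 0, 1]
Zv = [0.5, 0.4, 0.3, 0.2, 0.1]
Zy = [0.05, -0.02, 0.01, 0.03, -0.04]
V0 = 1.0


def stepping(e, j, z, v0):
    # Same arithmetic as the step function in the answer, on plain floats.
    v1 = alpha + beta * v0
    v1 = v1 + sig_v * e
    v1 = v1 * math.sqrt(1e-8 + max(v0, 0))
    return v1 + j * z


V = simple_scan(stepping, [eps1, J, Zv], V0)

# Explicit loop over t, written the way the question defines Y[t].
Y_loop = [
    mu + math.sqrt(1e-8 + max(V[t - 1], 0)) * eps1[t] + J[t] * Zy[t]
    for t in range(1, len(V))
]

# Sliced form: V[:-1] lines up element t-1 with element t of the other series.
Y_sliced = [
    mu + math.sqrt(1e-8 + max(v, 0)) * e + j * z
    for v, e, j, z in zip(V[:-1], eps1[1:], J[1:], Zy[1:])
]
```

The two Y lists agree element by element, which is the alignment the sliced expression in step 3 relies on. Note that the sliced result has one element fewer than V.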

Hope this works for you too.
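
Separately, the compiler note in the error message suggests raising the limit with -fbracket-depth=N. The sketch below shows one way that flag might be passed through the THEANO_FLAGS environment variable. It rests on several assumptions: the helper `theano_flags` is hypothetical, 1024 is an arbitrary value, and the option name depends on the installed version (`gcc.cxxflags` in classic Theano, `gcc__cxxflags` in Theano-PyMC 1.1 and later). This may only postpone the problem for longer series, so restructuring the model is likely the more robust fix.

```python
import os

# Assumption: the default limit is 256 (as the error message reports);
# 1024 is an arbitrary larger value chosen for illustration.
BRACKET_DEPTH = 1024


def theano_flags(depth, option="gcc.cxxflags"):
    """Build a THEANO_FLAGS value that passes -fbracket-depth to the C++ compiler.

    `option` is the config name for extra compiler flags; which spelling
    applies depends on the installed Theano version (see the text above).
    """
    return f"{option}=-fbracket-depth={depth}"


# This needs to run before `import theano` or `import pymc3`, because the
# flags are read from the environment when Theano is imported.
# Note: this overwrites any THEANO_FLAGS value that is already set.
os.environ["THEANO_FLAGS"] = theano_flags(BRACKET_DEPTH)
```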
