<p>PyTorch recently added a functional API to <code>torch.autograd</code>, which provides <code>torch.autograd.functional.hessian(func, inputs, ...)</code> to directly evaluate the Hessian of a scalar function <code>func</code> with respect to its arguments <strong>at a location specified by <code>inputs</code></strong>, a tuple of tensors corresponding to the arguments of <code>func</code>. I believe <code>hessian</code> itself does not support automatic differentiation.</p>
<p>Note, however, that as of March 2021 it is still in beta.</p>
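<p>As a minimal sketch of what that looks like (the toy function <code>f</code> and input <code>x</code> below are made up for illustration; only the <code>hessian</code> call itself is the actual API):</p>
<pre class="lang-py prettyprint-override"><code>import torch

def f(x):
    return (x ** 3).sum()  # any scalar-valued function of its tensor arguments

x = torch.tensor([1.0, 2.0])  # the location at which to evaluate the Hessian
H = torch.autograd.functional.hessian(f, x)
print(H)
#> tensor([[ 6.,  0.],
#>         [ 0., 12.]])  # d^2/dx_i^2 of sum(x**3) is 6*x_i on the diagonal
</code></pre>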
<hr/>
<p>A complete example that uses <code>torch.autograd.functional.hessian</code> to create a <a href="https://en.wikipedia.org/wiki/Score_test" rel="nofollow noreferrer">score test</a> for a non-zero mean (as a (bad) alternative to the one-sample t-test):</p>
<pre class="lang-py prettyprint-override"><code>import numpy as np
import torch, torchvision
from torch.autograd import Variable, grad
import torch.distributions as td
import math
from torch.optim import Adam
import scipy.stats
x_data = torch.randn(100)+0.0 # observed data (here sampled under H0)
N = x_data.shape[0] # number of observations
mu_null = torch.zeros(1)
sigma_null_hat = torch.ones(1, requires_grad=True) # sigma is the only free parameter under H0 (mu is fixed at zero)
def log_lik(mu, sigma):
    return td.Normal(loc=mu, scale=sigma).log_prob(x_data).sum()
# Find theta_null_hat by some gradient descent algorithm (in this case a closed-form expression would be trivial to obtain (see below)):
opt = Adam([sigma_null_hat], lr=0.01)
for epoch in range(2000):
    opt.zero_grad() # reset the gradient accumulators of the optimizer's parameters
    loss = -log_lik(mu_null, sigma_null_hat) # negative log likelihood at the current sigma_null_hat (= forward pass)
    loss.backward() # compute gradients (= backward pass)
    opt.step() # update sigma_null_hat
print(f'parameter fitted under null: sigma: {sigma_null_hat}, expected: {torch.sqrt((x_data**2).mean())}')
#> parameter fitted under null: sigma: tensor([0.9260], requires_grad=True), expected: 0.9259940385818481
theta_null_hat = (mu_null, sigma_null_hat)
U = torch.cat(torch.autograd.functional.jacobian(log_lik, theta_null_hat)) # score = vector of partial derivatives of the log likelihood w.r.t. the parameters (of the full/alternative model)
hess = torch.autograd.functional.hessian(log_lik, theta_null_hat) # nested tuple of tensors, one block per pair of parameters
I = -torch.stack([torch.cat([h.reshape(-1) for h in row]) for row in hess]) / N # estimate of the Fisher information matrix
S = U @ torch.inverse(I) @ U / N # test statistic, often named "LM" (as in Lagrange multiplier); would be zero at the unrestricted maximum likelihood estimate
pval_score_test = 1 - scipy.stats.chi2(df=1).cdf(S) # S asymptotically follows a chi^2 distribution with degrees of freedom equal to the number of parameters fixed under H0 (here: 1)
print(f'p-value Chi^2-based score test: {pval_score_test}')
#> p-value Chi^2-based score test: 0.9203232752568568

# comparison with Student's t-test:
pval_t_test = scipy.stats.ttest_1samp(x_data, popmean=0).pvalue
print(f"p-value Student's t-test: {pval_t_test}")
#> p-value Student's t-test: 0.9209265268946605
</code></pre>
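<p>As a quick sanity check on the estimate <code>I</code>: for a normal model in the (mu, sigma) parameterization, the per-observation Fisher information is diag(1/sigma^2, 2/sigma^2), so a short sketch to compare the two (the diagonals should roughly agree, and the off-diagonals should be near zero):</p>
<pre class="lang-py prettyprint-override"><code>I_expected = torch.diag(torch.tensor([1.0, 2.0])) / sigma_null_hat.detach()**2
print(I)           # observed information estimated via the Hessian
print(I_expected)  # analytic Fisher information at the fitted sigma
</code></pre>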