Using scipy.optimize.curve_fit with weights

import numpy as np

n = 200
x = np.linspace(1, 20, n)
x0, A, alpha = 12, 3, 3

def f(x, x0, A, alpha):
    return A * np.exp(-((x-x0)/alpha)**2)

noise_sigma = x/20
noise = np.random.randn(n) * noise_sigma
yexact = f(x, x0, A, alpha)
y = yexact + noise

In [249]: pcov
Out[249]:
array([[  1.10205238e-02,  -3.91494024e-08,   8.81822412e-08],
       [ -3.91494024e-08,   1.52660426e-02,  -1.05907265e-02],
       [  8.81822412e-08,  -1.05907265e-02,   2.20414887e-02]])

In [250]: pcov2
Out[250]:
array([[ 0.26584674, -0.01836064, -0.17867193],
       [-0.01836064,  0.27833   , -0.1459469 ],
       [-0.17867193, -0.1459469 ,  0.38659059]])

1 answer

User

#1 · Posted 2024-06-28 20:18:41

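At least as of scipy version 1.1.0, the sigma parameter should equal the error (standard deviation) on each data point, not a weight. Specifically, the documentation says: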

A 1-d sigma should contain values of standard deviations of errors in ydata. In this case, the optimized function is chisq = sum((r / sigma) ** 2).

In your case that would be:

curve_fit(f, x, y, p0, sigma=noise_sigma, absolute_sigma=True)

I looked through the source code and verified that when you specify sigma this way, curve_fit minimizes the sum of ((f-data)/sigma)**2.

As a side note, this is generally what you want to minimize when you know the errors. The likelihood of observing the points data given the model is:

L(data|x0,A,alpha) = product over i Gaus(data_i, mean=f(x_i,x0,A,alpha), sigma=sigma_i)

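Taking the negative log gives (up to an additive constant and an overall factor of 1/2, neither of which depends on the parameters):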

-log(L) = sum over i (f(x_i,x0,A,alpha)-data_i)**2/(sigma_i**2)

which is just the chi-square.
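A quick numerical check of this step, as a minimal sketch. It assumes the same model f and noise profile sigma = x/20 as in the question and uses scipy.stats.norm.logpdf for the exact Gaussian log-likelihood. The names neg_log_likelihood, chi2 and trial_params are illustrative and not part of the original post. If the argument holds, -log(L) minus chi2/2 should be the same number for every choice of parameters.

```python
import numpy as np
from scipy.stats import norm

def f(x, x0, A, alpha):
    return A * np.exp(-((x - x0) / alpha) ** 2)

# Synthetic data with the same shape as in the question (assumed setup)
rng = np.random.default_rng(0)
x = np.linspace(1, 20, 200)
sigma = x / 20
data = f(x, 12, 3, 3) + rng.standard_normal(x.size) * sigma

def neg_log_likelihood(params):
    # Exact Gaussian -log(L), including the normalisation terms
    return -np.sum(norm.logpdf(data, loc=f(x, *params), scale=sigma))

def chi2(params):
    return np.sum(((f(x, *params) - data) / sigma) ** 2)

# Arbitrary parameter sets, chosen only for illustration
trial_params = [(12, 3, 3), (10, 4, 2), (11.5, 2.5, 3.5)]

# -log(L) - chi2/2 should not depend on the parameters
offsets = [neg_log_likelihood(p) - 0.5 * chi2(p) for p in trial_params]

# The constant is the sum of the log normalisation factors
expected_offset = np.sum(np.log(sigma * np.sqrt(2 * np.pi)))

print("offsets        :", offsets)
print("expected offset:", expected_offset)
```

Because the offset depends only on sigma, minimizing chi2 and maximizing the likelihood select the same parameters.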

I wrote a test program to verify that curve_fit does return the correct values when sigma is specified correctly:

from __future__ import print_function
import numpy as np
from scipy.optimize import curve_fit, fmin

np.random.seed(0)

def make_chi2(x, data, sigma):
    def chi2(args):
        x0, A, alpha = args
        return np.sum(((f(x,x0,A,alpha)-data)/sigma)**2)
    return chi2

n = 200
x = np.linspace(1, 20, n)
x0, A, alpha = 12, 3, 3

def f(x, x0, A, alpha):
    return A * np.exp(-((x-x0)/alpha)**2)

noise_sigma = x/20
noise = np.random.randn(n) * noise_sigma
yexact = f(x, x0, A, alpha)
y = yexact + noise

p0 = 10, 4, 2

# curve_fit without parameters (sigma is implicitly equal to one)
popt, pcov = curve_fit(f, x, y, p0)
# curve_fit with wrong sigma specified
popt2, pcov2 = curve_fit(f, x, y, p0, sigma=1/noise_sigma**2, absolute_sigma=True)
# curve_fit with correct sigma
popt3, pcov3 = curve_fit(f, x, y, p0, sigma=noise_sigma, absolute_sigma=True)

chi2 = make_chi2(x,y,noise_sigma)

# double checking that we get the correct answer
xopt = fmin(chi2,p0,xtol=1e-10,ftol=1e-10)

print("popt  = %s, chi2 = %.2f" % (popt,chi2(popt)))
print("popt2 = %s, chi2 = %.2f" % (popt2, chi2(popt2)))
print("popt3 = %s, chi2 = %.2f" % (popt3, chi2(popt3)))
print("xopt  = %s, chi2 = %.2f" % (xopt, chi2(xopt)))

Which outputs:

popt  = [ 11.93617403   3.30528488   2.86314641], chi2 = 200.66
popt2 = [ 11.94169083   3.30372955   2.86207253], chi2 = 200.64
popt3 = [ 11.93128545   3.333727     2.81403324], chi2 = 200.44
xopt  = [ 11.93128603   3.33373094   2.81402741], chi2 = 200.44

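As you can see, chi2 is indeed minimized correctly when you pass the true standard deviations (sigma=noise_sigma) to curve_fit: popt3 matches the result of the independent fmin minimization.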
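Once sigma holds the true standard deviations and absolute_sigma=True, the returned covariance matrix can be read directly as parameter uncertainties. The following is a sketch under the same setup as the test program; perr, dof and reduced_chi2 are illustrative names that do not appear in the original answer.

```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, x0, A, alpha):
    return A * np.exp(-((x - x0) / alpha) ** 2)

# Same synthetic data as in the test program
np.random.seed(0)
n = 200
x = np.linspace(1, 20, n)
noise_sigma = x / 20
y = f(x, 12, 3, 3) + np.random.randn(n) * noise_sigma

popt, pcov = curve_fit(f, x, y, p0=(10, 4, 2),
                       sigma=noise_sigma, absolute_sigma=True)

# One-sigma uncertainties are the square roots of the diagonal of pcov
perr = np.sqrt(np.diag(pcov))

# Reduced chi-square: close to 1 when the model and the errors are right
dof = n - len(popt)
reduced_chi2 = np.sum(((f(x, *popt) - y) / noise_sigma) ** 2) / dof

for name, value, err in zip(("x0", "A", "alpha"), popt, perr):
    print("%-5s = %.3f +/- %.3f" % (name, value, err))
print("reduced chi2 = %.3f" % reduced_chi2)
```

A reduced chi-square far from 1 would suggest that either the model or the assumed errors are off.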

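As to why the improvement isn't "better", I'm not really sure. My only guess is that without a sigma value you implicitly assume the errors are all equal, and over the part of the data where the fit matters (the peak) the errors are "approximately" equal.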

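To answer your second question: no, the sigma option is not only used to change the output covariance matrix; it actually changes what is being minimized.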
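The two roles can be separated with a small experiment. This is a sketch on the same synthetic data, with illustrative variable names such as popt_unw, popt_abs and popt_rel. Passing sigma moves the best-fit parameters, whereas toggling absolute_sigma leaves them alone and, as the curve_fit documentation describes, only rescales the covariance so that the reduced chi-square equals one.

```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, x0, A, alpha):
    return A * np.exp(-((x - x0) / alpha) ** 2)

np.random.seed(0)
n = 200
x = np.linspace(1, 20, n)
noise_sigma = x / 20
y = f(x, 12, 3, 3) + np.random.randn(n) * noise_sigma
p0 = (10, 4, 2)

# No sigma: every point gets the same weight
popt_unw, pcov_unw = curve_fit(f, x, y, p0)
# Same sigma, two settings of absolute_sigma
popt_abs, pcov_abs = curve_fit(f, x, y, p0, sigma=noise_sigma, absolute_sigma=True)
popt_rel, pcov_rel = curve_fit(f, x, y, p0, sigma=noise_sigma, absolute_sigma=False)

# sigma changes the objective, so the optimum moves
print("weighted - unweighted popt:", popt_abs - popt_unw)

# absolute_sigma does not move the optimum ...
print("same popt:", np.allclose(popt_abs, popt_rel))

# ... it only rescales pcov by the reduced chi-square at the optimum
reduced_chi2 = np.sum(((f(x, *popt_abs) - y) / noise_sigma) ** 2) / (n - len(p0))
print("pcov_rel == pcov_abs * reduced chi2:",
      np.allclose(pcov_rel, pcov_abs * reduced_chi2))
```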
