Implementing naive gradient descent in Python

y = lambda x : x**2
dy_dx = lambda x : 2*x
def gradient_descent(function,derivative,initial_guess):
    optimum = initial_guess
    while derivative(optimum) != 0:
        optimum = optimum - derivative(optimum)
    else:
        return optimum
gradient_descent(y,dy_dx,5)

import matplotlib.pyplot as plt
def stepGradient(x,y, step):
    b_current = 0
    m_current = 0
    b_gradient = 0
    m_gradient = 0
    N = int(len(x))
    for i in range(0, N):
        b_gradient += -(1/N) * (y[i] - ((m_current*x[i]) + b_current))
        m_gradient += -(1/N) * x[i] * (y[i] - ((m_current * x[i]) + b_current))
    while abs(b_gradient) > 0.01 and abs(m_gradient) > 0.01:
        b_current = b_current - (step * b_gradient)
        m_current = m_current - (step * m_gradient)
        for i in range(0, N):
            b_gradient += -(1/N) * (y[i] - ((m_current*x[i]) + b_current))
            m_gradient += -(1/N) * x[i] * (y[i] - ((m_current * x[i]) + b_current))
    return [b_current, m_current]
x = [1,2, 2,3,4,5,7,8]
y = [1.5,3,1,3,2,5,6,7]
step = 0.00001
(b,m) = stepGradient(x,y,step)
plt.scatter(x,y)
abline_values = [m * i + b for i in x]
plt.plot(x, abline_values, 'b')
plt.show()

import matplotlib.pyplot as plt
def stepGradient(x,y):
    step = 0.001
    b_current = 0
    m_current = 0
    b_gradient = 0
    m_gradient = 0
    N = int(len(x))
    for i in range(0, N):
        b_gradient += -(1/N) * (y[i] - ((m_current*x[i]) + b_current))
        m_gradient += -(1/N) * x[i] * (y[i] - ((m_current * x[i]) + b_current))
    while abs(b_gradient) > 0.01 or abs(m_gradient) > 0.01:
        b_current = b_current - (step * b_gradient)
        m_current = m_current - (step * m_gradient)
        b_gradient= 0
        m_gradient = 0
        for i in range(0, N):
            b_gradient += -(1/N) * (y[i] - ((m_current*x[i]) + b_current))
            m_gradient += -(1/N) * x[i] * (y[i] - ((m_current * x[i]) + b_current))
    return [b_current, m_current]
x = [1,2, 2,3,4,5,7,8,10]
y = [1.5,3,1,3,2,5,6,7,20]
(b,m) = stepGradient(x,y)
plt.scatter(x,y)
abline_values = [m * i + b for i in x]
plt.plot(x, abline_values, 'b')
plt.show()

2 Answers

User

#1 · Edited 2024-06-26 13:55:35

You also need to reduce the step size (gamma in the gradient descent formula):

y = lambda x : x**2
dy_dx = lambda x : 2*x
def gradient_descent(function,derivative,initial_guess):
    optimum = initial_guess
    while abs(derivative(optimum)) > 0.01:
        optimum = optimum - 0.01*derivative(optimum)
        print((optimum,derivative(optimum)))
    else:
        return optimum

User

#2 · Edited 2024-06-26 13:55:35

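Your while loop stops only when the computed floating-point value is exactly zero. That is naive, because floating-point values are rarely computed exactly. Instead, stop the loop when the computed value is close enough to zero. Use something like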

while abs(derivative(optimum)) > eps:

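where eps is the desired precision of the computed value. It could be another parameter, perhaps with a default value of 1e-10 or something similar.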
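As a rough sketch of that suggestion, the loop could look something like the following. The `step` and `max_iter` parameters and their defaults are my own assumptions, not part of the question's code; `eps` defaults to 1e-10 as suggested above.

```python
def gradient_descent(derivative, initial_guess, step=0.1, eps=1e-10, max_iter=10_000):
    """Fixed-step gradient descent that stops once |derivative| <= eps.

    step and max_iter are illustrative assumptions; eps is the tolerance
    on the derivative.
    """
    optimum = initial_guess
    for _ in range(max_iter):
        slope = derivative(optimum)
        # Compare against a tolerance instead of testing for exactly zero.
        if abs(slope) <= eps:
            break
        optimum = optimum - step * slope
    return optimum


dy_dx = lambda x: 2 * x
result = gradient_descent(dy_dx, 5)
print(result, dy_dx(result))
```

The iteration cap is only there so the loop still ends if the tolerance is never reached.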

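That said, your problem is more serious than that. Your algorithm is too naive in assuming that

optimum = optimum - derivative(optimum)

will move the value of optimum closer to the actual optimum. In your particular case, the variable optimum just cycles back and forth between 5 (your initial guess) and -5. Note that the derivative at 5 is 10, while the derivative at -5 is -10.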
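To see the cycle concretely, here is a short trace of the update from the question, run for a fixed number of iterations (the count of 6 is arbitrary) instead of looping forever:

```python
dy_dx = lambda x: 2 * x

optimum = 5
trace = [optimum]
for _ in range(6):
    # Same update as in the question: no step-size factor.
    optimum = optimum - dy_dx(optimum)
    trace.append(optimum)

print(trace)  # prints [5, -5, 5, -5, 5, -5, 5]
```

The derivative never gets any closer to zero, so the original `!= 0` test can never end the loop.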

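So you need to avoid cycling. You could multiply your delta, derivative(optimum), by a value less than 1, which would work in your particular case. But that will not work in general.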
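A small illustration of why a fixed factor is not a general fix. The helper `run` and the second function, y = 2*x**2, are made up for this example. The same factor of 0.5 that solves y = x**2 in one step makes the steeper function cycle just like before:

```python
def run(derivative, start, factor, iterations=6):
    """Apply x <- x - factor * derivative(x) and record every iterate."""
    x = start
    trace = [x]
    for _ in range(iterations):
        x = x - factor * derivative(x)
        trace.append(x)
    return trace


# y = x**2, derivative 2*x: the first step lands on 0 and stays there.
print(run(lambda x: 2 * x, 5, 0.5))

# y = 2*x**2, derivative 4*x: the iterates alternate between 5 and -5.
print(run(lambda x: 4 * x, 5, 0.5))
```

Whether a given factor is small enough depends on how steep the function is, which you usually do not know in advance.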

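To be safe, "bracket" your optimum point with a smaller value and a larger value, then use the derivative to find the next guess. But make sure your next guess does not fall outside the bracketed interval. If it does, or if your guesses converge too slowly, fall back on another method such as bisection or golden-section search.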
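One possible way to sketch the bracketing idea is shown below. This is only an illustration under assumptions: the name `bracketed_descent`, the `step`, `eps` and `max_iter` parameters, and the choice of plain bisection as the fallback are all mine, and it assumes a single minimum with the derivative negative at `lo` and positive at `hi`. It does not handle the "converges too slowly" case or golden-section search.

```python
def bracketed_descent(derivative, lo, hi, step=1.0, eps=1e-10, max_iter=200):
    """Gradient steps kept inside a bracket [lo, hi], with bisection as fallback.

    Hypothetical helper: assumes derivative(lo) < 0 < derivative(hi).
    """
    if not (derivative(lo) < 0 < derivative(hi)):
        raise ValueError("derivative must be negative at lo and positive at hi")
    x = 0.5 * (lo + hi)
    for _ in range(max_iter):
        slope = derivative(x)
        if abs(slope) <= eps:
            break
        # Shrink the bracket: the minimum lies on the downhill side of x.
        if slope > 0:
            hi = x
        else:
            lo = x
        candidate = x - step * slope
        # If the gradient step leaves the bracket, bisect instead.
        if not (lo < candidate < hi):
            candidate = 0.5 * (lo + hi)
        x = candidate
    return x


# With step=1.0 the unguarded update cycles on y = x**2; the bracket stops that.
print(bracketed_descent(lambda x: 2 * x, -3, 5))
```

Because the bracket only ever shrinks, the guess can no longer bounce between two points on opposite sides of the minimum.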

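Of course, this means your "very naive gradient descent" algorithm is too naive to work in general. That is why real optimization routines are more complicated.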
