How to replace infs to avoid nan gradients in PyTorch

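User

Reply 1 · edited 2024-10-04 01:29:00

One workaround I found is to implement the Log1PlusExp function by hand, together with its backward counterpart. It does not, however, explain the bad behaviour of torch.where described in the question.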

>>> import torch
>>> class Log1PlusExp(torch.autograd.Function):
...     """Implementation of x ↦ log(1 + exp(x))."""
...     @staticmethod
...     def forward(ctx, x):
...         exp = x.exp()
...         ctx.save_for_backward(x)
...         return x.where(torch.isinf(exp), exp.log1p())
...     @staticmethod
...     def backward(ctx, grad_output):
...         x, = ctx.saved_tensors
...         return grad_output / (1 + (-x).exp())
... 
>>> log_1_plus_exp = Log1PlusExp.apply
>>> x = torch.tensor([0., 1., 100.], requires_grad=True)
>>> log_1_plus_exp(x)  # No infs
tensor([  0.6931,   1.3133, 100.0000], grad_fn=<Log1PlusExpBackward>)
>>> log_1_plus_exp(x).sum().backward()
>>> x.grad  # And no nans!
tensor([0.5000, 0.7311, 1.0000])
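
For context on the torch.where behaviour mentioned in this reply, here is a minimal sketch of how a nan gradient can arise and one way to avoid it without a custom Function. It is not taken from the original question: the cutoff of 20 and the variable names are assumptions chosen for illustration. The idea is to replace the large inputs before calling exp, so that no inf is ever created.

```python
import torch

x = torch.tensor([0.0, 1.0, 100.0], requires_grad=True)

# Naive version: exp(100) overflows to inf in float32. torch.where hides the
# inf in the forward pass, but in the backward pass the zero gradient routed
# to the unselected branch is multiplied by inf inside exp's derivative,
# and 0 * inf is nan.
naive = torch.where(x > 20, x, torch.log1p(torch.exp(x)))
naive.sum().backward()
naive_grad = x.grad.clone()

# Sanitised version: swap the large inputs for a harmless value (0 here,
# an arbitrary choice) before exp, then select the real result afterwards.
x.grad = None
safe_x = torch.where(x > 20, torch.zeros_like(x), x)
safe = torch.where(x > 20, x, torch.log1p(torch.exp(safe_x)))
safe.sum().backward()
safe_grad = x.grad.clone()

print(naive_grad)  # last entry is nan
print(safe_grad)   # all entries finite
```

The second torch.where still picks x for the large entries, so the forward values are unchanged; only the intermediate inf is gone.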

User

Reply 2 · edited 2024-10-04 01:29:00

For x >= 20, the function's output is approximately x. Use PyTorch's softplus (torch.nn.functional.softplus), which should help with this problem.
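
A minimal sketch of this suggestion, assuming the function in question is log(1 + exp(x)) and using the same example input as Reply 1. With its default settings (beta=1, threshold=20), torch.nn.functional.softplus returns x directly above the threshold, so the exponential is never evaluated for large inputs.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([0.0, 1.0, 100.0], requires_grad=True)

# softplus(x) = log(1 + exp(x)); for x above `threshold` (default 20)
# PyTorch falls back to the linear function x, avoiding overflow.
y = F.softplus(x)
y.sum().backward()

print(y)       # finite values, last entry equal to 100
print(x.grad)  # finite gradients, no nan
```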

User

Reply 3 · edited 2024-10-04 01:29:00

"But for too large x, it outputs inf because of the exponentiation"

That is why x should never be too large. Ideally it should lie in the range [-1, 1]. If it does not, you should normalize your inputs.
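
One possible reading of "normalize" here is a linear rescale to [-1, 1]. The sketch below assumes that reading; scale_to_unit_range is a hypothetical helper, not a PyTorch function. Note that rescaling changes the values fed to the function, so it only fits when the surrounding model can tolerate that.

```python
import torch

def scale_to_unit_range(x: torch.Tensor) -> torch.Tensor:
    """Linearly rescale x so its values lie in [-1, 1] (hypothetical helper)."""
    lo, hi = x.min(), x.max()
    # Guard against a constant tensor, where hi == lo would divide by zero.
    span = torch.clamp(hi - lo, min=1e-12)
    return 2 * (x - lo) / span - 1

raw = torch.tensor([0.0, 1.0, 100.0])
scaled = scale_to_unit_range(raw)

# With inputs confined to [-1, 1], exp cannot overflow.
out = torch.log1p(torch.exp(scaled))
print(scaled)
print(out)
```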
