Poor solution using backpropagation

Posted 2024-10-04 11:26:08


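I'm running a small experiment in PyTorch: computing an affine transformation matrix from a given pair of images (an original image and a transformed image). For this example I use a small 5x5 grid, with a straight vertical line as the original image and a line tilted 45 degrees as the transformed output. For some reason the loss decreases and the gradients get smaller and smaller (as expected), but the solution it converges to is way off (it looks nothing like a straight line).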

import numpy as np
import matplotlib.pyplot as plt
import torch.optim as optim
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(989)

# source_image = torch.tensor([[0,1,0],[0,1,0],[0,1,0]])
source_image = torch.tensor([[0,0,1,0,0],[0,0,1,0,0],[0,0,1,0,0],[0,0,1,0,0],[0,0,1,0,0]])

plt.imshow(source_image)

# transformed_image = torch.eye(3)
transformed_image = torch.eye(5)

plt.imshow(transformed_image)

source_image = source_image.reshape(1, 1, source_image.shape[0], source_image.shape[1])
transformed_image = transformed_image.reshape(1, 1, transformed_image.shape[0], transformed_image.shape[1])
source_image = source_image.type(torch.FloatTensor)
transformed_image = transformed_image.type(torch.FloatTensor)

class AffineNet(nn.Module):
    def __init__(self):
        super(AffineNet, self).__init__()
        self.M = torch.nn.Parameter(torch.randn(1, 2, 3))
    def forward(self, im):
        # Build a sampling grid from the learnable 2x3 matrix, then resample the input image with it.
        flow_grid = F.affine_grid(self.M, im.size())
        transformed_flow_image = F.grid_sample(im, flow_grid, padding_mode="border")
        return transformed_flow_image

affineNet = AffineNet()
optimizer = optim.SGD(affineNet.parameters(), lr=0.01)
criterion = nn.MSELoss()

for i in range(1000):
    optimizer.zero_grad()
    output = affineNet(transformed_image)
    loss = criterion(output, source_image)
    loss.backward()
    if(i%10==0):
        print(i, loss.item(), affineNet.M.grad)
    optimizer.step()

print(affineNet.M)

printme = output.detach().reshape(output.shape[2], output.shape[3])
plt.imshow(printme.cpu())
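
As a side check on the convention the code relies on, the sketch below feeds an identity matrix through `F.affine_grid` and `F.grid_sample` and confirms that the image comes back unchanged. The 2x3 matrix acts on normalized output coordinates in [-1, 1] and tells `grid_sample` where to read from in the input. Passing `align_corners=False` explicitly (the default in recent PyTorch versions) and the variable names `image`, `identity`, `resampled` and `identity_ok` are my assumptions for illustration; they are not part of the code above.

```python
import torch
import torch.nn.functional as F

# Same diagonal-line image as in the question, shaped (N, C, H, W).
image = torch.eye(5).reshape(1, 1, 5, 5)

# Identity affine matrix: each output pixel samples the same input location.
identity = torch.tensor([[[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0]]])

# The grid has shape (N, H, W, 2) and holds normalized (x, y) sampling coordinates.
grid = F.affine_grid(identity, image.size(), align_corners=False)
resampled = F.grid_sample(image, grid, padding_mode="border", align_corners=False)

# With the identity matrix the resampled image should match the input up to float error.
identity_ok = torch.allclose(resampled, image, atol=1e-5)
print(identity_ok)
```

Comparing the learned `affineNet.M` against this identity baseline may help in judging how far the optimizer has actually moved from a "do nothing" transform.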

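If you swap in the commented-out lines to use a 3x3 grid instead of the 5x5 grid, it does seem to work well. Can someone help me understand why this happens? Playing with the seed also seems to make a big difference.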
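
One way to probe the seed sensitivity described above is to rerun the same optimization from several random initializations and compare the last loss seen. This is only a rough sketch under assumptions: the helper `fit_affine`, the seeds 0 and 1, the shortened 200-step budget and the explicit `align_corners=False` are mine, chosen for illustration; the 5x5 images, SGD with `lr=0.01`, MSE loss and seed 989 follow the code in the question.

```python
import torch
import torch.nn.functional as F

def fit_affine(source, target, seed, steps=1000, lr=0.01):
    """Fit a 2x3 matrix that warps `target` towards `source`; return the last loss seen."""
    torch.manual_seed(seed)
    M = torch.nn.Parameter(torch.randn(1, 2, 3))  # random start, as in the question
    optimizer = torch.optim.SGD([M], lr=lr)
    loss = None
    for _ in range(steps):
        optimizer.zero_grad()
        grid = F.affine_grid(M, target.size(), align_corners=False)
        warped = F.grid_sample(target, grid, padding_mode="border", align_corners=False)
        loss = F.mse_loss(warped, source)
        loss.backward()
        optimizer.step()
    return loss.item()

# Vertical line in the middle column (source) and a diagonal line (target).
source = torch.zeros(1, 1, 5, 5)
source[..., 2] = 1.0
target = torch.eye(5).reshape(1, 1, 5, 5)

# Hypothetical seeds; only 989 comes from the question.
final_losses = {seed: fit_affine(source, target, seed, steps=200) for seed in (0, 1, 989)}
for seed, value in final_losses.items():
    print(seed, value)
```

If the printed losses differ a lot between seeds, that would be consistent with the optimizer settling into different local minima depending on where `M` starts, though this sketch alone does not prove that is the cause.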

