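I am experimenting with PyTorch to compute an affine transformation matrix from a given pair of images (the original image and the transformed image). For this example I use a small 5x5 grid, with a vertical straight line as the original image and a line tilted by 45 degrees as the transformed output. The loss does decrease and the gradients get smaller and smaller, as expected, but the solution it converges to seems quite far off: the output looks nothing like a straight line.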
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(989)  # the result changes noticeably with the seed

# 3x3 variant:
# source_image = torch.tensor([[0,1,0],[0,1,0],[0,1,0]])
source_image = torch.tensor([[0,0,1,0,0],[0,0,1,0,0],[0,0,1,0,0],[0,0,1,0,0],[0,0,1,0,0]])
plt.figure()
plt.imshow(source_image)

# 3x3 variant:
# transformed_image = torch.eye(3)
transformed_image = torch.eye(5)
plt.figure()
plt.imshow(transformed_image)

# Reshape to (N, C, H, W) and convert to float, as grid_sample expects.
source_image = source_image.reshape(1, 1, source_image.shape[0], source_image.shape[1])
transformed_image = transformed_image.reshape(1, 1, transformed_image.shape[0], transformed_image.shape[1])
source_image = source_image.float()
transformed_image = transformed_image.float()


class AffineNet(nn.Module):
    def __init__(self):
        super().__init__()
        # The only learnable parameter: a randomly initialised 2x3 affine matrix.
        self.M = torch.nn.Parameter(torch.randn(1, 2, 3))

    def forward(self, im):
        # Use the input that was passed in rather than the global tensor.
        # align_corners=False is the current default; it is spelled out to avoid the warning.
        flow_grid = F.affine_grid(self.M, im.size(), align_corners=False)
        transformed_flow_image = F.grid_sample(im, flow_grid, padding_mode="border", align_corners=False)
        return transformed_flow_image


affineNet = AffineNet()
optimizer = optim.SGD(affineNet.parameters(), lr=0.01)
criterion = nn.MSELoss()

for i in range(1000):
    optimizer.zero_grad()
    # Warp the diagonal image and compare it with the vertical line.
    output = affineNet(transformed_image)
    loss = criterion(output, source_image)
    loss.backward()
    if i % 10 == 0:
        print(i, loss.item(), affineNet.M.grad)
    optimizer.step()

print(affineNet.M)

printme = output.detach().reshape(output.shape[2], output.shape[3])
plt.figure()
plt.imshow(printme.cpu())
plt.show()
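One way to sanity-check the setup independently of the optimizer is to feed hand-picked matrices through the same `F.affine_grid` / `F.grid_sample` pair. The sketch below is not part of the original experiment: the `warp` helper, the identity matrix, and the shear matrix are illustrative assumptions, and the shear is just one candidate that maps the diagonal onto the middle column under `align_corners=False`. It also tries `padding_mode="zeros"` next to the `"border"` mode used above, since clamping out-of-range samples to edge pixels can change the loss for the same matrix.

```python
import torch
import torch.nn.functional as F

# Rebuild the 5x5 images from the experiment in (N, C, H, W) layout.
source_image = torch.zeros(5, 5)
source_image[:, 2] = 1.0                      # vertical line in the middle column
source_image = source_image.reshape(1, 1, 5, 5)
transformed_image = torch.eye(5).reshape(1, 1, 5, 5)  # diagonal line


def warp(image, theta, padding_mode):
    """Hypothetical helper: resample `image` with a fixed 1x2x3 affine matrix."""
    grid = F.affine_grid(theta, image.size(), align_corners=False)
    return F.grid_sample(image, grid, padding_mode=padding_mode, align_corners=False)


# Hand-picked matrices in normalized [-1, 1] coordinates (assumptions for illustration).
identity = torch.tensor([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
shear = torch.tensor([[[1.0, 1.0, 0.0], [0.0, 1.0, 0.0]]])  # x' = x + y, y' = y

# The identity matrix should hand back the input unchanged.
identity_out = warp(transformed_image, identity, "border")
print("identity reproduces input:", torch.allclose(identity_out, transformed_image, atol=1e-5))

# Same shear matrix, two padding modes, compared against the vertical line.
zeros_loss = F.mse_loss(warp(transformed_image, shear, "zeros"), source_image).item()
border_loss = F.mse_loss(warp(transformed_image, shear, "border"), source_image).item()
print("shear loss with zeros padding:", zeros_loss)
print("shear loss with border padding:", border_loss)
```

If the two losses differ, the matrix that would be ideal geometrically is not necessarily a zero-loss point of the objective being optimized, which may be worth keeping in mind when judging where SGD ends up.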
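If you swap in the commented-out lines in the code above (the 3x3 versions of source_image and transformed_image) so that a 3x3 grid is used instead of the 5x5 one, it does seem to work fine. Can someone help me understand why this happens? Changing the random seed also seems to make a big difference.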
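The seed sensitivity mentioned above can be measured rather than eyeballed. The following is a minimal sketch, assuming the same images, loss, optimizer, and learning rate as the experiment; the function name `fit_affine`, its parameters, and the choice of seeds 0 to 4 with 200 steps are hypothetical and only there to keep the run short.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim


def fit_affine(seed, size=5, steps=1000, lr=0.01):
    """Hypothetical helper: fit a 2x3 affine matrix from a seeded random start.

    Returns the last computed loss and the fitted matrix.
    """
    torch.manual_seed(seed)
    source = torch.zeros(size, size)
    source[:, size // 2] = 1.0                     # vertical line in the middle column
    source = source.reshape(1, 1, size, size)
    target = torch.eye(size).reshape(1, 1, size, size)  # diagonal line

    theta = nn.Parameter(torch.randn(1, 2, 3))     # random initial affine matrix
    optimizer = optim.SGD([theta], lr=lr)
    criterion = nn.MSELoss()
    for _ in range(steps):
        optimizer.zero_grad()
        grid = F.affine_grid(theta, target.size(), align_corners=False)
        output = F.grid_sample(target, grid, padding_mode="border", align_corners=False)
        loss = criterion(output, source)
        loss.backward()
        optimizer.step()
    return loss.item(), theta.detach().clone()


# Compare where a handful of seeds end up on the 5x5 grid.
final_losses = {seed: fit_affine(seed, steps=200)[0] for seed in range(5)}
for seed, final_loss in final_losses.items():
    print(f"seed={seed} final_loss={final_loss:.4f}")
```

Calling `fit_affine(seed, size=3)` runs the same loop on the 3x3 grid, so the two grid sizes can be compared seed by seed. A wide spread of final losses across seeds would suggest that the outcome depends strongly on the random initial matrix rather than on the training loop itself.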