Expected a 4D tensor as input, got a 2D tensor instead

Posted 2024-10-01 17:29:22


I'm trying to build a neural network on top of the pretrained VGG16 network in PyTorch.

I know I need to adapt the classifier part of the network, so I have frozen the parameters to prevent backpropagation through them.

Code:

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt
import numpy as np
import time

import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torch.autograd import Variable
from torchvision import datasets, transforms
import torchvision.models as models
from collections import OrderedDict

data_dir = 'flowers'
train_dir = data_dir + '/train'
valid_dir = data_dir + '/valid'
test_dir = data_dir + '/test'


train_transforms = transforms.Compose([transforms.Resize(224),
                                       transforms.RandomRotation(30),
                                       transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor(),
                                       transforms.Normalize(mean=[0.485, 0.456, 0.406], 
                                                            std=[0.229, 0.224, 0.225])])



validn_transforms = transforms.Compose([transforms.Resize(224),
                                        transforms.CenterCrop(224),
                                        transforms.ToTensor(),
                                        transforms.Normalize((0.485, 0.456, 0.406), 
                                                            (0.229, 0.224, 0.225))])

test_transforms = transforms.Compose([ transforms.Resize(224),
                                       transforms.RandomResizedCrop(224),
                                       transforms.ToTensor(),
                                       transforms.Normalize((0.485, 0.456, 0.406), 
                                                            (0.229, 0.224, 0.225))])


train_data = datasets.ImageFolder(train_dir,
                                transform=train_transforms)

validn_data = datasets.ImageFolder(valid_dir,
                                transform=validn_transforms)

test_data = datasets.ImageFolder(test_dir,
                                transform=test_transforms)



trainloader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=True)
validnloader = torch.utils.data.DataLoader(validn_data, batch_size=32, shuffle=True)
testloader = torch.utils.data.DataLoader(test_data, batch_size=32, shuffle=True)


model = models.vgg16(pretrained=True)
model


for param in model.parameters():
    param.requires_grad = False

classifier = nn.Sequential(OrderedDict([            
                          ('fc1', nn.Linear(3*224*224, 10000)), 
                          ('relu', nn.ReLU()),
                          ('fc2', nn.Linear(10000, 5000)),
                          ('relu', nn.ReLU()),
                          ('fc3', nn.Linear(5000, 102)),
                          ('output', nn.LogSoftmax(dim=1))
                          ]))

model.classifier = classifier

classifier


criterion = nn.NLLLoss()
optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)
model.cuda()

epochs = 1
steps = 0
training_loss = 0
print_every = 300
for e in range(epochs):
    model.train()
    for images, labels in iter(trainloader):
        steps == 1

        images.resize_(32,3*224*224)

        inputs = Variable(images.cuda())
        targets = Variable(labels.cuda())
        optimizer.zero_grad()

        output = model.forward(inputs)
        loss = criterion(output, targets)
        loss.backward()
        optimizer.step()

        training_loss += loss.data[0]

        if steps % print_every == 0:
            print("Epoch: {}/{}... ".format(e+1, epochs),
                  "Loss: {:.4f}".format(training_loss/print_every))

            running_loss = 0

Traceback:


Could it be because I'm using Linear layers in my definition?


1 Answer

There are two problems with your network:

  1. You created your own classifier whose first layer takes an input of size 3*224*224, but that is not the output size of vgg16's features part. The features output a tensor with 25088 elements (512 × 7 × 7).

  2. You are resizing each batch into a tensor of shape (3*224*224), but vgg16's features part expects an input of shape (3, 224, 224). Your custom classifier comes after the features, so you need to prepare the input for the features, not for the classifier.

Solution

To fix the first issue, change the classifier definition to:

classifier = nn.Sequential(OrderedDict([
                          ('fc1', nn.Linear(25088, 10000)),
                          ('relu1', nn.ReLU()),   # distinct names, otherwise the
                          ('fc2', nn.Linear(10000, 5000)),
                          ('relu2', nn.ReLU()),   # duplicate 'relu' key drops a layer
                          ('fc3', nn.Linear(5000, 102)),
                          ('output', nn.LogSoftmax(dim=1))
                          ]))

To fix the second issue, change images.resize_(32,3*224*224) to images.resize_(32, 3, 224, 224).
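A quick way to see the shape difference (the random tensor below is a stand-in for a batch from trainloader):

```python
import torch

batch = torch.randn(32, 3, 224, 224)   # what trainloader actually yields

flat = batch.clone()
flat.resize_(32, 3 * 224 * 224)        # the buggy call: collapses to 2D
print(flat.shape)                      # torch.Size([32, 150528])

fixed = batch.clone()
fixed.resize_(32, 3, 224, 224)         # the corrected call: stays 4D
print(fixed.shape)                     # torch.Size([32, 3, 224, 224])
```

Since the DataLoader already yields 4D batches, simply deleting the resize_ line works just as well, and it also avoids trouble when the last batch holds fewer than 32 images.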

Also note that the 10000 units output by your classifier's first layer is very large. You should try to keep it around 4000, as in the original classifier (even better, reuse the original weights for the first layer, since over time those have proven to be good features).
