Loading a custom dataset in PyTorch

Question 2

Also, how do I access the data?

I tried

train_set.access(2560,0)

and

with train_set, test_set:
    x, y = train_set.access(2560,0)

Each of these either gives me an error message

KeyError                                  Traceback (most recent call last)
in
----> 1 train_set.access(2560,0)

/workspace/raven_data/AMT/MusicNet/pytorch_musicnet/musicnet.py in access(self, rec_id, s, shift, jitter)
    106
    107         if self.mmap:
--> 108             x = np.frombuffer(self.records[rec_id][0][s*sz_float:int(s+scale*self.window)*sz_float], dtype=np.float32).copy()
    109         else:
    110             fid,_ = self.records[rec_id]

KeyError: 2560

or returns an empty x and {}.

1 Answer

User

#1 · Posted 2024-10-03 02:40:09

Question 1
I don't understand why the code doesn't work without the line with train_set, test_set.

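To use torch.utils.data.DataLoader with a custom dataset design, you must create a dataset class that subclasses torch.utils.data.Dataset (and implements specific functions) and pass it to the DataLoader. The documentation says as much: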

All other datasets should subclass it. All subclasses should override __len__, that provides the size of the dataset, and __getitem__, supporting integer indexing in range from 0 to len(self) exclusive.
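
As a minimal sketch of what that requirement looks like in practice, the class below subclasses torch.utils.data.Dataset and overrides __len__ and __getitem__. The class name WindowDataset, the stand-in data, and the window size are hypothetical and are not taken from MusicNet.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class WindowDataset(Dataset):
    """Hypothetical dataset: fixed-size windows cut from one long signal."""

    def __init__(self, signal, labels, window):
        self.signal = signal
        self.labels = labels
        self.window = window

    def __len__(self):
        # Number of non-overlapping windows that fit in the signal.
        return len(self.signal) // self.window

    def __getitem__(self, idx):
        # Integer indexing in the range 0 .. len(self) - 1.
        start = idx * self.window
        x = self.signal[start:start + self.window]
        y = self.labels[idx]
        return x, y


signal = torch.arange(40, dtype=torch.float32)   # stand-in for audio samples
labels = torch.arange(5)                         # one label per window
dataset = WindowDataset(signal, labels, window=8)
loader = DataLoader(dataset, batch_size=2)

# The DataLoader calls __getitem__ for each index and stacks the results.
first_x, first_y = next(iter(loader))
print(first_x.shape, first_y)
```

Because the DataLoader only relies on these two methods, any class that provides them can be batched in the same way.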

This is what happens here:

train_set = musicnet.MusicNet(root=root, train=True, download=True, window=window)#, pitch_shift=5, jitter=.1)

test_set = musicnet.MusicNet(root=root, train=False, window=window, epoch_size=50000)

train_loader = torch.utils.data.DataLoader(dataset=train_set,batch_size=batch_size,**kwargs)
test_loader = torch.utils.data.DataLoader(dataset=test_set,batch_size=batch_size,**kwargs)

If you check their musicnet.py, you will see that the MusicNet class does exactly this.
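
As for the with train_set, test_set line itself: the traceback shows access failing on self.records[rec_id], which suggests the records are not loaded when the dataset object is constructed. A dataset that loads its records in __enter__ and releases them in __exit__ would behave this way. The sketch below illustrates that pattern with a hypothetical LazyRecords class; it is an assumption about the design, not the actual musicnet.py code.

```python
class LazyRecords:
    """Hypothetical dataset whose records exist only inside a with block."""

    def __init__(self, source):
        self.source = source      # where the data would come from
        self.records = {}         # empty until __enter__ runs

    def __enter__(self):
        # Load every record, keyed by its integer id.
        self.records = {rec_id: list(data) for rec_id, data in self.source.items()}
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Release the records again when the block ends.
        self.records = {}
        return False              # do not swallow exceptions

    def access(self, rec_id, start, window=2):
        return self.records[rec_id][start:start + window]


train_set = LazyRecords({7: [0.1, 0.2, 0.3, 0.4]})

try:
    train_set.access(7, 0)        # outside the with block: nothing loaded yet
    failed_outside = False
except KeyError:
    failed_outside = True

with train_set:
    inside = train_set.access(7, 0)   # inside the block the record is available
```

Under this assumption, a KeyError raised inside the with block would mean the requested id is not among the loaded record ids, so inspecting the keys of records would be a reasonable first check.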

Question 2
Also, how do I access the data?

There are a few possible ways:

To get just batches from the dataset, iterate over the DataLoader.
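
A minimal sketch of pulling batches out of a DataLoader, using a small TensorDataset as a stand-in for train_set (the tensors and the batch size here are made up for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 10 samples with 4 features each, plus one label per sample.
features = torch.arange(40, dtype=torch.float32).reshape(10, 4)
targets = torch.arange(10)
loader = DataLoader(TensorDataset(features, targets), batch_size=4)

# Take a single batch without looping over the whole dataset.
x, y = next(iter(loader))

# Or iterate over every batch in turn; the last batch may be smaller.
batch_sizes = [len(batch_x) for batch_x, batch_y in loader]
```

With train_loader from the question, the same two patterns should apply, with each batch holding whatever the dataset's __getitem__ returns.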

To access the whole dataset (in your example in particular):

dataset = train_loader.dataset.records

(.records is the part that may vary from dataset to dataset. I say .records because that is what I found in the MusicNet source.)
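
To illustrate the general pattern: a DataLoader keeps a reference to the dataset it wraps in its dataset attribute, so anything the dataset exposes stays reachable through the loader. The RecordDataset class and its records dictionary below are hypothetical stand-ins, not MusicNet's implementation.

```python
from torch.utils.data import Dataset, DataLoader


class RecordDataset(Dataset):
    """Hypothetical dataset that stores its raw data in a `records` dict."""

    def __init__(self, records):
        self.records = records
        self.ids = sorted(records)

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, idx):
        rec_id = self.ids[idx]
        return rec_id, self.records[rec_id]


loader = DataLoader(RecordDataset({3: 0.5, 1: 0.25}), batch_size=1)

underlying = loader.dataset          # the dataset object passed to DataLoader
all_records = underlying.records     # dataset-specific attribute
first_item = underlying[0]           # direct indexing bypasses batching
```

Only the dataset attribute is part of DataLoader; names such as records depend entirely on how the dataset class was written.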
