Dataset provider for deep learning
Data Zoo
This repository provides unified access to multiple datasets.
Usage
First, import the data provider from the datazoo package:
from datazoo import data_provider
Then choose a dataset from the list below and get an iterable over its samples:
# Dataset object
fashionmnist = data_provider(
    dataset='fashionmnist', data_dir='data/fashionmnist/', split='test',
    download=True, columns=['index', 'image', 'class']
)
print('Dataset length:', len(fashionmnist))

# Iterate over samples
for sample in fashionmnist:
    print(sample)
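Since the provider returns an iterable, its samples can be grouped into mini-batches with standard-library tools alone. The sketch below is not part of datazoo: `batched` is a hypothetical helper, and the `stand_in` list of `(index, image, class)` tuples is only an assumed sample layout used so that the example runs without downloading anything.

```python
from itertools import islice


def batched(samples, batch_size):
    """Yield lists of up to batch_size consecutive items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(samples)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for a dataset object (assumption): (index, image, class) tuples.
stand_in = [(i, None, i % 10) for i in range(7)]

for batch in batched(stand_in, batch_size=3):
    # Print the batch size and the sample indices it contains.
    print(len(batch), [sample[0] for sample in batch])
```

With the seven stand-in samples this yields batches of 3, 3 and 1 items. If the real dataset object behaves like any other Python iterable, `stand_in` could be swapped for it.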
Classification
Single-label datasets
Dataset | Name in data provider | Number of classes | Number of samples (train / test unless noted) | Source | Auto downloading |
---|---|---|---|---|---|
MNIST | | 10 | 60 000 / 10 000 | torchvision | Yes |
Fashion MNIST | fashionmnist | 10 | 60 000 / 10 000 | torchvision | Yes |
CIFAR-10 | | 10 | 50 000 / 10 000 | torchvision | Yes |
CIFAR-100 | | 100 | 50 000 / 10 000 | torchvision | Yes |
Indoor Scene Recognition | | 67 | 15 620 total | -- | Yes |
The Street View House Numbers (SVHN) | | 10 | 73 257 / 26 032, plus 531 131 additional digits | -- | Yes |
Linnaeus5 | | 5 (berry, bird, dog, flower, other as the negative set) | 1 200 / 400 per class | -- | Yes |
COIL-100 | | 100 (one per object) | 7 200 total | -- | Yes |
License
This software is released under the MIT License.