Notes on 25 Large Datasets
Dataset sources: https://blog.csdn.net/Mbx8X9u/article/details/79849738
https://www.analyticsvidhya.com/blog/2018/03/comprehensive-collection-deep-learning-datasets/?spm=a2c4e.11153959.blogcont576274.69.16b330274pLaMG
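1. MNIST dataset
Dataset download: http://yann.lecun.com/exdb/mnist/
TensorFlow reference: http://wiki.jikexueyuan.com/project/tensorflow-zh/tutorials/mnist_download.html
GitHub code for unpacking the data: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/tutorials/mnist
The training set has 60k 28x28 images (in the TensorFlow tutorial, 55k are used for training and 5k for validation); the test set has 10k 28x28 images.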
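The 55k/5k split can be sketched in Python as follows. This is only a minimal sketch: the function name split_train_validation is hypothetical, and holding out the first 5,000 examples is an assumption, since these notes do not say which examples form the validation set.

```python
def split_train_validation(examples, validation_size=5000):
    """Return (train, validation), holding out the first `validation_size` examples.

    Assumption: the validation examples are taken from the front of the
    training set; the notes do not specify which ones are held out.
    """
    if not 0 <= validation_size <= len(examples):
        raise ValueError("validation_size must be between 0 and len(examples)")
    validation = examples[:validation_size]
    train = examples[validation_size:]
    return train, validation


# Stand-in for the 60k MNIST training examples (indices only, no pixel data).
examples = list(range(60000))
train, validation = split_train_validation(examples)
print(len(train), len(validation))  # 55000 5000
```

The same slicing works on any sequence, so images and labels can be split with two calls as long as both use the same validation_size.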
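The files use the storage format shown below. Integers are stored MSB first (big-endian), the byte order used by most non-Intel processors: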
TRAINING SET LABEL FILE (train-labels-idx1-ubyte):
[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000801(2049) magic number (MSB first)
0004     32 bit integer  60000            number of items
0008     unsigned byte   ??               label
0009     unsigned byte   ??               label
........
xxxx     unsigned byte   ??               label
Label values range from 0 to 9.
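The label file layout above can be decoded with Python's standard struct module. The sketch below is illustrative rather than the TensorFlow loader: parse_idx1_labels is a hypothetical helper name, and it works on bytes already read into memory (the downloaded files are gzip-compressed, so they would need to be decompressed first).

```python
import struct


def parse_idx1_labels(raw):
    """Decode the bytes of an idx1-ubyte label file into a list of ints."""
    if len(raw) < 8:
        raise ValueError("file is too short to contain the 8-byte header")
    # Header: two unsigned 32-bit integers, MSB first (big-endian, ">").
    magic, count = struct.unpack(">II", raw[:8])
    if magic != 0x00000801:
        raise ValueError("unexpected magic number: %#010x" % magic)
    labels = raw[8:8 + count]
    if len(labels) != count:
        raise ValueError("file holds fewer labels than the header claims")
    # Each label is a single unsigned byte.
    return list(labels)


# Tiny synthetic file: magic 2049, 3 items, labels 5, 0, 4.
sample = struct.pack(">II", 2049, 3) + bytes([5, 0, 4])
print(parse_idx1_labels(sample))  # [5, 0, 4]
```

For a real file, the bytes could be obtained with gzip.open(path, "rb").read(), where path is wherever the archive was saved.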
TRAINING SET IMAGE FILE (train-images-idx3-ubyte):
[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000803(2051) magic number
0004     32 bit integer  60000            number of images
0008     32 bit integer  28               number of rows
0012     32 bit integer  28               number of columns
0016     unsigned byte   ??               pixel
0017     unsigned byte   ??               pixel
........
xxxx     unsigned byte   ??               pixel
Each pixel is a grayscale value: 0 means white (background) and 255 means black (foreground).
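The image file can be read the same way as the label file, with a 16-byte header instead of 8. In the sketch below, parse_idx3_images and pixel_at are hypothetical helper names, and the row-by-row pixel order follows the description on the MNIST download page rather than anything stated in these notes.

```python
import struct


def parse_idx3_images(raw):
    """Decode idx3-ubyte bytes into (rows, cols, images).

    Each image is a bytes object holding rows * cols pixels, row by row.
    """
    if len(raw) < 16:
        raise ValueError("file is too short to contain the 16-byte header")
    # Header: four unsigned 32-bit integers, MSB first (big-endian).
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 0x00000803:
        raise ValueError("unexpected magic number: %#010x" % magic)
    size = rows * cols
    body = raw[16:]
    if len(body) < count * size:
        raise ValueError("file holds fewer pixels than the header claims")
    images = [body[i * size:(i + 1) * size] for i in range(count)]
    return rows, cols, images


def pixel_at(image, cols, r, c):
    """Return the grayscale value at row r, column c (0 = white, 255 = black)."""
    return image[r * cols + c]


# Tiny synthetic file: two 2x2 images.
sample = struct.pack(">IIII", 2051, 2, 2, 2) + bytes([0, 255, 10, 20, 1, 2, 3, 4])
rows, cols, images = parse_idx3_images(sample)
print(len(images), pixel_at(images[0], cols, 0, 1), pixel_at(images[1], cols, 1, 0))  # 2 255 3
```

Because 0 is white and 255 is black, most image viewers, which treat 0 as black, will show the digits inverted unless the values are flipped with 255 - value.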
2. COCO dataset
3. ImageNet dataset
Includes bounding boxes for the training set and for the validation set.
4. Street View House Numbers (SVHN) dataset
Data source: http://ufldl.stanford.edu/housenumbers/
5. CIFAR-10 dataset download
http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
http://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
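Once the CIFAR-10 Python archive is unpacked, each batch file is a pickled dictionary. The sketch below rests on the layout described on the CIFAR download page, not on these notes: keys b"data" and b"labels", and 3072 bytes per image (1024 red, then 1024 green, then 1024 blue, each plane stored row by row for a 32x32 image). The names decode_batch and rgb_at are hypothetical. The CIFAR-100 archive uses different label keys and is not covered here.

```python
import pickle


def decode_batch(raw):
    """Unpickle the bytes of one CIFAR-10 batch file into a dict with bytes keys."""
    # encoding="bytes" is needed because the batches were pickled under Python 2.
    # Only unpickle files from a source you trust.
    return pickle.loads(raw, encoding="bytes")


def rgb_at(row, r, c, side=32):
    """Return the (red, green, blue) values at row r, column c of one image row."""
    plane = side * side   # 1024 values per colour plane
    offset = r * side + c  # position inside each plane
    return row[offset], row[plane + offset], row[2 * plane + offset]


# Synthetic stand-in for one image: every red value 1, green 2, blue 3.
row = bytes([1] * 1024 + [2] * 1024 + [3] * 1024)
batch = decode_batch(pickle.dumps({b"data": [row], b"labels": [7]}))
print(batch[b"labels"][0], rgb_at(batch[b"data"][0], 0, 0))  # 7 (1, 2, 3)
```

With a real batch file, the bytes would come from open(path, "rb").read(), and NumPy must be installed because the stored image data is a NumPy array.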