Professor Yann LeCun's MNIST data set consists of four files:
1.train-image
2.train-label
3.test-image
4.test-label
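Their formats are as follows.
1.train-image:
A 16-byte header followed by 60000 image samples of 28*28 pixels each, one byte per pixel, so the file is 60000*28*28+16 = 47040016 bytes. The header holds four big-endian int32 values: the magic number, the number of images, the number of rows, and the number of columns.
MATLAB (Octave) code to read the header and the first image: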
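fp = fopen('train-images-idx3-ubyte', 'r');
% magic = fread(fp, 4);  % wrong: reads 4 separate bytes instead of one int32
magic = fread(fp, 1, 'int32', 0, 'ieee-be');      % magic number (2051 for image files)
numImages = fread(fp, 1, 'int32', 0, 'ieee-be');  % not named "size", which would shadow the built-in size()
rows = fread(fp, 1, 'int32', 0, 'ieee-be');
cols = fread(fp, 1, 'int32', 0, 'ieee-be');
% Pixels are stored row by row, but fread fills column by column, so transpose.
image1 = fread(fp, [cols, rows], 'uint8')';
imshow(uint8(image1));   % cast to uint8 so that 0..255 maps to black..white
fclose(fp);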
The handwritten digit read from the file is displayed as shown in the figure.
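To go from the first image to the whole file, the same header can be followed by a single bulk read. The sketch below is only an illustration: it writes a tiny synthetic file in the same layout (2 images of 2*3 pixels, made-up pixel values) so that it runs without the real data, and the variable names `fname`, `raw` and `images` are assumptions, not part of the original code.

```octave
% Write a tiny synthetic file in the train-image layout (hypothetical data).
fname = tempname();
fp = fopen(fname, 'w');
fwrite(fp, [2051 2 2 3], 'int32', 0, 'ieee-be');        % magic, count, rows, cols
fwrite(fp, [1 2 3 4 5 6 7 8 9 10 11 12], 'uint8');      % pixels, row by row, image by image
fclose(fp);

% Read the 16-byte header, then every pixel in one call.
fp = fopen(fname, 'r');
header = fread(fp, 4, 'int32', 0, 'ieee-be');
numImages = header(2);
numRows = header(3);
numCols = header(4);
raw = fread(fp, numRows * numCols * numImages, 'uint8=>uint8');
fclose(fp);
delete(fname);

% In the file the column index varies fastest, so reshape to
% cols x rows x N and swap the first two dimensions.
images = permute(reshape(raw, numCols, numRows, numImages), [2 1 3]);
disp(images(:, :, 1));
```

With the real file, replacing `fname` by 'train-images-idx3-ubyte' and skipping the writing step should give a 28 x 28 x 60000 array, so that `images(:, :, k)` is the k-th digit.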
2.train-label:
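An 8-byte header followed by 60000 labels of 1 byte each, so the file is 60000+8 = 60008 bytes. The header holds two big-endian int32 values: the magic number and the number of labels.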
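filename = 'train-labels-idx1-ubyte';
fp = fopen(filename, 'rb');
magic = fread(fp, 1, 'int32', 0, 'ieee-be');      % magic number (2049 for label files)
numLabels = fread(fp, 1, 'int32', 0, 'ieee-be');
labels = fread(fp, numLabels, 'uint8');           % one byte per label, values 0..9
fclose(fp);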
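A quick way to confirm that a label file was read correctly is to compare the header against what was actually read. The following is a minimal sketch under assumptions: it builds a small synthetic label file (5 arbitrary labels) so that it runs without the real data, and `fname`, `info` and `expectedBytes` are names introduced here for illustration.

```octave
% Write a tiny synthetic file in the train-label layout (hypothetical data).
fname = tempname();
fp = fopen(fname, 'w');
fwrite(fp, [2049 5], 'int32', 0, 'ieee-be');   % magic, number of labels
fwrite(fp, [5 0 4 1 9], 'uint8');              % one byte per label
fclose(fp);

% Read the 8-byte header, then all remaining bytes as labels.
fp = fopen(fname, 'r');
magic = fread(fp, 1, 'int32', 0, 'ieee-be');
numLabels = fread(fp, 1, 'int32', 0, 'ieee-be');
labels = fread(fp, Inf, 'uint8');
fclose(fp);
info = dir(fname);                             % info.bytes is the file size
delete(fname);

% Consistency checks: magic number, label count, and total file size.
if magic ~= 2049
    error('unexpected magic number %d', magic);
end
if numel(labels) ~= numLabels
    error('header says %d labels but %d were read', numLabels, numel(labels));
end
expectedBytes = 8 + numLabels;                 % 8-byte header + 1 byte per label
fprintf('labels read: %d, file size ok: %d\n', numel(labels), info.bytes == expectedBytes);
```

If the magic number comes out as a very large or negative value, the most likely cause is that the 'ieee-be' argument was left out and the header was read as little-endian.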
3.test-image and 4.test-label use the same formats as above; only the number of samples differs (10000 instead of 60000).
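The size arithmetic above can be applied to all four files at once, which gives a simple check that a download is complete. This is just a worked calculation; the variable names are chosen here for illustration.

```octave
numRows = 28;
numCols = 28;
counts = [60000 10000];                          % training set, test set

imageBytes = 16 + counts * numRows * numCols;    % 16-byte header + 1 byte per pixel
labelBytes = 8 + counts;                         % 8-byte header + 1 byte per label

fprintf('image files: %d and %d bytes\n', imageBytes(1), imageBytes(2));
fprintf('label files: %d and %d bytes\n', labelBytes(1), labelBytes(2));
```

The results can be compared with the `bytes` field returned by `dir` for each uncompressed file.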