Introduction
GridFS works by splitting a large object into small chunks, usually 256 KB in size.
(In other words, a file is cut into small pieces that are stored in a MongoDB collection.)
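The splitting step can be sketched in plain JavaScript (runnable under Node.js, no MongoDB driver involved). The names splitIntoChunks and DEFAULT_CHUNK_SIZE are made up for illustration and are not part of any GridFS API; the sketch also assumes that "256k" means 256 * 1024 bytes.

```javascript
// Typical chunk size from the text above; assumed to mean 256 * 1024 bytes.
const DEFAULT_CHUNK_SIZE = 256 * 1024;

// Hypothetical helper: split a Buffer into consecutive pieces of at most
// chunkSize bytes. Only the last piece may be shorter than chunkSize.
function splitIntoChunks(data, chunkSize = DEFAULT_CHUNK_SIZE) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray returns a view on the same memory, so no bytes are copied.
    chunks.push(data.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

// Usage: a 600 KB object gives two full chunks and one 88 KB remainder.
const pieces = splitIntoChunks(Buffer.alloc(600 * 1024));
console.log(pieces.map((p) => p.length));
```

An empty object produces no chunks at all in this sketch; whether a real implementation stores an empty chunk in that case is not covered here.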
Specification
Storage Collections
GridFS uses two collections to store data:
- files contains the object metadata
- chunks contains the binary chunks with some additional accounting information
The files and chunks collections are named with a prefix (the prefix acts as a logical file system). By default the prefix is fs., so the default collections are fs.files and fs.chunks.
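A small sketch of the naming rule, assuming the prefix is joined to the collection part with a dot (as db.fs.chunks in the shell examples suggests). The helper name gridfsCollectionNames is hypothetical.

```javascript
// Hypothetical helper: derive the two collection names from a prefix.
// Assumes "<prefix>.files" and "<prefix>.chunks"; "fs" is the default prefix.
function gridfsCollectionNames(prefix = "fs") {
  return { files: prefix + ".files", chunks: prefix + ".chunks" };
}

// Default store, then a store with a custom root such as "contracts".
console.log(gridfsCollectionNames());
console.log(gridfsCollectionNames("contracts"));
```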
Here's an example of the standard GridFS interface in Java:
/*
 * default root collection usage - must be supported
 */
GridFS myFS = new GridFS(myDatabase);
myFS.storeFile(new File("/tmp/largething.mpg"));

/*
 * specified root collection usage - optional
 */
GridFS myContracts = new GridFS(myDatabase, "contracts");
myContracts.retrieveFile("smithco", new File("/tmp/smithco_20090105.pdf"));
files
Documents in the files collection require the following fields (each document holds the metadata for one file):
{
  "_id" : <unspecified>,
  "length" : data_number,
  "chunkSize" : data_number,
  "uploadDate" : data_date,
  "md5" : data_string
}
chunks
The structure of documents from the chunks collection is as follows:
{
  "_id" : <unspecified>,
  "files_id" : <unspecified>,
  "n" : chunk_number,
  "data" : data_binary
}
Indexes
GridFS implementations should create a unique, compound index in the chunks collection for files_id and n.
Here's how you'd do that from the shell:
db.fs.chunks.ensureIndex({files_id:1, n:1}, {unique: true});
This way, a chunk can be retrieved efficiently using its files_id and n values. Note that GridFS implementations should use findOne operations to get chunks individually, and should not leave open a cursor to query for all chunks. So to get the first chunk, we could do:
db.fs.chunks.findOne({files_id: myFileID, n: 0});
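Reading a whole file then amounts to repeating that lookup for n = 0, 1, 2, and so on. The sketch below shows the loop in plain JavaScript under Node.js. It uses an in-memory stand-in whose findOne only mimics an exact match on files_id and n; it is not the MongoDB driver. The names makeFakeChunks and readFile are hypothetical, and the chunk count is derived from length and chunkSize on the assumption that both are byte counts.

```javascript
// In-memory stand-in for the chunks collection (NOT the MongoDB driver):
// findOne returns the first document matching files_id and n, or null.
function makeFakeChunks(docs) {
  return {
    findOne(query) {
      const hit = docs.find(
        (d) => d.files_id === query.files_id && d.n === query.n
      );
      return hit === undefined ? null : hit;
    },
  };
}

// Hypothetical reader: fetch chunks one at a time instead of holding a
// cursor open over all of them, then join the pieces in order.
function readFile(chunks, filesDoc) {
  const count = Math.ceil(filesDoc.length / filesDoc.chunkSize);
  const parts = [];
  for (let n = 0; n < count; n++) {
    const chunk = chunks.findOne({ files_id: filesDoc._id, n: n });
    if (chunk === null) {
      throw new Error("missing chunk " + n);
    }
    parts.push(chunk.data);
  }
  return Buffer.concat(parts);
}

// Usage: storage order does not matter, because each lookup names its n.
const fakeChunks = makeFakeChunks([
  { files_id: "f1", n: 1, data: Buffer.from("efgh") },
  { files_id: "f1", n: 0, data: Buffer.from("abcd") },
  { files_id: "f1", n: 2, data: Buffer.from("ij") },
]);
const filesDoc = { _id: "f1", length: 10, chunkSize: 4 };
console.log(readFile(fakeChunks, filesDoc).toString()); // prints "abcdefghij"
```

With a real collection, each of these lookups is served by the unique compound index on files_id and n.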