Index

Basics

Formally speaking, MongoDB indexes are implemented as "B-Tree" indexes.

In the shell, you can create an index by calling the ensureIndex() function, and providing a document that specifies one or more keys to index:

db.things.ensureIndex({j:1});

Once a collection is indexed on a key, random access on query expressions which match the specified key is fast.

db.things.find({j:2});  // fast - uses index (a B-tree index exists on j)
db.things.find({x:3});  // slow - has to check all because 'x' isn't indexed
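
The difference between the two queries can be illustrated outside the database. The following is a minimal sketch in plain JavaScript, not MongoDB's implementation: it assumes numeric keys and uses a sorted array with binary search as a simplified stand-in for a B-tree. The names buildIndex, findByIndex and findByScan are hypothetical.

```javascript
// Simplified stand-in for an index: a sorted array of [key, position] pairs.
// A real B-tree is more elaborate; only the lookup-cost idea is modeled here.
const things = [{ j: 5, x: 1 }, { j: 2, x: 3 }, { j: 9, x: 3 }, { j: 2, x: 7 }];

// Build the "index" once: keys in ascending order, each pointing at a document.
function buildIndex(docs, field) {
  return docs
    .map((doc, pos) => [doc[field], pos])
    .sort((a, b) => a[0] - b[0] || a[1] - b[1]);
}

// Binary search for the first entry with key >= value,
// then walk forward while the key still equals value.
function findByIndex(docs, index, value) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] < value) lo = mid + 1;
    else hi = mid;
  }
  const out = [];
  for (let i = lo; i < index.length && index[i][0] === value; i++) {
    out.push(docs[index[i][1]]);
  }
  return out;
}

// Without an index, every document has to be examined.
function findByScan(docs, field, value) {
  return docs.filter((doc) => doc[field] === value);
}

const jIndex = buildIndex(things, "j");
console.log(findByIndex(things, jIndex, 2)); // touches only a few index entries
console.log(findByScan(things, "x", 3));     // touches every document
```

The indexed lookup needs a number of steps that grows with the logarithm of the collection size, while the scan grows linearly with it.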

You can run

db.things.getIndexes()

in the shell to see the existing indexes on the collection. Run

db.system.indexes.find()

to see all indexes for the database.

A standard index build will block all other database operations. We suggest that you build it in the background using the background : true option, in which case the build is carried out by a background thread. This will ensure that your database remains responsive even while the index is being built.

In many cases, not having an index at all can impact performance almost as much as the index build itself. If this is the case, we recommend that the application code check for the index at startup using the chosen MongoDB driver's getIndex() function and terminate if the index cannot be found. A separate indexing script can then be explicitly invoked when it is safe to do so.
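
A startup check of this kind might look like the sketch below. It is an illustration only: it assumes the application has already obtained the list of index documents (in the shape that getIndexes() prints in the shell, with a key field), and the helper names sameKeyPattern, hasIndex and requireIndexes are hypothetical. How the list is fetched depends on the driver and is left out.

```javascript
// Two key patterns match when they list the same fields, in the same order,
// with the same directions.
function sameKeyPattern(a, b) {
  const ak = Object.keys(a);
  const bk = Object.keys(b);
  return ak.length === bk.length && ak.every((k, i) => k === bk[i] && a[k] === b[k]);
}

// True when some existing index has exactly the requested key pattern.
function hasIndex(indexSpecs, keyPattern) {
  return indexSpecs.some((spec) => sameKeyPattern(spec.key, keyPattern));
}

// Throw (so the application can terminate) when a required index is missing.
function requireIndexes(indexSpecs, required) {
  const missing = required.filter((pattern) => !hasIndex(indexSpecs, pattern));
  if (missing.length > 0) {
    throw new Error("missing indexes: " + JSON.stringify(missing));
  }
}

// Sample data standing in for what the driver would return (assumed shape).
const existing = [
  { name: "_id_", key: { _id: 1 } },
  { name: "j_1", key: { j: 1 } },
];

requireIndexes(existing, [{ j: 1 }]); // passes silently
```

Failing fast at startup keeps an unindexed deployment from serving slow queries or triggering a blocking index build at an inconvenient time.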

Creation Options

The second argument for ensureIndex is a document/object representing the options. 

option      values                                  default
background  true/false                              false
dropDups    true/false                              false
unique      true/false                              false
sparse      true/false                              false
v           index version: 0 = pre-v2.0,            1 in v2.0
            1 = smaller/faster (current)

The default value of v is used except in unusual situations.

name is also an option, but it will be deprecated in the future.

The _id Index

For all collections except capped collections, an index is automatically created for the _id field. This index is special and cannot be deleted. The _id index enforces uniqueness for its keys (except for some situations with sharding).

_id values are invariant.

Indexing on Embedded Keys ("Dot Notation")

With MongoDB you can even index on a key inside of an embedded document. Reaching into sub-documents is referred to as Dot Notation. For example:

db.things.ensureIndex({"address.city": 1})
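
To make the dot notation concrete, here is a minimal sketch of how a dotted key such as "address.city" can be resolved against a document. It is plain JavaScript with a hypothetical helper getByPath and sample data; it deliberately ignores arrays, which are treated separately under Indexing Array Elements.

```javascript
// Walk the document one path segment at a time.
// Returns undefined as soon as a segment cannot be followed.
function getByPath(doc, path) {
  let current = doc;
  for (const part of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = current[part];
  }
  return current;
}

// Hypothetical sample document.
const person = { name: "Joe", address: { city: "Boston", zip: "02108" } };

console.log(getByPath(person, "address.city"));
```

The value found this way is what an index on "address.city" would use as the key for the document.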

Documents as Keys

Indexed fields may be of any type, including (embedded) documents:

db.factories.insert( { name: "xyz", metro: { city: "New York", state: "NY" } } );
db.factories.ensureIndex( { metro : 1 } );
// this query can use the above index:
db.factories.find( { metro: { city: "New York", state: "NY" } } );
// this one too, as {city:"New York"} < {city:"New York",state:"NY"}
db.factories.find( { metro: { $gte : { city: "New York" } } } ); // document comparison
// this query does not match the document because the order of fields is significant
// (annotator's note: this refers only to the case after the index is built this way).
// The compare order is predefined: ascending key order, in the order the keys occur in the BSON document.
db.factories.find( { metro: { state: "NY" , city: "New York" } } );
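
The ordering rule used in the comments above can be sketched as follows. This is a simplified illustration, not the server's BSON comparison: it assumes every field value is a string and compares field by field in the order the fields appear. The function names are hypothetical.

```javascript
function compareStrings(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Compare two documents field by field, in the order the fields occur.
// At each position the field name is compared first, then the value.
// If one document is a prefix of the other, the shorter one sorts first.
function compareDocs(a, b) {
  const ea = Object.entries(a);
  const eb = Object.entries(b);
  const n = Math.min(ea.length, eb.length);
  for (let i = 0; i < n; i++) {
    const byName = compareStrings(ea[i][0], eb[i][0]);
    if (byName !== 0) return byName;
    const byValue = compareStrings(ea[i][1], eb[i][1]);
    if (byValue !== 0) return byValue;
  }
  return ea.length - eb.length;
}

const cityOnly = { city: "New York" };
const cityState = { city: "New York", state: "NY" };
const stateCity = { state: "NY", city: "New York" };

console.log(compareDocs(cityOnly, cityState) < 0);    // shorter prefix sorts first
console.log(compareDocs(cityState, stateCity) === 0); // field order matters
```

Under this rule { city: "New York" } sorts before { city: "New York", state: "NY" }, which is why the $gte query matches, and the same fields in a different order do not compare as equal.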

An alternative to documents as keys is to create a compound index:

db.factories.ensureIndex( { "metro.city" : 1, "metro.state" : 1 } );
// these queries can use the above index:
db.factories.find( { "metro.city" : "New York", "metro.state" : "NY" } );
db.factories.find( { "metro.city" : "New York" } );
db.factories.find().sort( { "metro.city" : 1, "metro.state" : 1 } );
db.factories.find().sort( { "metro.city" : 1 } )

Compound Keys Indexes

In addition to single-key basic indexes, MongoDB also supports multi-key "compound" indexes. Just like basic indexes, you use the ensureIndex() function in the shell to create the index, but instead of specifying only a single key, you can specify several:

db.things.ensureIndex({j:1, name:-1});

When creating an index, the number associated with a key specifies the direction of the index, so it should always be 1 (ascending) or -1 (descending). 
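
The effect of the direction values can be seen by sorting documents in memory with the same key specification. The sketch below is plain JavaScript rather than anything the server runs; comparatorFor is a hypothetical helper and the sample rows are made up.

```javascript
// Build a comparator from a key specification such as { j: 1, name: -1 }.
// Fields are compared in the order listed; 1 means ascending, -1 descending.
function comparatorFor(keySpec) {
  const fields = Object.entries(keySpec);
  return (a, b) => {
    for (const [field, direction] of fields) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0;
  };
}

const rows = [
  { j: 2, name: "ann" },
  { j: 1, name: "bob" },
  { j: 2, name: "zed" },
  { j: 1, name: "amy" },
];

// Ascending by j, and within equal j descending by name.
const ordered = rows.slice().sort(comparatorFor({ j: 1, name: -1 }));
console.log(ordered.map((r) => r.j + ":" + r.name).join(" "));
// prints "1:bob 1:amy 2:zed 2:ann"
```

An index created with {j:1, name:-1} keeps its entries in this order, so a sort on the same specification can read them off directly.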

If you have a compound index on multiple fields, you can use it to query on the beginning subset of fields. So if you have an index on

a,b,c

you can use it to query on

a
a,b
a,b,c
New in 1.6+
If the first key of the index is present in the query, that index may be selected by the query optimizer. If the first key is not present in the query, the index will only be used if hinted explicitly. While indexes can be used in many cases where an arbitrary subset of indexed fields is present in the query, as a general rule the optimal indexes for a given query are those in which queried fields precede any non-queried fields.
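
The "beginning subset" rule can be expressed as a small function. This is a sketch of the rule of thumb only, under the assumption that a query is described by the list of fields it constrains; the real query optimizer weighs more than this, and usablePrefix is a hypothetical name.

```javascript
// Return the leading fields of a compound index that a query constrains.
// The walk stops at the first index field the query does not mention.
function usablePrefix(indexFields, queryFields) {
  const queried = new Set(queryFields);
  const prefix = [];
  for (const field of indexFields) {
    if (!queried.has(field)) break;
    prefix.push(field);
  }
  return prefix;
}

const abc = ["a", "b", "c"];
console.log(usablePrefix(abc, ["a", "b"])); // [ 'a', 'b' ]
console.log(usablePrefix(abc, ["a", "c"])); // [ 'a' ]
console.log(usablePrefix(abc, ["b", "c"])); // []
```

An empty result corresponds to the case described above where the first key is absent from the query, so the index would only be used if hinted explicitly.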


Indexing Array Elements

When a document's stored value for an index key field is an array, MongoDB indexes each element of the array. See the Multikeys page for more information.

What exactly is an array?

For example, the following "document" can be stored in MongoDB:

{ author: 'joe',
  created : new Date('03/28/2009'),
  title : 'Yet another blog post',
  text : 'Here is the text...',
  tags : [ 'example', 'joe' ],
  comments : [ { author: 'jim', comment: 'I disagree' },
              { author: 'nancy', comment: 'Good post' }
  ]
}

For example, the query

> db.posts.find( { "comments.author" : "jim" } )

is possible and means "find any blog post where at least one comment subobject has author == 'jim'".
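
One way to picture indexing of array elements is to list the keys that a single document would contribute to an index on a given field. The sketch below is an illustration in plain JavaScript, not the server's key-generation code; valuesAtPath and indexKeys are hypothetical helpers, and the document is a shortened version of the blog post above.

```javascript
// Collect every value reachable along a dotted path.
// When an array is met, each of its elements is followed separately.
function valuesAtPath(value, parts) {
  if (value === undefined) return [];
  if (Array.isArray(value)) {
    return parts.length === 0
      ? value
      : value.flatMap((item) => valuesAtPath(item, parts));
  }
  if (parts.length === 0) return [value];
  if (value === null || typeof value !== "object") return [];
  return valuesAtPath(value[parts[0]], parts.slice(1));
}

// One index key per array element, or a single key for a plain value.
function indexKeys(doc, path) {
  return valuesAtPath(doc, path.split("."));
}

const post = {
  author: "joe",
  tags: ["example", "joe"],
  comments: [
    { author: "jim", comment: "I disagree" },
    { author: "nancy", comment: "Good post" },
  ],
};

console.log(indexKeys(post, "tags"));            // [ 'example', 'joe' ]
console.log(indexKeys(post, "comments.author")); // [ 'jim', 'nancy' ]
```

Because the document appears in the index once per element, a lookup for "jim" under "comments.author" finds the post even though the value sits inside an array of subobjects.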



Sparse Indexes

A "sparse index" is an index that only includes documents with the indexed field.
Any document that is missing the sparsely indexed field will not be stored in the index; the index will therefore be sparse because documents with missing values are left out.

Sparse indexes, by definition, are not complete (for the collection) and behave differently than complete indexes. When using a "sparse index" for sorting (or in some cases just filtering) some documents in the collection may not be returned. This is because only documents in the index will be returned.

> db.people.ensureIndex({title : 1}, {sparse : true})
> db.people.save({name:"Jim"})
> db.people.save({name:"Sarah", title:"Princess"})
> db.people.find()
{ "_id" : ObjectId("4de6abd5da558a49fc5eef29"), "name" : "Jim" }
{ "_id" : ObjectId("4de6abdbda558a49fc5eef2a"), "name" : "Sarah", "title" : "Princess" }
> db.people.find().sort({title:1}) // only 1 doc returned because sparse
{ "_id" : ObjectId("4de6abdbda558a49fc5eef2a"), "name" : "Sarah", "title" : "Princess" }
> db.people.dropIndex({title : 1})
{ "nIndexesWas" : 2, "ok" : 1 }
> db.people.find().sort({title:1}) // no more index, returns all documents
{ "_id" : ObjectId("4de6abd5da558a49fc5eef29"), "name" : "Jim" }
{ "_id" : ObjectId("4de6abdbda558a49fc5eef2a"), "name" : "Sarah", "title" : "Princess" }

You can combine sparse with unique to produce a unique constraint that ignores documents with missing fields.

Note that MongoDB's sparse indexes are not block-level indexes. MongoDB sparse indexes can be thought of as dense indexes with a specific filter.
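
The "dense index with a filter" description can be sketched in a few lines. This is an in-memory illustration, not MongoDB's implementation; buildSparseIndex and sortedViaIndex are hypothetical names, and the data mirrors the people example above.

```javascript
const people = [{ name: "Jim" }, { name: "Sarah", title: "Princess" }];

// A sparse index is built like a normal one, except that documents
// without the indexed field are filtered out first.
function buildSparseIndex(docs, field) {
  return docs
    .filter((doc) => doc[field] !== undefined)
    .map((doc) => ({ key: doc[field], doc }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// A sort served from the index can only return documents that are in it.
function sortedViaIndex(index) {
  return index.map((entry) => entry.doc);
}

const titleIndex = buildSparseIndex(people, "title");
console.log(sortedViaIndex(titleIndex)); // only Sarah; Jim has no title
```

This is why the sorted query in the shell session returns a single document while the index exists, and both documents once it is dropped.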

Unique Indexes

MongoDB supports unique indexes, which guarantee that no documents are inserted whose values for the indexed keys match those of an existing document. To create an index that guarantees that no two documents have the same values for both firstname and lastname you would do:

db.things.ensureIndex({firstname: 1, lastname: 1}, {unique: true});

Unique Indexes and Missing Keys

When a document is saved to a collection any missing indexed keys will be inserted with null values in the index entry. Thus, it won't be possible to insert multiple documents missing the same indexed key in a unique index.

  db.things.ensureIndex({firstname: 1}, {unique: true});
  db.things.save({lastname: "Smith"});
  // Next operation will fail because of the unique index on firstname.
  db.things.save({lastname: "Jones"});
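
The failure above follows from treating a missing key as null. The sketch below models that behaviour in plain JavaScript, together with the sparse variant that skips documents lacking the field; the UniqueIndex class is hypothetical and only tracks keys, not documents.

```javascript
class UniqueIndex {
  // field: the indexed key; sparse: skip documents that lack the field.
  constructor(field, { sparse = false } = {}) {
    this.field = field;
    this.sparse = sparse;
    this.keys = new Set();
  }

  insert(doc) {
    const value = doc[this.field];
    if (value === undefined && this.sparse) return; // not indexed at all
    // A missing key is stored as null, so two such documents collide.
    const key = JSON.stringify(value === undefined ? null : value);
    if (this.keys.has(key)) throw new Error("duplicate key: " + key);
    this.keys.add(key);
  }
}

const strict = new UniqueIndex("firstname");
strict.insert({ lastname: "Smith" });
try {
  strict.insert({ lastname: "Jones" });
} catch (e) {
  console.log(e.message); // second document also maps to null
}

const relaxed = new UniqueIndex("firstname", { sparse: true });
relaxed.insert({ lastname: "Smith" });
relaxed.insert({ lastname: "Jones" }); // accepted: neither document is indexed
```

Combining unique with sparse, as mentioned in the Sparse Indexes section, is what allows several documents without the field to coexist.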

dropDups

A unique index cannot be created on a key that has pre-existing duplicate values. If you would like to create the index anyway, keeping the first document the database indexes and deleting all subsequent documents that have duplicate values, add the dropDups option.

db.things.ensureIndex({firstname : 1}, {unique : true, dropDups : true})

dropDups deletes data. A "fat finger" with dropDups could delete almost all data from a collection. Back up before using it. Note also that if the field is missing in multiple records, it evaluates to null in each of them, and those records would then be considered duplicates. In that case using sparse, or not using dropDups, is very important.
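
How much data can disappear is easy to underestimate. The following sketch mimics the described behaviour on an in-memory array: it keeps the first document seen for each key and treats a missing field as null. It is an illustration under those assumptions, not the server's code; the function name simply echoes the option, and "first" here means array order, whereas the database keeps whichever document it happens to index first.

```javascript
// Keep the first document for each value of `field`; drop the rest.
// A missing field counts as null, so all such documents share one key.
function dropDups(docs, field) {
  const seen = new Set();
  const kept = [];
  for (const doc of docs) {
    const key = JSON.stringify(doc[field] === undefined ? null : doc[field]);
    if (seen.has(key)) continue;
    seen.add(key);
    kept.push(doc);
  }
  return kept;
}

// Hypothetical collection where most documents lack firstname.
const records = [
  { lastname: "Smith" },
  { lastname: "Jones" },
  { firstname: "Ann", lastname: "Lee" },
  { lastname: "Brown" },
  { firstname: "Ann", lastname: "Wong" },
];

console.log(dropDups(records, "firstname").length); // 2 of 5 documents survive
```

Three of the five documents are lost here, two of them only because the field was absent.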

ReIndex

The reIndex command will rebuild all indexes for a collection.

db.myCollection.reIndex()

See the reIndex Command page for more documentation.

Additional Notes on Indexes

  • MongoDB indexes (and string equality tests in general) are case sensitive.
  • When you update an object, if the object fits in its previous allocation area, only those indexes whose keys have changed are updated. This improves performance. Note that if the object has grown and must move, all index keys must then update, which is slower.
  • Index information is kept in the system.indexes collection; run db.system.indexes.find() to see example data.

Keys Too Large To Index

Index entries have a limitation on their maximum size (the sum of the values), currently approximately 800 bytes. Documents whose field values (key size in index terminology) are greater than this size cannot be indexed. You will see log messages similar to:

...Btree::insert: key too large to index, skipping...

Queries against this index will not return the unindexed documents. You can force a query to use another index, or really no index, using this special index hint:

db.myCollection.find({<key>: <value too large to index>}).hint({$natural: 1})

This will cause the document to be used for comparison of that field (or fields), rather than the index.

This limitation will eventually be removed (see SERVER-3372).

Index Performance

Indexes make retrieval by a key, including ordered sequential retrieval, very fast. Updates by key are faster too as MongoDB can find the document to update very quickly.

However, keep in mind that each index created adds a certain amount of overhead for inserts and deletes. In addition to writing data to the base collection, keys must then be added to the B-Tree indexes. Thus, indexes are best for collections where the number of reads is much greater than the number of writes.

Using sort() without an Index

You may use sort() to return data in order without an index if the data set to be returned is small (less than four megabytes). For these cases it is best to use limit() and sort() together.

