Introduction to MongoDB

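MongoDB sits between relational and non-relational databases: among non-relational databases it is the most feature-rich and the closest to a relational database. Its data model is very loose. It stores documents in BSON, a JSON-like format, so it can hold fairly complex data types. Mongo's most notable feature is its powerful query language, whose syntax somewhat resembles an object-oriented query language. It can do almost everything a single-table query in a relational database can do, and it also supports indexes on the data.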



Basic terminology

Document -> collections -> dbs

A collection is split into chunks -> chunks are distributed across shards.

Purpose of sharding: scalability and load balancing.

MongoDB partitions a collection into chunks according to the shard key. Each chunk is described by a triple: {collection, minKey, maxKey}.
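The chunk triple can be illustrated with a small Python sketch. This is not MongoDB's implementation: the names `Chunk`, `split`, and `find_chunk` are hypothetical, the shard key is assumed to be a single number, and float infinities stand in for MongoDB's MinKey and MaxKey. The range is treated as half-open (minKey inclusive, maxKey exclusive).

```python
from dataclasses import dataclass

MIN_KEY = float("-inf")  # stand-in for MongoDB's MinKey (assumption)
MAX_KEY = float("inf")   # stand-in for MongoDB's MaxKey (assumption)


@dataclass(frozen=True)
class Chunk:
    collection: str
    min_key: float  # inclusive lower bound
    max_key: float  # exclusive upper bound

    def contains(self, key):
        return self.min_key <= key < self.max_key


def split(chunk, split_point):
    """Split one chunk into two adjacent chunks at split_point."""
    if not (chunk.min_key < split_point < chunk.max_key):
        raise ValueError("split point must lie strictly inside the chunk")
    return (Chunk(chunk.collection, chunk.min_key, split_point),
            Chunk(chunk.collection, split_point, chunk.max_key))


def find_chunk(chunks, key):
    """Return the chunk whose range contains the shard key value."""
    for chunk in chunks:
        if chunk.contains(key):
            return chunk
    raise KeyError(key)


# A new collection starts as one chunk covering the whole key space.
whole = Chunk("adv_pms_development.tasks", MIN_KEY, MAX_KEY)
low, high = split(whole, 1000)
```

After the split, every key below 1000 belongs to `low` and every key from 1000 upward belongs to `high`, so the two chunks can live on different shards.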

Replica Set -> availability

Config servers store the metadata of the MongoDB cluster. This metadata has two parts: which chunks each shard server holds, and which documents of which collections each chunk covers (recorded as shard key ranges).
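As a rough illustration of how this metadata can be used to route a request, the Python sketch below keeps a sorted list of chunk lower bounds and the shard that owns each chunk. The boundary values, the string shard key, and the function names are assumptions made for the example; only the shard names come from the commands in this post.

```python
import bisect

# Hypothetical metadata for one collection, sorted by chunk lower bound.
# Chunk i covers [chunk_min_keys[i], chunk_min_keys[i + 1]); the last chunk is open-ended.
chunk_min_keys = ["", "g", "n", "t"]
chunk_owner = ["shard-a", "shard-b", "shard-a", "shard-b"]


def shard_for(key):
    """Return the shard owning the chunk whose range contains the shard key value."""
    index = bisect.bisect_right(chunk_min_keys, key) - 1
    return chunk_owner[index]


def chunks_per_shard():
    """Invert the mapping: which chunk indexes does each shard hold?"""
    result = {}
    for index, shard in enumerate(chunk_owner):
        result.setdefault(shard, []).append(index)
    return result
```

A router in the role of mongos would call something like `shard_for("alice")` to decide where to send a query that includes the shard key.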


Shard and replica set command cheat sheet

$ mongod --shardsvr --replSet shard-a --dbpath data/rs-a-1 --port 30000 --logpath data/rs-a-1.log --fork --nojournal
$ mongod --configsvr --dbpath data/config-3 --port 27021 --logpath data/config-3.log --fork --nojournal
$ mongos --configdb ubuntu:27019,ubuntu:27020,ubuntu:27021 --logpath data/mongos.log --fork --port 40000
$ mongo Ubuntu:30000
> rs.initiate()
> rs.add("Ubuntu:30000")
> rs.add("Ubuntu:30001", {arbiterOnly: true})
> rs.conf()
> rs.status()

Configuring the cluster

> sh.addShard("shard-a/Ubuntu:30000,Ubuntu:30001")
> sh.addShard("shard-b/Ubuntu:30100,Ubuntu:30101")
> db.getSiblingDB("config").shards.find()

Shard collections

> sh.enableSharding("adv_pms_development")
> db.getSiblingDB("config").databases.find()
> sh.shardCollection("adv_pms_development.tasks", {owner: 1, _id: 1})
> db.getSiblingDB("config").collections.find()
$ mongo Ubuntu:40000
> sh.status()
> use config

Export and import data

$ mongodump -h ubuntu --port 27017 -d adv_pms_development -c tasks
$ mongorestore -h ubuntu --port 40000 -d adv_pms_development -c shardtasks dump/adv_pms_development/tasks.bson

Two special storage formats: GridFS and capped collections

GridFS is a specification for storing and retrieving files that exceed the BSON document size limit of 16 MB.

Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, and stores each of those chunks as a separate document. By default GridFS limits chunk size to 256 kB. GridFS uses two collections to store files. One collection stores the file chunks, and the other stores file metadata.
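The split into a metadata document and chunk documents can be sketched in Python without a database. The field names (`length`, `chunkSize`, `files_id`, `n`, `data`) follow the GridFS layout; the function names and the in-memory lists are assumptions for illustration only, and a real driver adds more fields such as the upload date.

```python
CHUNK_SIZE = 256 * 1024  # the default chunk size quoted in the text


def to_gridfs_docs(file_id, filename, data, chunk_size=CHUNK_SIZE):
    """Build one metadata document and a list of chunk documents for a file."""
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(chunk_docs):
    """Rebuild the file by concatenating chunks in order of their sequence number n."""
    ordered = sorted(chunk_docs, key=lambda doc: doc["n"])
    return b"".join(doc["data"] for doc in ordered)
```

A 600 kB file, for example, becomes three chunk documents: two full 256 kB chunks and one shorter final chunk.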

When you query a GridFS store for a file, the driver or client will reassemble the chunks as needed. You can perform range queries on files stored through GridFS. You can also access information from arbitrary sections of files, which allows you to "skip" into the middle of a video or audio file.
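Reading from the middle of a file only needs the chunks that overlap the requested byte range. The Python sketch below shows the arithmetic; `chunks_for_range` and `read_range` are hypothetical helpers, and the chunks are assumed to be available as a dict keyed by chunk number rather than fetched from a real GridFS store.

```python
def chunks_for_range(start, length, chunk_size=256 * 1024):
    """Return the chunk numbers needed to read `length` bytes starting at `start`."""
    if length <= 0:
        return []
    first = start // chunk_size
    last = (start + length - 1) // chunk_size
    return list(range(first, last + 1))


def read_range(chunks, start, length, chunk_size):
    """Read a byte range from `chunks`, a dict mapping chunk number -> bytes."""
    needed = chunks_for_range(start, length, chunk_size)
    if not needed:
        return b""
    blob = b"".join(chunks[n] for n in needed)
    offset = start - needed[0] * chunk_size  # position of `start` inside the first chunk
    return blob[offset:offset + length]
```

Only the chunks listed by `chunks_for_range` have to be loaded, which is why seeking into a large video does not require reading the whole file.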

When to use it?

In some situations, storing large files may be more efficient in a MongoDB database than on a system-level filesystem.

  • If your filesystem limits the number of files in a directory, you can use GridFS to store as many files as needed.
  • When you want to keep your files and metadata automatically synced and deployed across a number of systems and facilities.
  • When you want to access information from portions of large files without having to load whole files into memory, you can use GridFS to recall sections of files without reading the entire file into memory.

Do not use GridFS if you need to update the content of the entire file atomically. As an alternative you can store multiple versions of each file and specify the current version of the file in the metadata. You can update the metadata field that indicates "latest" status in an atomic update after uploading the new version of the file, and later remove previous versions if needed.
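The versioning workaround can be sketched as follows, with two plain Python dicts standing in for the stored files and for the metadata that records the current version. All names here (`files`, `current`, `publish`, `read_latest`, `cleanup`) are hypothetical; in a real deployment the pointer switch would be a single-document update in MongoDB.

```python
files = {}    # version_id -> content (stands in for fully uploaded files)
current = {}  # filename -> version_id (the metadata "latest" pointer)


def publish(filename, version_id, content):
    """Upload a complete new version, then switch the pointer in one step."""
    files[version_id] = content          # step 1: the new version is fully stored
    previous = current.get(filename)
    current[filename] = version_id       # step 2: one metadata update flips "latest"
    return previous                      # caller may remove this version later


def read_latest(filename):
    """Readers always follow the pointer, so they never see a half-written file."""
    return files[current[filename]]


def cleanup(version_id):
    """Remove a superseded version once it is no longer needed."""
    files.pop(version_id, None)
```

Readers see either the old version or the new one, never a mixture, because the only step visible to them is the pointer change.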

Use

$ mongofiles --host 10.175.31.248 --port 30000 -d adv_pms_development list
$ mongofiles --host 10.175.31.248 --port 30000 -d adv_pms_development put/get/delete/search me.jpg

Note

For replica sets, mongofiles can only read from the set's primary node.


Capped Collections

Capped collections are fixed-size collections that support high-throughput operations that insert, retrieve, and delete documents based on insertion order. Capped collections work in a way similar to circular buffers: once a collection fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection.
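The circular-buffer behavior can be imitated with a small Python class. This is only a sketch of the idea, not how MongoDB stores data: `CappedSketch` is a hypothetical name, documents are plain byte strings, and the byte budget plays the role of the collection's `size`.

```python
from collections import deque


class CappedSketch:
    """A byte-budgeted buffer that evicts the oldest documents to make room."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.docs = deque()  # oldest document on the left, newest on the right

    def insert(self, doc):
        if len(doc) > self.max_bytes:
            raise ValueError("document is larger than the whole collection")
        # Evict from the oldest end until the new document fits.
        while self.used + len(doc) > self.max_bytes:
            self.used -= len(self.docs.popleft())
        self.docs.append(doc)
        self.used += len(doc)
```

With a 10-byte budget, inserting three 4-byte documents leaves only the last two, because the first one is evicted to make room for the third.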

Capped collections have the following behaviors:

  • Capped collections guarantee preservation of the insertion order. As a result, queries do not need an index to return documents in insertion order. Without this indexing overhead, they can support higher insertion throughput.
  • Capped collections guarantee that insertion order is identical to the order on disk (natural order) and do so by prohibiting updates that increase document size. Capped collections only allow updates that fit the original document size, which ensures a document does not change its location on disk.
  • Capped collections automatically remove the oldest documents in the collection without requiring scripts or explicit remove operations.

Recommendations and Restrictions

  • You cannot shard a capped collection.
  • Capped collections created after 2.2 have an _id field and an index on the _id field by default.
  • You can update documents in a collection after inserting them; however, these updates cannot cause the documents to grow. If the update operation causes a document to grow beyond its original size, the update operation will fail.

If you plan to update documents in a capped collection, remember to create an index to prevent update operations that require a table scan.

  • You cannot delete documents from a capped collection. To remove all records from a capped collection, use the ‘emptycapped’ command. To remove the collection entirely, use the drop() method.
  • Use natural ordering to retrieve the most recently inserted elements from the collection efficiently. This is (somewhat) analogous to tail on a log file.
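Two of the restrictions above, updates that may not grow a document and retrieval in natural order, can be shown with a short Python sketch. `CappedDocs` is a hypothetical class; list position stands in for on-disk position, and string length stands in for document size.

```python
class CappedDocs:
    """Keeps documents in insertion order and rejects updates that grow them."""

    def __init__(self):
        self._docs = []  # list position stands in for on-disk (natural) order

    def insert(self, doc):
        self._docs.append(doc)

    def update(self, index, new_doc):
        # A larger document would have to move, which would break natural order.
        if len(new_doc) > len(self._docs[index]):
            raise ValueError("update would grow the document")
        self._docs[index] = new_doc

    def natural(self, direction=1):
        """direction=1: oldest first; direction=-1: newest first (like tail)."""
        if direction == 1:
            return list(self._docs)
        return list(reversed(self._docs))
```

Calling `natural(-1)` mirrors sorting on `$natural: -1`: the most recently inserted documents come back first without any index.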

Procedures

Create Capped Collections

> db.createCollection("mycoll", {capped: true, size: 100000})

Query a capped collection

> db.cappedCollection.find().sort({$natural: -1})

Check if a collection is capped

> db.collection.isCapped()

Convert a collection to capped

> db.runCommand({"convertToCapped": "mycoll", size: 100000})

References:

NoSQL數據庫筆談 (Notes on NoSQL Databases)

MongoDB in Action

