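MongoDB is a NoSQL database system. A database can contain multiple collections, and each collection corresponds to a table in a relational database. Each collection stores a set of records made up of fields; the fields can be defined freely, which makes the model very flexible, and each such record corresponds to a row in a relational table. Below, we get familiar with MongoDB's basic administration commands in order to understand the basic functionality and behavior of the DBMS that MongoDB provides.
The MongoDB command help system
After installing MongoDB and starting the server process (mongod), you can manage and monitor MongoDB through the client command mongo. Let's look at MongoDB's command help system: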
root@dev2:~# mongo
MongoDB shell version: 1.8.3
connecting to: test
> help
db.help() help on db methods
db.mycoll.help() help on collection methods
rs.help() help on replica set methods
help connect connecting to a db help
help admin administrative help
help misc misc things to know
help mr mapreduce help
show dbs show database names
show collections show collections in current database
show users show users in current database
show profile show most recent system.profile entries with time >= 1ms
use <db_name> set current database
db.foo.find() list objects in collection foo
db.foo.find( { a : 1 } ) list objects in foo where a == 1
it result of the last line evaluated; use to further iterate
DBQuery.shellBatchSize = x set default number of items to display on shell
exit quit the mongo shell
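This is MongoDB's top-level command list. It mainly points to the broad categories of database administration: help on database operations, help on collection operations, and administrative help. For more detailed help on database operations, call db.help() directly, as follows: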
> db.help()
DB methods:
db.addUser(username, password[, readOnly=false])
db.auth(username, password)
db.cloneDatabase(fromhost)
db.commandHelp(name) returns the help for the command
db.copyDatabase(fromdb, todb, fromhost)
db.createCollection(name, { size : ..., capped : ..., max : ... } )
db.currentOp() displays the current operation in the db
db.dropDatabase()
db.eval(func, args) run code server-side
db.getCollection(cname) same as db['cname'] or db.cname
db.getCollectionNames()
db.getLastError() - just returns the err msg string
db.getLastErrorObj() - return full status object
db.getMongo() get the server connection object
db.getMongo().setSlaveOk() allow this connection to read from the nonmaster member of a replica pair
db.getName()
db.getPrevError()
db.getProfilingLevel() - deprecated
db.getProfilingStatus() - returns if profiling is on and slow threshold
db.getReplicationInfo()
db.getSiblingDB(name) get the db at the same server as this one
db.isMaster() check replica primary status
db.killOp(opid) kills the current operation in the db
db.listCommands() lists all the db commands
db.printCollectionStats()
db.printReplicationInfo()
db.printSlaveReplicationInfo()
db.printShardingStatus()
db.removeUser(username)
db.repairDatabase()
db.resetError()
db.runCommand(cmdObj) run a database command. if cmdObj is a string, turns it into { cmdObj : 1 }
db.serverStatus()
db.setProfilingLevel(level,<slowms>) 0=off 1=slow 2=all
db.shutdownServer()
db.stats()
db.version() current version of the server
db.getMongo().setSlaveOk() allow queries on a replication slave server
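The basic commands for managing and operating on a database can be found in the list above. For more commands, together with the detailed usage of each one, use db.listCommands(), which is listed above.
Another basic area is operating on, managing, and monitoring the collections of a given database; the relevant help is available through db.mycoll.help():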
> db.mycoll.help()
DBCollection help
db.mycoll.find().help() - show DBCursor help
db.mycoll.count()
db.mycoll.dataSize()
db.mycoll.distinct( key ) - eg. db.mycoll.distinct( 'x' )
db.mycoll.drop() drop the collection
db.mycoll.dropIndex(name)
db.mycoll.dropIndexes()
db.mycoll.ensureIndex(keypattern[,options]) - options is an object with these possible fields: name, unique, dropDups
db.mycoll.reIndex()
db.mycoll.find([query],[fields]) - query is an optional query filter. fields is optional set of fields to return.
e.g. db.mycoll.find( {x:77} , {name:1, x:1} )
db.mycoll.find(...).count()
db.mycoll.find(...).limit(n)
db.mycoll.find(...).skip(n)
db.mycoll.find(...).sort(...)
db.mycoll.findOne([query])
db.mycoll.findAndModify( { update : ... , remove : bool [, query: {}, sort: {}, 'new': false] } )
db.mycoll.getDB() get DB object associated with collection
db.mycoll.getIndexes()
db.mycoll.group( { key : ..., initial: ..., reduce : ...[, cond: ...] } )
db.mycoll.mapReduce( mapFunction , reduceFunction , <optional params> )
db.mycoll.remove(query)
db.mycoll.renameCollection( newName , <dropTarget> ) renames the collection.
db.mycoll.runCommand( name , <options> ) runs a db command with the given name where the first param is the collection name
db.mycoll.save(obj)
db.mycoll.stats()
db.mycoll.storageSize() - includes free space allocated to this collection
db.mycoll.totalIndexSize() - size in bytes of all the indexes
db.mycoll.totalSize() - storage allocated for all data and indexes
db.mycoll.update(query, object[, upsert_bool, multi_bool])
db.mycoll.validate() - SLOW
db.mycoll.getShardVersion() - only for use with sharding
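The commands for managing databases and collections, such as collection queries and index operations, are the most basic and the most frequently used.
Basic commands and examples
The following real examples demonstrate some common commands:
(1) Basic commands
1. show dbs
Lists the databases on the server (sample output appears in the next section). The status document below is the output of db.serverStatus():
> db.serverStatus()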
{
"host" : "dev2",
"version" : "1.8.3",
"process" : "mongod",
"uptime" : 845446,
"uptimeEstimate" : 839192,
"localTime" : ISODate("2011-12-27T04:03:12.512Z"),
"globalLock" : {
"totalTime" : 845445636925,
"lockTime" : 13630973982,
"ratio" : 0.016122827283818857,
"currentQueue" : {
"total" : 0,
"readers" : 0,
"writers" : 0
},
"activeClients" : {
"total" : 0,
"readers" : 0,
"writers" : 0
}
},
"mem" : {
"bits" : 64,
"resident" : 12208,
"virtual" : 466785,
"supported" : true,
"mapped" : 466139
},
"connections" : {
"current" : 27,
"available" : 792
},
"extra_info" : {
"note" : "fields vary by platform",
"heap_usage_bytes" : 70895216,
"page_faults" : 17213898
},
"indexCounters" : {
"btree" : {
"accesses" : 4466653,
"hits" : 4465526,
"misses" : 1127,
"resets" : 0,
"missRatio" : 0.00025231420484197006
}
},
"backgroundFlushing" : {
"flushes" : 14090,
"total_ms" : 15204393,
"average_ms" : 1079.0910574875797,
"last_ms" : 669,
"last_finished" : ISODate("2011-12-27T04:02:28.713Z")
},
"cursors" : {
"totalOpen" : 3,
"clientCursors_size" : 3,
"timedOut" : 53
},
"network" : {
"bytesIn" : 63460818650,
"bytesOut" : 763926196104,
"numRequests" : 67055921
},
"opcounters" : {
"insert" : 7947057,
"query" : 35720451,
"update" : 16263239,
"delete" : 154,
"getmore" : 91707,
"command" : 68520
},
"asserts" : {
"regular" : 0,
"warning" : 1,
"msg" : 0,
"user" : 7063866,
"rollovers" : 0
},
"writeBacksQueued" : false,
"ok" : 1
}
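The derived fields in this status document can be recomputed from the raw counters next to them, which is a handy sanity check when reading the output. The sketch below does that in plain JavaScript. The object literal copies only the relevant numbers from the output above, and the helper name ratio is ours, not part of MongoDB.

```javascript
// Raw counters copied from the db.serverStatus() output above.
const status = {
  globalLock: { totalTime: 845445636925, lockTime: 13630973982 },
  indexCounters: { btree: { accesses: 4466653, misses: 1127 } },
  backgroundFlushing: { flushes: 14090, total_ms: 15204393 }
};

// Hypothetical helper: plain division, guarding against a zero denominator.
function ratio(part, whole) {
  return whole === 0 ? 0 : part / whole;
}

// Share of the measured time spent holding the global lock.
const lockRatio = ratio(status.globalLock.lockTime, status.globalLock.totalTime);
// Share of index accesses that missed.
const missRatio = ratio(status.indexCounters.btree.misses, status.indexCounters.btree.accesses);
// Average duration of one background flush, in milliseconds.
const averageFlushMs = ratio(status.backgroundFlushing.total_ms, status.backgroundFlushing.flushes);

console.log(lockRatio.toFixed(4));      // 0.0161
console.log(missRatio.toFixed(6));      // 0.000252
console.log(averageFlushMs.toFixed(2)); // 1079.09
```

The three values agree with the ratio, missRatio, and average_ms fields reported by the server.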
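Checking the server status can sometimes reveal whether the database has a problem; if it does, for example data corruption, a repair can be run promptly.
> db.stats()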
{
"db" : "fragment",
"collections" : 12,
"objects" : 384553,
"avgObjSize" : 3028.40198360174,
"dataSize" : 1164581068,
"storageSize" : 1328351744,
"numExtents" : 109,
"indexes" : 10,
"indexSize" : 16072704,
"fileSize" : 4226809856,
"ok" : 1
}
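As a quick cross-check of these statistics, avgObjSize is simply dataSize divided by objects. The sketch below recomputes it in plain JavaScript from the numbers shown above. The fillFactor variable is our own illustrative measure, not a field MongoDB reports.

```javascript
// Numbers copied from the db.stats() output above.
const stats = { objects: 384553, dataSize: 1164581068, storageSize: 1328351744 };

// Average record size in bytes: total data bytes divided by record count.
const avgObjSize = stats.dataSize / stats.objects;

// Illustrative only: fraction of the allocated storage that currently holds data.
const fillFactor = stats.dataSize / stats.storageSize;

console.log(avgObjSize.toFixed(2)); // 3028.40
console.log(fillFactor.toFixed(2));
```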
The db.stats() output above shows the statistics of the fragment database.
> db.getCollectionNames()
[
"17u",
"baseSe",
"bytravel",
"daodao",
"go2eu",
"lotour",
"lvping",
"mafengwo",
"sina",
"sohu",
"system.indexes"
]
(2) Basic DDL and DML
> show dbs
admin 0.03125GB
local (empty)
pagedb 0.03125GB
test 0.03125GB
> use LuceneIndexDB
switched to db LuceneIndexDB
> show dbs
admin 0.03125GB
local (empty)
pagedb 0.03125GB
test 0.03125GB
> db
LuceneIndexDB
> db.storeCollection.save({'version':'3.5', 'segment':'e3ol6'})
> show dbs
LuceneIndexDB 0.03125GB
admin 0.03125GB
local (empty)
pagedb 0.03125GB
test 0.03125GB
>
> db.createCollection('replicationColletion', {'capped':true, 'size':10240, 'max':17855200})
{ "ok" : 1 }
> show collections
replicationColletion
storeCollection
system.indexes
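The createCollection call above creates a capped collection: a fixed-size collection that keeps records in insertion order and, once its limit is reached, makes room for new records by discarding the oldest ones. The sketch below is only an in-memory model of that behavior in plain JavaScript, limited by record count (the max option). The class name CappedModel is hypothetical, and the model ignores the byte-size limit entirely.

```javascript
// In-memory model of a capped collection limited only by record count.
// This illustrates the eviction rule; it is not MongoDB's storage code.
class CappedModel {
  constructor(max) {
    this.max = max;   // maximum number of records kept
    this.docs = [];   // records in insertion order, oldest first
  }

  insert(doc) {
    this.docs.push(doc);
    // Once the limit is exceeded, drop the oldest record.
    if (this.docs.length > this.max) this.docs.shift();
  }

  find() {
    return this.docs.slice(); // copy, so callers cannot mutate the model
  }
}

const log = new CappedModel(3);
for (let i = 1; i <= 5; i++) log.insert({ seq: i });
console.log(log.find().map(d => d.seq)); // [ 3, 4, 5 ]
```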
4. Dropping a collection
To drop a collection, run db.mycoll.drop().
5. Inserting and updating records
Use the collection's save method directly, as follows:
> db.storeCollection.save({'version':'3.5', 'segment':'e3ol6'})
To update a record, call save() again: it overwrites the stored record that has the same _id with the new value.
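The overwrite behavior of save() can be pictured as "insert if there is no _id, otherwise replace the whole record with that _id". The sketch below models this with a plain JavaScript Map. The names store, save, and nextId are hypothetical, and the integer counter merely stands in for ObjectId generation.

```javascript
// In-memory model of save(): insert when _id is missing, replace when present.
let nextId = 1;          // stand-in for ObjectId generation (assumption)
const store = new Map(); // _id -> record

function save(doc) {
  if (doc._id === undefined) doc._id = nextId++; // new record gets an _id
  store.set(doc._id, doc);                       // whole-record overwrite
  return doc;
}

const first = save({ version: '3.5', segment: 'e3ol6' });
save({ _id: first._id, version: '3.6' }); // same _id: replaces the record

console.log(store.size);                  // 1
console.log(store.get(first._id).version); // 3.6
```

Because the whole record is replaced, the segment field of the first version is gone after the second save.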
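6. Querying a single record
Use the findOne() function. Its optional argument is the query condition. If at least one record matches, it returns a single matching record (the first one the server encounters, not a random pick); otherwise it returns null. For example: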
> db.storeCollection.findOne({'version':'3.5'})
{
"_id" : ObjectId("4ef970f23c1fc4613425accc"),
"version" : "3.5",
"segment" : "e3ol6"
}
7. Querying multiple records
Use the find() function. Its argument specifies the query condition; without a condition it returns all records.
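The difference between find() and findOne() for simple equality conditions can be sketched over a plain array. This is a minimal model, assuming only top-level equality matching; the function names mirror the shell methods but are our own, and the second and third sample records are made up for illustration.

```javascript
// True when every field named in the query equals the record's field.
function matches(doc, query) {
  return Object.keys(query).every(key => doc[key] === query[key]);
}

// Model of find(): all matching records; an empty query matches everything.
function find(docs, query = {}) {
  return docs.filter(doc => matches(doc, query));
}

// Model of findOne(): the first matching record, or null when none matches.
function findOne(docs, query = {}) {
  const hit = docs.find(doc => matches(doc, query));
  return hit === undefined ? null : hit;
}

const docs = [
  { version: '3.5', segment: 'e3ol6' },
  { version: '3.5', segment: 'k9x2' },  // hypothetical record
  { version: '3.6', segment: 'p0q1' }   // hypothetical record
];

console.log(find(docs, { version: '3.5' }).length);     // 2
console.log(findOne(docs, { version: '3.5' }).segment); // e3ol6
console.log(findOne(docs, { version: '9.9' }));         // null
```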
8. Deleting records
Use the collection's remove() method, whose argument is the query condition. For example:
> db.storeCollection.remove({'version':'3.5'})
> db.storeCollection.findOne()
null
9. Creating an index
Use the collection's ensureIndex(keypattern[,options]) method, for example:
> use pagedb
switched to db pagedb
> db.page.ensureIndex({'title':1, 'url':-1})
> db.system.indexes.find()
{ "name" : "_id_", "ns" : "pagedb.page", "key" : { "_id" : 1 }, "v" : 0 }
{ "name" : "_id_", "ns" : "pagedb.system.users", "key" : { "_id" : 1 }, "v" : 0}
{ "_id" : ObjectId("4ef977633c1fc4613425accd"), "ns" : "pagedb.page", "key" : {"title" : 1, "url" : -1 }, "name" : "title_1_url_-1", "v" : 0 }
In the ensureIndex argument above, the number 1 means ascending order and -1 means descending order.
db.system.indexes.find() lists all indexes in the current database.
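Two details of the key pattern can be made concrete in plain JavaScript: how the name title_1_url_-1 seen in the output follows from the pattern, and what ordering 1 and -1 describe. Both functions below are our own sketches, not MongoDB code, and the sample pages are made up. The _id index is named specially (_id_) and is not covered by the naming sketch.

```javascript
// Build an index name by joining each field with its direction,
// e.g. { title: 1, url: -1 } -> "title_1_url_-1".
function defaultIndexName(keyPattern) {
  return Object.keys(keyPattern)
    .map(field => field + '_' + keyPattern[field])
    .join('_');
}

// Comparator for the order a key pattern describes:
// 1 sorts a field ascending, -1 sorts it descending.
function compareByKeyPattern(keyPattern) {
  return (a, b) => {
    for (const field of Object.keys(keyPattern)) {
      if (a[field] < b[field]) return -keyPattern[field];
      if (a[field] > b[field]) return keyPattern[field];
    }
    return 0; // equal on every indexed field
  };
}

const pattern = { title: 1, url: -1 };
const pages = [ // hypothetical sample data
  { title: 'b', url: '/1' },
  { title: 'a', url: '/1' },
  { title: 'a', url: '/2' }
];
const ordered = pages.slice().sort(compareByKeyPattern(pattern));

console.log(defaultIndexName(pattern));            // title_1_url_-1
console.log(ordered.map(p => p.title + p.url));    // [ 'a/2', 'a/1', 'b/1' ]
```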
10. Listing indexes
The indexes built on a collection can be listed with the collection's getIndexes() method, for example:
> db.page.getIndexes()
[
{
"name" : "_id_",
"ns" : "pagedb.page",
"key" : {
"_id" : 1
},
"v" : 0
},
{
"_id" : ObjectId("4ef977633c1fc4613425accd"),
"ns" : "pagedb.page",
"key" : {
"title" : 1,
"url" : -1
},
"name" : "title_1_url_-1",
"v" : 0
}
]
Of course, to list all indexes in the current database, you can use db.system.indexes.find().
11. Dropping indexes
Two methods are provided for dropping indexes:
db.mycoll.dropIndex(name)
db.mycoll.dropIndexes()
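The first drops the index with the given name; the second drops all indexes of the given collection.
12. Rebuilding indexes
Indexes can be rebuilt with the collection's reIndex() method, for example: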
> db.page.reIndex()
{
"nIndexesWas" : 2,
"msg" : "indexes dropped for collection",
"ok" : 1,
"nIndexes" : 2,
"indexes" : [
{
"name" : "_id_",
"ns" : "pagedb.page",
"key" : {
"_id" : 1
},
"v" : 0
},
{
"_id" : ObjectId("4ef977633c1fc4613425accd"),
"ns" : "pagedb.page",
"key" : {
"title" : 1,
"url" : -1
},
"name" : "title_1_url_-1",
"v" : 0
}
],
"ok" : 1
}
13. Counting the records in a collection
use fragment
db.baseSe.count()
The result is as follows:
> use fragment
switched to db fragment
> db.baseSe.count()
36749
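The above counts the records in the baseSe collection of the fragment database.
14. Querying and counting the results
use fragment
db.baseSe.find().count()
find() accepts a query condition; the query runs first and count() then counts its results, as follows: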
> use fragment
switched to db fragment
> db.baseSe.find().count()
36749
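The command above first runs the query and then counts the records in the result set for the baseSe collection of the fragment database.
15. Storage allocated to a collection, including its free space
> use fragment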
> db.baseSe.storageSize()
142564096
16. Total storage allocated to a collection
> db.baseSe.totalSize()
144096000
The result above includes the storage allocated for the collection, that is, for both its data and its indexes.
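Since totalSize() covers data plus indexes while storageSize() covers the collection's data storage only, subtracting the two gives a rough figure for the index storage. The sketch below does the arithmetic in plain JavaScript, assuming both numbers above were taken at the same moment; the variable names are ours.

```javascript
// Values copied from the two shell calls above (bytes).
const storageSize = 142564096; // db.baseSe.storageSize()
const totalSize = 144096000;   // db.baseSe.totalSize()

// Storage attributed to indexes, under the assumption stated above.
const indexBytes = totalSize - storageSize;

console.log(indexBytes);                             // 1531904
console.log((indexBytes / 1024 / 1024).toFixed(2));  // size in MB
```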
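(3) Startup and shutdown
(4) Security management
1. Starting in authenticated mode
Start mongod with the --auth option. Alternatively, edit /etc/mongodb.conf, set auth=true, and restart the mongod process.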
{
"user" : "admin",
"readOnly" : false,
"pwd" : "995d2143e0bf79cba24b58b3e41852cd"
}
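The pwd field in this user record is not the password itself but a digest. In MongoDB releases of this era the shell computed it as the MD5 of the user name, the literal string ":mongo:", and the password. The sketch below reproduces that scheme with Node's built-in crypto module; the function name and the sample password are hypothetical, so the digest it prints is not the one shown above.

```javascript
const crypto = require('crypto');

// Digest as stored in system.users by the legacy scheme:
// md5(username + ":mongo:" + password), hex encoded.
function legacyPasswordDigest(user, password) {
  return crypto
    .createHash('md5')
    .update(user + ':mongo:' + password, 'utf8')
    .digest('hex');
}

// Hypothetical credentials, for illustration only.
const digest = legacyPasswordDigest('admin', 'example-password');
console.log(digest.length); // 32
```

Because the user name is part of the input, two users with the same password still get different pwd values.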
3. Authentication
{
"user" : "admin",
"readOnly" : false,
"pwd" : "995d2143e0bf79cba24b58b3e41852cd"
}
db.system.users.find()
{ "_id" : ObjectId("4ef940a13c1fc4613425acc8"), "user" : "admin", "readOnly" : false, "pwd" : "995d2143e0bf79cba24b58b3e41852cd" }
Otherwise, if authentication fails, running the related commands reports an error:
db.system.users.find()
error: {
"$err" : "unauthorized db:admin lock type:-1 client:127.0.0.1", "code" : 10057
}
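4. Locking the database against writes (flushing to disk)
Note: once the database has been locked with the fsync command's lock option, the server responds: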
{
"info" : "now locked against writes, use db.$cmd.sys.unlock.findOne() to unlock",
"ok" : 1
}
{
"inprog" : [ ],
"fsyncLock" : 1,
"info" : "use db.$cmd.sys.unlock.findOne() to terminate the fsync write/snapshot lock"
}
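In the db.currentOp() output, an fsyncLock value of 1 means that the fsync lock is held: MongoDB has flushed pending writes to disk and does not allow other operations to write data. To release the lock:
db.$cmd.sys.unlock.findOne()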
{ "ok" : 1, "info" : "unlock requested" }
You can run the following command to check the lock state:
db.currentOp()
The status information is as follows:
{ "inprog" : [ ] }
This shows that no lock is currently held and that writes can proceed.
(5) Data backup, restore, and migration management
cd testbak
mongodump
2. Backing up a specified database
mongodump -d pagedb
mongorestore --drop
mongorestore -d pagedb --drop
Note: this restores the backed-up data of pagedb into the database.
mongorestore -d pagedb -c page --drop
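Note: this restores the backed-up data of the page collection of pagedb into the database.
--type supports three types: csv, tsv, and json.
For the usage of the other options, see the help: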
mongoimport --help
options:
--help produce help message
-v [ --verbose ] be more verbose (include multiple times for more
verbosity e.g. -vvvvv)
-h [ --host ] arg mongo host to connect to ( <set name>/s1,s2 for sets)
--port arg server port. Can also use --host hostname:port
--ipv6 enable IPv6 support (disabled by default)
-u [ --username ] arg username
-p [ --password ] arg password
--dbpath arg directly access mongod database files in the given
path, instead of connecting to a mongod server -
needs to lock the data directory, so cannot be used
if a mongod is currently accessing the same path
--directoryperdb if dbpath specified, each db is in a separate
directory
-d [ --db ] arg database to use
-c [ --collection ] arg collection to use (some commands)
-f [ --fields ] arg comma separated list of field names e.g. -f name,age
--fieldFile arg file with fields names - 1 per line
--ignoreBlanks if given, empty fields in csv and tsv will be ignored
--type arg type of file to import. default: json (json,csv,tsv)
--file arg file to import from; if not specified stdin is used
--drop drop collection first
--headerline CSV,TSV only - use first line as headers
--upsert insert or update objects that already exist
--upsertFields arg comma-separated fields for the query part of the
upsert. You should make sure this is indexed
--stopOnError stop importing at first error rather than continuing
--jsonArray load a json array, not one item per line. Currently
limited to 4MB.
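8. Exporting data from MongoDB
Note: this exports the data of the page collection in the pagedb database to the file pages.csv. The options have the following meanings:
-f specifies the CSV column names: _id,title,url,spiderName,pubDate
-q specifies the query condition
For the usage of the other options, see the help: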
mongoexport --help
options:
--help produce help message
-v [ --verbose ] be more verbose (include multiple times for more verbosity e.g. -vvvvv)
-h [ --host ] arg mongo host to connect to ( <set name>/s1,s2 for sets)
--port arg server port. Can also use --host hostname:port
--ipv6 enable IPv6 support (disabled by default)
-u [ --username ] arg username
-p [ --password ] arg password
--dbpath arg directly access mongod database files in the given
path, instead of connecting to a mongod server -
needs to lock the data directory, so cannot be used
if a mongod is currently accessing the same path
--directoryperdb if dbpath specified, each db is in a separate directory
-d [ --db ] arg database to use
-c [ --collection ] arg collection to use (some commands)
-f [ --fields ] arg comma separated list of field names e.g. -f name,age
--fieldFile arg file with fields names - 1 per line
-q [ --query ] arg query filter, as a JSON string
--csv export to csv instead of json
-o [ --out ] arg output file; if not specified, stdout is used
--jsonArray output to a json array rather than one object per line
mongoexport -d page -c Article -q '{"spiderName": "mafengwoSpider"}' -f _id,title,content,images,publishDate,spiderName,url --jsonArray > mafengwoArticle.txt
Otherwise, the following error occurs:
ERROR: too many positional options
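A command-line parser reports an error like this when it receives more separate words than it expects. One plausible cause, and this is an assumption since the failing command is not shown, is a -q value that the shell splits into several words because it is not quoted. The sketch below assembles the arguments as an array in plain JavaScript, so the query always stays a single element. The function name mongoexportArgs is hypothetical; the flags are the ones listed in the help output above.

```javascript
// Build an argument list for mongoexport in which the query is one element.
function mongoexportArgs({ db, collection, query, fields, jsonArray }) {
  const args = ['-d', db, '-c', collection];
  if (query) args.push('-q', JSON.stringify(query)); // JSON string, one argument
  if (fields && fields.length) args.push('-f', fields.join(','));
  if (jsonArray) args.push('--jsonArray');
  return args;
}

const args = mongoexportArgs({
  db: 'page',
  collection: 'Article',
  query: { spiderName: 'mafengwoSpider' },
  fields: ['_id', 'title', 'content', 'images', 'publishDate', 'spiderName', 'url'],
  jsonArray: true
});

console.log(args[5]);      // {"spiderName":"mafengwoSpider"}
console.log(args.length);  // 9
```

An array like this can be handed to a process launcher that does not go through a shell, for example Node's child_process.spawn('mongoexport', args), which avoids word splitting altogether.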
(6) Remote connection management
mongo -u admin -p admin 192.168.0.197:27017/pagedb
Connecting through mongo lets you choose among its options very flexibly; see the command help, shown below:
mongo --help
MongoDB shell version: 1.8.3
usage: mongo [options] [db address] [file names (ending in .js)]
db address can be:
foo foo database on local machine
192.169.0.5/foo foo database on 192.168.0.5 machine
192.169.0.5:9999/foo foo database on 192.168.0.5 machine on port 9999
options:
--shell run the shell after executing files
--nodb don't connect to mongod on startup - no 'db address'
arg expected
--quiet be less chatty
--port arg port to connect to
--host arg server to connect to
--eval arg evaluate javascript
-u [ --username ] arg username for authentication
-p [ --password ] arg password for authentication
-h [ --help ] show this usage information
--version show version information
--verbose increase verbosity
--ipv6 enable IPv6 support (disabled by default)
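The three "db address" forms in this help text (foo, host/foo, host:port/foo) can be told apart with a few string operations. The sketch below is our own illustration in plain JavaScript, not the shell's parser. It uses 27017, MongoDB's default port, and assumes 127.0.0.1 as the stand-in for "local machine"; IPv6 addresses are not handled.

```javascript
// Split a "db address" into host, port, and database name.
function parseDbAddress(address) {
  const slash = address.indexOf('/');
  if (slash === -1) {
    // Bare name: database on the local machine (host value is an assumption).
    return { host: '127.0.0.1', port: 27017, db: address };
  }
  const hostPart = address.slice(0, slash);
  const db = address.slice(slash + 1);
  const colon = hostPart.lastIndexOf(':');
  if (colon === -1) {
    return { host: hostPart, port: 27017, db }; // default port
  }
  return {
    host: hostPart.slice(0, colon),
    port: Number(hostPart.slice(colon + 1)),
    db
  };
}

console.log(parseDbAddress('192.168.0.197:27017/pagedb'));
console.log(parseDbAddress('192.168.0.5/foo'));
console.log(parseDbAddress('foo'));
```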
> var x = new Mongo('192.168.0.197:27017')
> var ydb = x.getDB('pagedb');
> use ydb
switched to db ydb
> db
ydb
> ydb.page.findOne()
{
"_id" : ObjectId("4eded6a5bf3bfa0014000003"),
"content" : "巴黎是浪漫的城市,可是...",
"pubdate" : "2006-03-19",
"title" : "巴黎:從布魯塞爾趕到巴黎",
"url" : "http://france.bytravel.cn/Scenery/528/cblsegdbl.html"
}
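The above uses the JavaScript scripting that MongoDB provides to connect to another, remote database server and operate on the page collection of the specified database pagedb. Note that use ydb only switches the shell's current db to a database literally named ydb; the remote pagedb is reached through the ydb variable.
> var x = new Mongo('192.168.0.197:27017')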
> var ydb = x.getDB('pagedb', 'shirdrn', '(jkfFS$343$_\=\,.F@3');
> use ydb
switched to db ydb