Fast paging with MongoDB

原創

2018-09-04 23:50

Paging through your data is one of the most common operations with MongoDB. A typical scenario is the need to display your results in chunks in your UI. If you are batch processing your data it is also important to get your paging strategy correct so that your data processing can scale.

Lets walk through an example to see the different ways of paging through data in MongoDB. In this example we have a CRM database of user data that we need to page through and display 10 users at a time. So in effect（有效; 實際上，事實上） our page size is 10. Here is the structure of our user document

{
    _id,
    name,
    company,
    state
}

Approach 1: Using skip() and limit()

MongoDB natively supports the paging operation using the skip() and limit() commands. The skip(n) directive tells mongodb that it should skip ‘n’ results and the limit(n) directive instructs mongodb that it should limit the result length to ‘n’ results. Typically you will be using the skip() and limit() directives with your cursor - but to illustrate the scenario we provide console commands that would achieve the same results. Also for brevity of code the limits checking code is also excluded.

//Page 1
db.users.find().limit (10)
//Page 2
db.users.find().skip(10).limit(10)
//Page 3
db.users.find().skip(20).limit(10)
........

You get the idea. In general to retrieve page n the code looks like this

db.users.find().skip(pagesize*(n-1)).limit(pagesize)

However as the size of your data increases this approach has serious performance problems. The reason is that every time the query is executed the full result set is built up, then the server has to walk from the beginning of the collection to the specified offset. As your offset increases this process gets slower and slower. Also this process does not make efficeint use of the indexes. So typically the ‘skip()’ and ‘limit()’ approach is useful when you have small data sets. If you are working with large data sets you need to consider other approaches.

Approach 2: Using `find()` and `limit()`

The reason the previous approach does not scale very well is the skip() command. So the goal in this section is to implement paging without using the ‘skip()’ command. For this we are going to leverage the natural order in the stored data like a time stamp or an id stored in the document. In this example we are going to use the ‘_id’ stored in each document. ‘_id’ is a mongodb ObjectID structure which is a 12 byte structure containing timestamp, machined, processid, counter etc. The overall idea is as follows
1. Retrieve the _id of the last document in the current page
2. Retrieve documents greater than this “_id” in the next page

//Page 1
db.users.find().limit(pageSize);
//Find the id of the last document in this page
last_id = ...

//Page 2
users = db.users.find({'_id'> last_id}). limit(10);
//Update the last id with the id of the last document in this page
last_id = ...

This approach leverages the inherent order that exists in the “_id” field. Also since the “_id” field is indexed by default the performance of the find operation is very good. If the field you are using is not indexed your performance will suffer – so it is important to make sure that field is indexed.

Also if you would like your data sorted in a particular order for your paging then you can also use the sort() clause with the above technique. It is important to ensure that the sort process is leveraging an index for best performance. You can use the .explain() suffix to your query to determine this.

users = db.users.find({'_id'> last_id}). sort(..).limit(10);
//Update the last id with the id of the last document in this page
last_id = ...

發表評論

所有評論

還沒有人評論，想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.

Fast paging with MongoDB

Approach 1: Using skip() and limit()

Approach 2: Using `find()` and `limit()`

PDManer [元數建模]-v4.9.0 發佈：一款簡單好用的數據庫建模平臺

使用neovim打造go ide(支持代碼跳轉, 代碼補全, 實時語法檢查)

sql求連續值問題

cs01 CSS Syntax

挑戰程序設計競賽 2.3章習題 poj 3046 Ant Counting

[MASM拾遺]Offset僞指令

h30 HTML Layout Elements

瞭解顯卡

一款基於C#開發的通訊調試工具（支持Modbus RTU、MQTT調試）

Linux/Golang/glibC系統調用

MongoDB hello world example

Java MongoDB : Update document

Java – Check if key exists in HashMap

Python Whois client example

Java – How to split a string

https://yachay.unat.edu.pe/blog/index.php?comment_area=format_blog&comment_component=blog&comment_co

linux以太網驅動總結

Fast paging with MongoDB

Approach 1: Using skip() and limit()

Approach 2: Using find() and limit()

Approach 2: Using `find()` and `limit()`