Memcached - Base

Memcached

Tags: Java and NoSQL


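When implementing a program, we often overlook its running time. Even with similar implementations, running speed can sometimes differ greatly. In most cases, that difference in speed comes from a difference in data access speed.

– Yukihiro Matsumoto, The Future of Code (代碼的未來)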

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.


Building

  • Environment
    Building Memcached requires tools such as gcc, make, cmake, autoconf, and libtool:
yum install gcc make cmake autoconf libtool

Memcached's event loop is built on the libevent library:

wget https://github.com/libevent/libevent/releases/download/release-2.0.22-stable/libevent-2.0.22-stable.tar.gz
tar -zxf libevent-2.0.22-stable.tar.gz && cd libevent-2.0.22-stable
./configure --prefix=/usr/local/libevent
make && make install
cd ..
  • Memcached
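wget http://memcached.org/files/memcached-1.4.25.tar.gz
tar -zxf memcached-1.4.25.tar.gz && cd memcached-1.4.25
./configure --prefix=/usr/local/memcached --with-libevent=/usr/local/libevent
make && make install
/usr/local/memcached/bin/memcached -vv -d               # start (add -u <username> when running as root)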
  • Manual
-p <num>      TCP port number to listen on (default: 11211)
-U <num>      UDP port number to listen on (default: 11211, 0 is off)
-s <file>     UNIX socket path to listen on (disables network support)
-A            enable ascii "shutdown" command
-a <mask>     access mask for UNIX socket, in octal (default: 0700)
-l <addr>     interface to listen on (default: INADDR_ANY, all addresses)
              <addr> may be specified as host:port. If you don't specify
              a port number, the value you specified with -p or -U is
              used. You may specify multiple addresses separated by comma
              or by using -l multiple times
-d            run as a daemon
-u <username> assume identity of <username> (only when run as root)
-m <num>      max memory to use for items in megabytes (default: 64 MB) ## sets the maximum memory
-c <num>      max simultaneous connections (default: 1024)              ## sets the maximum number of connections
-v            verbose (print errors/warnings while in event loop)
-vv           very verbose (also print client commands/responses)
-f <factor>   chunk size growth factor (default: 1.25)                  ## sets the growth factor
-n <bytes>    minimum space allocated for key+value+flags (default: 48)
-t <num>      number of threads to use (default: 4)
-R            Maximum number of requests per event, limits the number of
              requests process for a given connection to prevent 
              starvation (default: 20)
-b <num>      Set the backlog queue limit (default: 1024)
-I            Override the size of each slab page. Adjusts max item size
              (default: 1mb, min: 1k, max: 128m)
-F            Disable flush_all command
  • Client
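    Memcached can be accessed from C, C++, Java, PHP, Python, Ruby, Perl, Erlang, Lua, and other languages. In addition, because its communication protocol is plain text, it is also easy to access with a tool such as Telnet: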
telnet host port

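Commands

Memcached's commands are simple and easy to understand, so only a brief introduction is given here; for details, see the Commands section of the official Wiki.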


Storage Commands

<command name> <key> <flag> <expire> <bytes>  
<data block>

Memcached provides the following storage commands:

Command Description
set Store this data, possibly overwriting any existing data. New items are at the top of the LRU.
add Store this data, only if it does not already exist. New items are at the top of the LRU. If an item already exists and an add fails, it promotes the item to the front of the LRU anyway.
replace Store this data, but only if the data already exists. Almost never used, and exists for protocol completeness (set, add, replace, etc)
append Add this data after the last byte in an existing item. This does not allow you to extend past the item limit. Useful for managing lists.
prepend Same as append, but adding new data before existing data.
cas Check And Set (or Compare And Swap). An operation that stores data, but only if no one else has updated the data since you read it last. Useful for resolving race conditions on updating cache data.

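The Storage Commands all take parameters of the same form and meaning:

Parameter Description
key Key: like a key in a Map; it must be unique
flag Client-defined flags (an unsigned integer)
expire Expiration time, in seconds
bytes Length of the cached data, in bytes
Note: the cas command has a slightly different format
cas <key> <flag> <expire> <bytes> <value> 
<data block>

The cas command takes effect only when value matches the identifier number returned by gets (see below); otherwise the server returns EXISTS.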
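
To make the request layout concrete, here is a minimal Python sketch that assembles the bytes of one storage request. The function name `build_storage_command` and its parameters are hypothetical helpers for illustration, not part of any Memcached client library; the sketch assumes the text protocol terminates both the header line and the data block with `\r\n`.

```python
def build_storage_command(command, key, flags, expire, data, cas_unique=None):
    """Return the bytes for one text-protocol storage request (illustrative helper)."""
    if command not in {"set", "add", "replace", "append", "prepend", "cas"}:
        raise ValueError(f"not a storage command: {command}")
    if (command == "cas") != (cas_unique is not None):
        raise ValueError("cas_unique is required for cas and only for cas")
    # <bytes> is the length of the data block, excluding the trailing \r\n.
    header = f"{command} {key} {flags} {expire} {len(data)}"
    if cas_unique is not None:
        header += f" {cas_unique}"  # the extra identifier taken from gets
    # Header line and data block are each terminated by \r\n.
    return header.encode("ascii") + b"\r\n" + data + b"\r\n"


print(build_storage_command("set", "greeting", 0, 60, b"hello"))
print(build_storage_command("cas", "greeting", 0, 60, b"world", cas_unique=7))
```

The same bytes could be typed by hand in a Telnet session; the helper only shows where each parameter lands on the wire.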


Retrieval Commands

get/gets <key>
Command Description
get Command for retrieving data. Takes one or more keys and returns all found items.
gets An alternative get command for using with CAS. Returns a CAS identifier (a unique 64bit number) with the item. Return this value with the cas command. If the item’s CAS value has changed since you gets’ed it, it will not be stored.
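
As a rough illustration of what a get or gets reply looks like, the sketch below parses a complete reply held in memory. It assumes the text-protocol layout `VALUE <key> <flags> <bytes> [<cas unique>]` followed by the data block and a final `END`; the function name `parse_retrieval_response` is a hypothetical helper, and a real client would also have to handle partial reads from the socket.

```python
def parse_retrieval_response(raw):
    """Parse a complete get/gets reply into {key: (flags, data, cas_unique)}."""
    items = {}
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        line = raw[pos:line_end]
        pos = line_end + 2
        if line == b"END":
            return items
        fields = line.split()
        if fields[0] != b"VALUE":
            raise ValueError(f"unexpected line: {line!r}")
        key, flags, size = fields[1].decode("ascii"), int(fields[2]), int(fields[3])
        # gets adds a fifth field, the CAS identifier; get omits it.
        cas_unique = int(fields[4]) if len(fields) > 4 else None
        # Read the data by length, since it may itself contain \r\n.
        items[key] = (flags, raw[pos:pos + size], cas_unique)
        pos += size + 2  # skip the data block and its trailing \r\n


reply = b"VALUE a 0 5 42\r\nhello\r\nVALUE b 1 2 43\r\nhi\r\nEND\r\n"
print(parse_retrieval_response(reply))
```

The CAS identifier recovered here is the number a client would send back as the last header field of a cas request.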

Delete

delete key [time]
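Removes an item from the cache, if it exists. The optional time argument comes from older releases: for time seconds after the delete, add and replace on that key are refused. Memcached 1.4 and later no longer support a nonzero value.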

Incr/Decr

incr/decr <key> <number>
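Increment and Decrement. If an item stored is the string representation of a 64bit integer, you may run incr or decr commands to modify that number. You may only incr by positive values, or decr by positive values. They do not accept negative values.
  • Use case: flash sales
    Originally, every step of placing a flash-sale order went through the database: reading stock, writing the order, updating stock, collecting payment, and so on. Responses were slow and the load on the database was enormous. Stock-related operations can instead be moved into Memcached: store a count (the stock level) in Memcached, decr it for each flash-sale request, and run the remaining order steps only after the decr succeeds. Because the work happens mainly in memory, it is very fast.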
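
The flash-sale idea can be sketched without a running server. In the Python sketch below, `FakeCounterStore` is a hypothetical in-memory stand-in, not a real Memcached client; it only mimics the fact that decr never takes a value below 0. Because of that clamping, a result of 0 cannot by itself tell "claimed the last unit" from "already sold out", so this sketch assumes the counter is seeded with stock + 1 and treats any result of 1 or more as a successful claim. That seeding is one possible convention, chosen here for illustration.

```python
import threading


class FakeCounterStore:
    """In-memory stand-in for a Memcached counter (not a real client)."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()  # stands in for server-side atomicity

    def set(self, key, value):
        with self._lock:
            self._values[key] = int(value)

    def decr(self, key, delta=1):
        # Like the real decr, the result never drops below 0.
        with self._lock:
            self._values[key] = max(0, self._values[key] - delta)
            return self._values[key]


def try_claim_unit(store, key="stock"):
    """Return True if this buyer claimed one unit of stock."""
    # The counter was seeded with stock + 1, so a result >= 1 means a
    # unit was still available; 0 means the sale is sold out.
    return store.decr(key) >= 1


store = FakeCounterStore()
stock = 3
store.set("stock", stock + 1)
results = [try_claim_unit(store) for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

Only buyers for whom `try_claim_unit` returns True would go on to the database-backed order steps, which keeps most of the flash-sale traffic away from the database.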

Statistics

Command Description
stats ye ‘ole basic stats command. For example, the cache hit rate can be computed as (get_hits / cmd_get) * 100%
stats items Returns some information, broken down by slab, about items stored in memcached.
stats slabs Returns more information, broken down by slab, about items stored in memcached. More centered to performance of a slab rather than counts of particular items.
stats sizes A special command that shows you how items would be distributed if slabs were broken into 32byte buckets instead of your current number of slabs. Useful for determining how efficient your slab sizing is.
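
The hit-rate formula above can be applied to a stats reply as follows. This is a small sketch that assumes the text-protocol reply consists of `STAT <name> <value>` lines ending with `END`; the sample numbers are made up, and `parse_stats` and `hit_rate_percent` are hypothetical helper names.

```python
def parse_stats(raw):
    """Turn a text-protocol stats reply into a {name: value} dict of strings."""
    stats = {}
    for line in raw.decode("ascii").split("\r\n"):
        if line == "END":
            break
        if line.startswith("STAT "):
            _, name, value = line.split(" ", 2)
            stats[name] = value
    return stats


def hit_rate_percent(stats):
    """Return get_hits / cmd_get as a percentage, or 0.0 before any get."""
    gets = int(stats["cmd_get"])
    return 100.0 * int(stats["get_hits"]) / gets if gets else 0.0


# Made-up sample reply, for illustration only.
sample = b"STAT cmd_get 200\r\nSTAT get_hits 150\r\nSTAT get_misses 50\r\nEND\r\n"
print(hit_rate_percent(parse_stats(sample)))  # 75.0
```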

Flush

flush_all
Invalidate all existing cache items. Optionally takes a parameter, which means to invalidate all items after N seconds have passed.
This command does not pause the server, as it returns immediately. It does not free up or flush memory at all, it just causes all items to expire.

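Limits

Item Limit
Key length 250 B in the text protocol; the binary protocol's 16-bit key-length field can encode up to 65535 B, though the server still enforces the 250 B limit
Value size 1 MB by default (adjustable with -I)
Total memory At most 2 GB on a 32-bit operating system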

Memcached Slab Allocator

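Memcached uses a Slab Allocator to manage memory and mitigate memory fragmentation:

At startup, Memcached first requests a large block of memory from the operating system, splits it into Chunks of various sizes, and groups Chunks of the same size into a Slab Class.
A Chunk is the smallest unit used to store key-value data. The ratio between the Chunk sizes of successive Slab Classes can be set with the -f option when Memcached starts (the default is 1.25, so if Chunks in the first class are 88 B, Chunks in the second are 112 B, that is 88 × 1.25 = 110 rounded up for 8-byte alignment, and so on).

When Memcached receives data from a client, it first picks the best-fitting Slab Class for the data's size, then caches the data in a Chunk taken from that class's list of free Chunks.

When an entry expires or is deleted, the Chunk it occupied can be reclaimed and put back on the free list.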


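Tuning the Growth Factor

As the process above shows, Memcached's memory management is very efficient and does not cause memory fragmentation, but its biggest drawback is that it wastes memory:

Because the length of each Chunk is “fixed”, variable-length data cannot make full use of the space:

For example, caching 100 bytes of data in a 128-byte Chunk wastes the remaining 28 bytes.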
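
The waste in that example is simple arithmetic, sketched here in Python. The chunk size list is hypothetical and only chosen to include the 128-byte Chunk from the example; `pick_chunk` is an illustrative helper, not Memcached's actual allocation code.

```python
from bisect import bisect_left


def pick_chunk(chunk_sizes, item_size):
    """Return (chunk_size, wasted_bytes) for the smallest chunk that fits."""
    index = bisect_left(chunk_sizes, item_size)
    if index == len(chunk_sizes):
        raise ValueError("item is larger than the largest chunk")
    chunk = chunk_sizes[index]
    return chunk, chunk - item_size


# Hypothetical chunk sizes, sorted ascending.
sizes = [64, 128, 256]
print(pick_chunk(sizes, 100))  # (128, 28)
```

The closer adjacent chunk sizes are to each other, the smaller the worst-case waste per item, at the cost of having more size classes to manage.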

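Chunk space waste cannot be eliminated, only reduced: for example, developers can estimate the length of the cached data in advance and choose suitable Chunk sizes. Unfortunately, Memcached does not yet support custom Chunk sizes, but the -f option can adjust the Growth Factor of Chunk sizes across Slab Classes: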

/usr/local/memcached/bin/memcached -u nobody -vvv -f 1.25

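Note: with f=1.25, Memcached's output shows that the size ratio between some adjacent Slab Classes is not exactly 1.25. These deviations are intentional, to keep sizes byte-aligned.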
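
The effect of alignment on the ratios can be reproduced with a simplified model. The sketch below assumes a first Chunk size of 88 bytes and 8-byte alignment; both are assumptions, since the real starting size depends on the build and on the -n option. It is a model of the size progression only, not Memcached's source code.

```python
def chunk_sizes(first=88, factor=1.25, count=6, align=8):
    """Return `count` chunk sizes grown by `factor` and rounded up to `align`."""
    sizes = []
    size = first
    for _ in range(count):
        if size % align:
            size += align - size % align  # round up to the alignment boundary
        sizes.append(size)
        size = int(size * factor)         # grow for the next class
    return sizes


sizes = chunk_sizes()
print(sizes)  # [88, 112, 144, 184, 232, 296]
# Ratios between adjacent classes hover around 1.25 without matching it exactly.
print([round(b / a, 3) for a, b in zip(sizes, sizes[1:])])
```

Rounding 110 up to 112 is what pushes the first ratio to about 1.273 instead of 1.25.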


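Data Expiration in Memcached

Memcached reuses the space of expired Items first. Even so, there can be too little space when a new entry is added, and in that case space is allocated using the Least Recently Used (LRU) mechanism:

Memcached keeps the Items of each Slab Class in a list ordered by how recently they were accessed. When an Item is requested it moves to the head of the list, and when space runs out the Item at the tail, the least recently used one, is evicted.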
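
For readers who want to see LRU eviction in action, here is a minimal Python sketch. It tracks recency with an ordered map, where each hit moves the key to the "most recent" end; the `LRUCache` class is a generic illustration of the policy under that assumption, not Memcached's internal data structure.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a hit makes the key most recent
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        elif len(self._items) >= self.capacity:
            self._items.popitem(last=False)  # evict the least recently used
        self._items[key] = value

    def keys(self):
        return list(self._items)


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")        # "a" is now more recent than "b"
cache.set("c", 3)     # evicts "b"
print(cache.keys())   # ['a', 'c']
```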


Lazy Expiration

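Memcached does not actively monitor whether entries have expired. Instead, on a get it checks the entry's timestamp to see whether it has expired. This technique is called Lazy Expiration; its benefit is that no CPU time is spent monitoring Items for expiry.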
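
Lazy expiration is easy to demonstrate with a toy cache. In the Python sketch below, `LazyExpiringCache` is a hypothetical class written for illustration; it takes an injectable clock so the example is deterministic, and it checks the expiry time only when a key is read.

```python
import time


class LazyExpiringCache:
    """Cache that checks expiry only when a key is read."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._items[key] = (value, expires_at)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]  # reclaimed only now, on access
            return None
        return value

    def stored_count(self):
        return len(self._items)


now = [0.0]                       # fake clock so the example is deterministic
cache = LazyExpiringCache(clock=lambda: now[0])
cache.set("session", "abc", ttl=10)
now[0] = 11.0
print(cache.stored_count())       # 1: expired, but nobody has looked yet
print(cache.get("session"))       # None: the read detects the expiry
print(cache.stored_count())       # 0
```

The entry that is expired but still counted is exactly the kind of entry a cache with lazy expiration cannot see until someone asks for it.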


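Permanent Data Gets Evicted

“Memcached data loss” (permanent data being evicted): a key was explicitly set to never expire, yet it mysteriously disappears.

This situation should be analyzed from the following angles:

1. If many Chunks in a Slab have expired but have not been requested with get since, Memcached does not know that they have expired.
2. The permanent data has not been requested with get for a long time, so it is inactive, while memory in its Slab is tight; when a new Item is added, the permanent data is likely to be evicted by the LRU mechanism.

Solution: store permanent and non-permanent data separately.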

