Redis: Deletion Policies and Eviction Policies
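Introduction to Redis
Redis is a high-performance in-memory key-value database written in C. It can be used as a database, a cache, a message broker, and more.
Features
- As an in-memory database it performs very well: data is stored in memory, so reads and writes are very fast, and it can serve on the order of 100,000 QPS (queries per second). Commands are executed by a single thread in a single process, so they run one at a time without data races, and network I/O uses I/O multiplexing.
- Rich data types: strings, hashes, lists, sets, sorted sets, and others. Data can be persisted: the in-memory data is saved to disk and loaded again on restart.
- Master-replica replication, Sentinel, and high availability. It can be used for distributed locks, and as message middleware with publish/subscribe support.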
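Deletion policies and eviction policies
What is expired data?
Redis is an in-memory database: all data is kept in memory, and the TTL command reports the lifetime state of a key.
TTL returns one of the following:
- XX (a non-negative number): data with a time limit. The time limit is set with the following commands: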
setex key seconds value
expire key seconds
expireat key timestamp
pexpire key milliseconds
pexpireat key milliseconds-timestamp
- -1: data that never expires
- -2: data that has expired, was never defined, or has been deleted
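The three TTL outcomes listed above can be sketched as a small C function. This is an illustrative model rather than Redis source: the toy_key struct and toy_ttl_seconds() are hypothetical names, an expired key is treated as already gone, and rounding the remaining time to the nearest second is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one key; not a Redis structure. */
typedef struct {
    int has_expire;       /* 1 if a lifetime was set (EXPIRE, SETEX, ...) */
    long long expire_ms;  /* absolute deadline in ms; used only if has_expire */
} toy_key;

/* Returns what a TTL-style query would report for `key` at time `now_ms`:
 *   -2   the key is missing (never defined, deleted, or already expired)
 *   -1   the key exists and has no expiry
 *   >=0  remaining lifetime in seconds (rounded to the nearest second) */
long long toy_ttl_seconds(const toy_key *key, long long now_ms) {
    assert(now_ms >= 0);                 /* the toy clock never goes negative */
    if (key == NULL) return -2;          /* missing key */
    if (!key->has_expire) return -1;     /* permanent key */
    long long left_ms = key->expire_ms - now_ms;
    if (left_ms <= 0) return -2;         /* expired counts as gone */
    return (left_ms + 500) / 1000;
}
```

A NULL pointer stands for a key that is not in the database; a caller would pass the result of its own lookup.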
Redis stores key-value data in the following structure:
/* Redis database representation. There are multiple databases identified
 * by integers from 0 (the default database) up to the max configured
 * database. The database number is the 'id' field in the structure. */
typedef struct redisDb {
    dict *dict;                   /* The keyspace for this DB: holds all key-value pairs */
    dict *expires;                /* Timeout of keys with a timeout set */
    dict *blocking_keys;          /* Keys with clients waiting for data (BLPOP)*/
    dict *ready_keys;             /* Blocked keys that received a PUSH */
    dict *watched_keys;           /* WATCHED keys for MULTI/EXEC CAS */
    int id;                       /* Database ID: identifies which database this is */
    long long avg_ttl;            /* Average TTL, just for stats */
    unsigned long expires_cursor; /* Cursor of the active expire cycle. */
    list *defrag_later;           /* List of key names to attempt to defrag one by one, gradually. */
} redisDb;
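The following fields of the redisDb structure matter here:
- dict *dict: the database keyspace, which holds all key-value pairs
- dict *expires: the expiry information of keys
- int id: the database number
As shown in Figure 1.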
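Is expired data really deleted?
Expired data: data that once had an expiry time set and has become invalid because that time has arrived.
When Redis needs to process a piece of data, it hands the work to the CPU, which handles it easily and quickly. But if many clients send a very large number of create, read, update, and delete commands at the same time, the CPU comes under heavy pressure, performance drops, and every operation queues up waiting for the CPU. Can this be optimized? Reads, inserts, and updates still have to be processed as they arrive, but removing expired data is not urgent. If memory is not tight and has not reached its threshold, expired data can stay in memory and be removed when there is spare time. In other words, after data expires it actually remains in memory until the moment it is deleted. How it gets deleted is decided by the deletion policies Redis provides.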
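Deletion policies provided by Redis
Three deletion policies are discussed here: 1. timed deletion | 2. lazy deletion | 3. periodic deletion (Redis itself relies on lazy and periodic deletion, as the source code in this article shows; timed deletion is the baseline the other two are compared against.)
The goal of a deletion policy is to find a balance between memory usage and CPU usage, so that excessive pressure on either side does not degrade overall performance or, worse, crash the server or exhaust memory.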
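Timed deletion
When an expiry time is set on a key, a timer event is created. When the key's expiry time arrives, the timer task deletes the key immediately: it first frees the stored data and then removes the key's entry from the expires space.
Pros: saves memory. Keys are deleted the moment they expire, which quickly frees memory that is no longer needed.
Cons: high CPU pressure. No matter how heavily loaded the CPU is at that moment, it is used to delete the key, which hurts the response time and throughput of the Redis server. This is a relatively inefficient approach.
Conclusion: trades CPU performance for memory space (time for space).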
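Lazy deletion | passive deletion
When data reaches its expiry time it is not handled immediately. It is deleted the next time it is accessed (the access path calls expireIfNeeded() to check).
Pros: saves CPU. Expiry is checked only when a key is accessed, and the key is deleted if it has expired. Only the key being accessed is deleted; all other keys are left as they are.
Cons: high memory usage. If a key is never accessed again, it occupies memory indefinitely. When many keys go unused, a lot of memory is wasted, which is unfriendly for an in-memory database.
Conclusion: trades space for time.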
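As a minimal sketch of the lazy approach described above, assuming a hypothetical single-slot store (toy_slot and toy_lazy_get() are illustrative names, not Redis APIs), the expiry check and the deletion both happen inside the lookup:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-slot store; not a Redis structure. */
typedef struct {
    int present;          /* 1 while the key is stored */
    long long expire_ms;  /* absolute deadline in ms, or -1 for no expiry */
    int value;
} toy_slot;

/* Looks the key up at time now_ms. An expired key is deleted on the spot
 * and reported as missing. Returns 1 and fills *out on a hit, else 0. */
int toy_lazy_get(toy_slot *slot, long long now_ms, int *out) {
    assert(slot != NULL && out != NULL);
    if (!slot->present) return 0;
    if (slot->expire_ms != -1 && slot->expire_ms <= now_ms) {
        slot->present = 0;  /* lazy deletion happens here, on access */
        return 0;
    }
    *out = slot->value;
    return 1;
}
```

Until toy_lazy_get() is called after the deadline, the slot keeps its memory, which is exactly the drawback noted above.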
Main functions involved in expiry deletion (db.c)
int expireIfNeeded(redisDb *db, robj *key)
int keyIsExpired(redisDb *db, robj *key)
long long getExpire(redisDb *db, robj *key)
notifyKeyspaceEvent(NOTIFY_EXPIRED,"expired",key,db->id);
server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) : dbSyncDelete(db,key);
int expireIfNeeded(redisDb *db, robj *key) {
    if (!keyIsExpired(db,key)) return 0; /* The key has not expired */
    /* If we are running in the context of a slave, instead of
     * evicting the expired key from the database, we return ASAP:
     * the slave key expiration is controlled by the master that will
     * send us synthesized DEL operations for expired keys.
     *
     * Still we try to return the right information to the caller,
     * that is, 0 if we think the key should be still valid, 1 if
     * we think the key is expired at this time. */
    if (server.masterhost != NULL) return 1;
    /* Delete the key */
    server.stat_expiredkeys++;
    propagateExpire(db,key,server.lazyfree_lazy_expire);
    notifyKeyspaceEvent(NOTIFY_EXPIRED,
        "expired",key,db->id);
    /* Perform the deletion, asynchronously if lazyfree_lazy_expire is set */
    int retval = server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) :
                                               dbSyncDelete(db,key);
    if (retval) signalModifiedKey(NULL,db,key);
    return retval;
}
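Periodic deletion | active deletion
The two approaches above, 1. time for space and 2. space for time, are two extremes. To avoid the problems they cause, Redis introduces periodic deletion, which is a compromise between them.
Redis periodically polls the time-limited data in its databases, samples keys at random, and uses the share of expired keys in each sample to control how often it deletes.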
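- When the Redis server initializes, it reads the value of server.hz, which defaults to 10.
- The server polls on a timer: serverCron() runs server.hz times per second. It calls databasesCron(), which handles background work for the Redis databases (16 by default), including expired keys via activeExpireCycle(). activeExpireCycle() checks the expires space of each database, and each run is limited to 250ms/server.hz.
  - Randomly pick a batch of keys from the expires space (Redis has 16 databases, numbered 0 through 15).
  - Delete the keys in this batch that have expired.
  - If more than 25% of the keys in the batch had expired, repeat from the first step (loop until the share drops to 25% or less, then finish with the current database).
  - If 25% or fewer of the keys in the batch had expired, move on to the expires space of the next database (current_db++).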
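The sampling loop in the steps above can be sketched for a single database as follows. This is a simplified model, not the real activeExpireCycle(): toy_expires, toy_active_expire_db(), and the batch size of 20 are assumptions made for illustration, sampling is done with rand() and may pick the same key twice, and the time limit of the real function is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_SAMPLE_SIZE 20    /* keys examined per iteration (assumed value) */
#define TOY_STALE_PERCENT 25  /* repeat while more than this share had expired */

/* Hypothetical expires table: expire_ms[i] is the deadline of key i. */
typedef struct {
    long long *expire_ms;
    size_t count;
} toy_expires;

/* Removes entry i by moving the last entry into its slot. */
static void toy_remove(toy_expires *t, size_t i) {
    t->expire_ms[i] = t->expire_ms[t->count - 1];
    t->count--;
}

/* Runs the sample/delete/repeat loop on one database.
 * Returns the number of keys deleted. */
size_t toy_active_expire_db(toy_expires *t, long long now_ms) {
    assert(t != NULL);
    size_t deleted_total = 0;
    for (;;) {
        size_t sampled = 0, expired = 0;
        while (sampled < TOY_SAMPLE_SIZE && t->count > 0) {
            size_t i = (size_t)rand() % t->count;  /* random pick */
            sampled++;
            if (t->expire_ms[i] <= now_ms) {       /* past its deadline */
                toy_remove(t, i);
                expired++;
            }
        }
        deleted_total += expired;
        /* Stop when nothing is left or at most 25% of the batch had expired. */
        if (sampled == 0 || expired * 100 <= sampled * TOY_STALE_PERCENT) break;
    }
    return deleted_total;
}
```

A caller modelling the full cycle would invoke this once per database and then advance to the next one, which corresponds to the current_db++ step.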
Use the INFO command to see related parameters, for example the hz setting (server.hz).
Code location:
server.c
/*
This is our timer interrupt, called server.hz times per second.
Here is where we do a number of things that need to be done asynchronously.
For instance:
Active expired keys collection (it is also performed in a lazy way on lookup).
..............
*/
int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    /* Handle background operations on Redis databases. */
    databasesCron();
    /* ... rest omitted ... */
}
server.c
/*
This function handles 'background' operations we are required to do incrementally in Redis databases, such as active key expiring, resizing, rehashing.
*/
void databasesCron(void) {
    /* Expire keys by random sampling. Not required for slaves
     * as master will synthesize DELs for us. */
    if (server.active_expire_enabled) {
        if (iAmMaster()) {
            activeExpireCycle(ACTIVE_EXPIRE_CYCLE_SLOW);
        } else {
            expireSlaveKeys();
        }
    }
    /* ... rest omitted ... */
}
expire.c
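void activeExpireCycle(int type) {
    /* The function body is long and is omitted here. Its main flow is:
     * 1. Randomly pick a batch of keys from the expires space
     *    (starting at database 0 and going up to database 15).
     * 2. Delete the keys in this batch that have expired.
     * 3. If more than 25% of the batch had expired, repeat from step 1
     *    (loop until the share is 25% or less, then finish this database).
     * 4. If 25% or fewer of the batch had expired, check the expires
     *    space of the next database (current_db++). */
}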
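Besides the frequency of active expiration, Redis also caps how long each active expiration run may take, so that a run never blocks application requests for too long. The cap is computed as follows: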
#define ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC 25 /* Max % of CPU to use. */
/* Adjust the running parameters according to the configured expire
* effort. The default effort is 1, and the maximum configurable effort
* is 10. */
unsigned long effort = server.active_expire_effort-1; /* Rescale from 0 to 9. */
unsigned long config_cycle_slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC + 2*effort;
/* We can use at max 'config_cycle_slow_time_perc' percentage of CPU
* time per iteration. Since this function gets called with a frequency of
* server.hz times per second, the following is the max amount of
* microseconds we can spend in this function. */
timelimit = config_cycle_slow_time_perc*1000000/server.hz/100;
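To make the formula above concrete, here is a small standalone version of the arithmetic. The function name toy_slow_cycle_timelimit_us() and its parameters are hypothetical; only the constant 25 and the formula itself come from the excerpt above.

```c
#include <assert.h>

#define TOY_SLOW_TIME_PERC 25  /* same value as ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC */

/* Maximum number of microseconds one slow expire cycle may run.
 * hz:     how many times per second the cron function is called
 * effort: configured expire effort, from 1 (default) to 10 */
long long toy_slow_cycle_timelimit_us(int hz, int effort) {
    assert(hz > 0 && effort >= 1 && effort <= 10);
    long long rescaled = effort - 1;                     /* 0..9 */
    long long perc = TOY_SLOW_TIME_PERC + 2 * rescaled;  /* 25..43 */
    return perc * 1000000 / hz / 100;
}
```

With hz = 10 and effort = 1 the result is 25000 microseconds (25 ms) per call, which agrees with the 250ms/server.hz figure mentioned earlier. Raising the effort to 10 lifts the cap to 43000 microseconds per call.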
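Conclusion: CPU usage has a ceiling, the check frequency is configurable, memory pressure stays moderate, and expired cold data that lingers in memory is cleaned up continuously (periodic random sampling, with extra passes where many keys have expired).
Comparison of deletion policies
- Timed deletion (time for space)
  - Saves memory; expired keys hold no memory
  - Uses CPU at any time regardless of load; high frequency
- Lazy deletion (space for time)
  - High memory usage
  - Deletion is deferred, so it does not keep the CPU busy; low CPU pressure, low frequency
- Periodic deletion (periodic random sampling)
  - Memory is cleaned up periodically by random sampling
  - Spends a fixed CPU budget every second on memory upkeep (removing expired data)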
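Eviction policies
Redis constantly creates, deletes, reads, and updates data. What happens if memory runs out while data is being added? Can the deletion policies above prevent that situation?
In fact, the deletion policies above only apply to keys that were given a lifetime with commands such as expire, that is, time-limited data (data that has expired but still occupies memory). The case here is different: the data has not expired, or no key in memory has an expiry at all and everything is permanent. Deletion policies do not help in that case, so what happens when memory is full and more data has to be inserted?
Overview
When maxmemory is set, Redis does one thing before executing each command: it calls freeMemoryIfNeeded(void) to check whether memory is sufficient. If memory does not meet the minimum requirement for the new data, Redis has to delete some existing data to make room for it. The strategy used to pick what gets cleaned up is called the eviction algorithm.
An eviction pass is not guaranteed to free enough usable memory. If it falls short, it runs again. If the requirement still cannot be met after all data has been tried, an error is reported.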
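The "evict until enough is freed, otherwise report an error" behaviour described above can be sketched like this. It is a simplified model under stated assumptions: toy_store and toy_free_memory_if_needed() are hypothetical, victims are taken in array order instead of being chosen by a policy, and `used` may include memory that no eviction can reclaim.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_OK 0
#define TOY_ERR -1

/* Hypothetical store: sizes[i] is the memory held by key i (0 = evicted). */
typedef struct {
    size_t *sizes;
    size_t count;
    size_t used;       /* total bytes in use, including non-key overhead */
    size_t maxmemory;  /* 0 means unlimited */
} toy_store;

/* Evicts keys until usage is back under the limit.
 * Returns TOY_OK on success, TOY_ERR if not enough could be freed. */
int toy_free_memory_if_needed(toy_store *s) {
    assert(s != NULL);
    if (s->maxmemory == 0 || s->used <= s->maxmemory) return TOY_OK;
    size_t to_free = s->used - s->maxmemory, freed = 0;
    for (size_t i = 0; i < s->count && freed < to_free; i++) {
        freed += s->sizes[i];  /* "evict" key i; a real policy picks the victim */
        s->sizes[i] = 0;
    }
    s->used -= freed;
    return freed >= to_free ? TOY_OK : TOY_ERR;
}
```

The TOY_ERR case corresponds to the situation where every candidate has been tried and the limit is still exceeded, at which point a write command would be rejected.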
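The steps are roughly as follows:
1. In redis.windows-service.conf | redis.conf, set maxmemory <bytes>. For example, maxmemory 100mb limits memory usage to 100 MB. The default value 0 means memory is not limited; the value is commonly set to 50% or more of physical memory.
2. In redis.windows-service.conf | redis.conf, set maxmemory-samples x, the number of keys sampled each time data is chosen for deletion. Selection does not scan the whole database, because that would seriously hurt read and write performance; instead, keys are sampled at random as deletion candidates.
3. In redis.windows-service.conf | redis.conf, set maxmemory-policy noeviction. This selects the eviction policy; the default is noeviction.
4. When Redis memory usage exceeds the limit, the eviction mechanism is triggered and the selected data is deleted.
Code flow:
Redis handles every command in int processCommand(client *c). Inside that function it calls int freeMemoryIfNeededAndSafe(void) to check the memory situation.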
int processCommand(client *c) {
    /* ... omitted ... */
    /* Handle the maxmemory directive.
     *
     * Note that we do not want to reclaim memory if we are here re-entering
     * the event loop since there is a busy Lua script running in timeout
     * condition, to avoid mixing the propagation of scripts with the
     * propagation of DELs due to eviction. */
    if (server.maxmemory && !server.lua_timedout) {
        int out_of_memory = freeMemoryIfNeededAndSafe() == C_ERR;
        /* freeMemoryIfNeeded may flush slave output buffers. This may result
         * into a slave, that may be the active client, to be freed. */
        if (server.current_client == NULL) return C_ERR;
        /* It was impossible to free enough memory, and the command the client
         * is trying to execute is denied during OOM conditions or the client
         * is in MULTI/EXEC context? Error. */
        if (out_of_memory &&
            (c->cmd->flags & CMD_DENYOOM ||
             (c->flags & CLIENT_MULTI &&
              c->cmd->proc != execCommand &&
              c->cmd->proc != discardCommand)))
        {
            flagTransaction(c);
            addReply(c, shared.oomerr);
            return C_OK;
        }
        /* Save out_of_memory result at script start, otherwise if we check OOM
         * until first write within script, memory used by lua stack and
         * arguments might interfere. */
        if (c->cmd->proc == evalCommand || c->cmd->proc == evalShaCommand) {
            server.lua_oom = out_of_memory;
        }
    }
    /* ... omitted ... */
}
int freeMemoryIfNeededAndSafe(void) in turn calls freeMemoryIfNeeded(), the function that actually checks whether the memory currently in use exceeds the configured maximum.
/* This is a wrapper for freeMemoryIfNeeded() that only really calls the
 * function if right now there are the conditions to do so safely:
 *
 * - There must be no script in timeout condition.
 * - Nor we are loading data right now.
 *
 */
int freeMemoryIfNeededAndSafe(void) {
    if (server.lua_timedout || server.loading) return C_OK;
    return freeMemoryIfNeeded();
}
int freeMemoryIfNeeded(void) computes the memory usage and then selects the keys to evict.
/* This function is periodically called to see if there is memory to free
 * according to the current "maxmemory" settings. In case we are over the
 * memory limit, the function will try to free some memory to return back
 * under the limit.
 *
 * The function returns C_OK if we are under the memory limit or if we
 * were over the limit, but the attempt to free memory was successful.
 * Otherwise if we are over the memory limit, but not enough memory
 * was freed to return back under the limit, the function returns C_ERR. */
int freeMemoryIfNeeded(void) {
    int keys_freed = 0;
    /* By default replicas should ignore maxmemory
     * and just be masters exact copies. */
    if (server.masterhost && server.repl_slave_ignore_maxmemory) return C_OK;
    size_t mem_reported, mem_tofree, mem_freed;
    mstime_t latency, eviction_latency, lazyfree_latency;
    long long delta;
    int slaves = listLength(server.slaves);
    int result = C_ERR;
    /* When clients are paused the dataset should be static not just from the
     * POV of clients not being able to write, but also from the POV of
     * expires and evictions of keys not being performed. */
    if (clientsArePaused()) return C_OK;
    if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK)
        return C_OK;
    mem_freed = 0;
    latencyStartMonitor(latency);
    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
        goto cant_free; /* We need to free memory, but policy forbids. */
    while (mem_freed < mem_tofree) {
        int j, k, i;
        static unsigned int next_db = 0;
        sds bestkey = NULL;
        int bestdbid;
        redisDb *db;
        dict *dict;
        dictEntry *de;
        if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
            server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
        {
            struct evictionPoolEntry *pool = EvictionPoolLRU;
            while(bestkey == NULL) {
                unsigned long total_keys = 0, keys;
                /* We don't want to make local-db choices when expiring keys,
                 * so to start populate the eviction pool sampling keys from
                 * every DB. */
                for (i = 0; i < server.dbnum; i++) {
                    db = server.db+i;
                    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ?
                            db->dict : db->expires;
                    if ((keys = dictSize(dict)) != 0) {
                        evictionPoolPopulate(i, dict, db->dict, pool);
                        total_keys += keys;
                    }
                }
                if (!total_keys) break; /* No keys to evict. */
                /* Go backward from best to worst element to evict. */
                for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                    if (pool[k].key == NULL) continue;
                    bestdbid = pool[k].dbid;
                    if (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {
                        de = dictFind(server.db[pool[k].dbid].dict,
                            pool[k].key);
                    } else {
                        de = dictFind(server.db[pool[k].dbid].expires,
                            pool[k].key);
                    }
                    /* Remove the entry from the pool. */
                    if (pool[k].key != pool[k].cached)
                        sdsfree(pool[k].key);
                    pool[k].key = NULL;
                    pool[k].idle = 0;
                    /* If the key exists, is our pick. Otherwise it is
                     * a ghost and we need to try the next element. */
                    if (de) {
                        bestkey = dictGetKey(de);
                        break;
                    } else {
                        /* Ghost... Iterate again. */
                    }
                }
            }
        }
        /* volatile-random and allkeys-random policy */
        else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM ||
                 server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
        {
            /* When evicting a random key, we try to evict a key for
             * each DB, so we use the static 'next_db' variable to
             * incrementally visit all DBs. */
            for (i = 0; i < server.dbnum; i++) {
                j = (++next_db) % server.dbnum;
                db = server.db+j;
                dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ?
                        db->dict : db->expires;
                if (dictSize(dict) != 0) {
                    de = dictGetRandomKey(dict);
                    bestkey = dictGetKey(de);
                    bestdbid = j;
                    break;
                }
            }
        }
        /* Finally remove the selected key. */
        if (bestkey) {
            db = server.db+bestdbid;
            robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
            propagateExpire(db,keyobj,server.lazyfree_lazy_eviction);
            /* We compute the amount of memory freed by db*Delete() alone.
             * It is possible that actually the memory needed to propagate
             * the DEL in AOF and replication link is greater than the one
             * we are freeing removing the key, but we can't account for
             * that otherwise we would never exit the loop.
             *
             * AOF and Output buffer memory will be freed eventually so
             * we only care about memory used by the key space. */
            delta = (long long) zmalloc_used_memory();
            latencyStartMonitor(eviction_latency);
            if (server.lazyfree_lazy_eviction)
                dbAsyncDelete(db,keyobj);
            else
                dbSyncDelete(db,keyobj);
            signalModifiedKey(NULL,db,keyobj);
            latencyEndMonitor(eviction_latency);
            latencyAddSampleIfNeeded("eviction-del",eviction_latency);
            delta -= (long long) zmalloc_used_memory();
            mem_freed += delta;
            server.stat_evictedkeys++;
            notifyKeyspaceEvent(NOTIFY_EVICTED, "evicted",
                keyobj, db->id);
            decrRefCount(keyobj);
            keys_freed++;
            /* When the memory to free starts to be big enough, we may
             * start spending so much time here that is impossible to
             * deliver data to the slaves fast enough, so we force the
             * transmission here inside the loop. */
            if (slaves) flushSlavesOutputBuffers();
            /* Normally our stop condition is the ability to release
             * a fixed, pre-computed amount of memory. However when we
             * are deleting objects in another thread, it's better to
             * check, from time to time, if we already reached our target
             * memory, since the "mem_freed" amount is computed only
             * across the dbAsyncDelete() call, while the thread can
             * release the memory all the time. */
            if (server.lazyfree_lazy_eviction && !(keys_freed % 16)) {
                if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                    /* Let's satisfy our stop condition. */
                    mem_freed = mem_tofree;
                }
            }
        } else {
            goto cant_free; /* nothing to free... */
        }
    }
    result = C_OK;
cant_free:
    /* We are here if we are not able to reclaim memory. There is only one
     * last thing we can try: check if the lazyfree thread has jobs in queue
     * and wait... */
    if (result != C_OK) {
        latencyStartMonitor(lazyfree_latency);
        while(bioPendingJobsOfType(BIO_LAZY_FREE)) {
            if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                result = C_OK;
                break;
            }
            usleep(1000);
        }
        latencyEndMonitor(lazyfree_latency);
        latencyAddSampleIfNeeded("eviction-lazyfree",lazyfree_latency);
    }
    latencyEndMonitor(latency);
    latencyAddSampleIfNeeded("eviction-cycle",latency);
    return result;
}
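Eviction policies and related configuration
random: evict at random from the expires space or the dict space.
volatile: choose from the expires space, that is, only among keys that have an expiry set (volatile-ttl evicts the keys closest to expiring first).
allkeys: choose from the dict space, that is, among all keys.
Approximated LRU algorithm (Least Recently Used)
Approximated LFU algorithm (Least Frequently Used)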
1. Evict among time-limited data (the expires space of database i)
- volatile-lru: among keys with an expiry set, evict the least recently used (Evict using approximated LRU, only keys with an expire set.)
- volatile-lfu: among keys with an expiry set, evict the least frequently used (Evict using approximated LFU, only keys with an expire set.)
- volatile-random: among keys with an expiry set, evict one at random (Remove a random key having an expire set.)
- volatile-ttl: among keys with an expiry set, evict the one that expires soonest, i.e. with the shortest TTL (Remove the key with the nearest expire time (minor TTL))
2. Evict among all data in the database (the dict space of database i)
- allkeys-lru: among all keys, evict the least recently used (Evict any key using approximated LRU.)
- allkeys-lfu: among all keys, evict the least frequently used (Evict any key using approximated LFU.)
- allkeys-random: among all keys, evict one at random (Remove a random key, any key.)
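The "approximated" part of the LRU policies comes from sampling: instead of tracking a global ordering, a few keys are sampled and the oldest of the sample is evicted. The sketch below illustrates that idea only. toy_entry, toy_oldest_of(), and toy_pick_lru_victim() are hypothetical names, the `samples` argument plays the role of maxmemory-samples, and the candidate pool that the real code keeps between rounds is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_SAMPLES 16  /* upper bound on the sample size (assumed) */

/* Hypothetical per-key metadata; not a Redis structure. */
typedef struct {
    unsigned long last_access;  /* logical time of the most recent access */
} toy_entry;

/* Among the sampled indices idx[0..n-1], returns the index of the entry
 * that was accessed longest ago. */
size_t toy_oldest_of(const toy_entry *entries, const size_t *idx, size_t n) {
    assert(n > 0);
    size_t best = idx[0];
    for (size_t k = 1; k < n; k++)
        if (entries[idx[k]].last_access < entries[best].last_access)
            best = idx[k];
    return best;
}

/* Samples `samples` random entries (with replacement) and returns the
 * least recently used entry among them. */
size_t toy_pick_lru_victim(const toy_entry *entries, size_t count, size_t samples) {
    assert(count > 0 && samples > 0);
    if (samples > TOY_MAX_SAMPLES) samples = TOY_MAX_SAMPLES;
    size_t idx[TOY_MAX_SAMPLES];
    for (size_t k = 0; k < samples; k++)
        idx[k] = (size_t)rand() % count;
    return toy_oldest_of(entries, idx, samples);
}
```

Because only the sample is compared, the globally oldest key can be missed when it is not sampled; a larger sample size makes the choice closer to true LRU at the cost of more work per eviction.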
Different policies target different datasets. Whether deletion works on the expires space or on the dict space can be seen in the following two code fragments:
if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
    server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
{
    /* Choose the dict space or the expires space according to the policy */
    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ? db->dict : db->expires;
}
/* volatile-random and allkeys-random policy */
else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM ||
         server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
{
    /* Choose the dict space or the expires space according to the policy */
    dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ? db->dict : db->expires;
}
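The fragments above test the policy with bit masks, which works when each policy value carries flag bits alongside an identifier. The following sketch shows one way such an encoding could look. All TOY_ names and their numeric values are assumptions made for illustration; the actual constants are defined in the Redis source and should be checked there.

```c
#include <assert.h>

/* Hypothetical flag bits stored in the low byte of a policy value. */
enum {
    TOY_FLAG_LRU     = 1 << 0,
    TOY_FLAG_LFU     = 1 << 1,
    TOY_FLAG_ALLKEYS = 1 << 2
};

/* Hypothetical policy values: an id in the high bits plus flag bits. */
enum {
    TOY_VOLATILE_LRU    = (0 << 8) | TOY_FLAG_LRU,
    TOY_VOLATILE_LFU    = (1 << 8) | TOY_FLAG_LFU,
    TOY_VOLATILE_TTL    = (2 << 8),
    TOY_VOLATILE_RANDOM = (3 << 8),
    TOY_ALLKEYS_LRU     = (4 << 8) | TOY_FLAG_LRU | TOY_FLAG_ALLKEYS,
    TOY_ALLKEYS_LFU     = (5 << 8) | TOY_FLAG_LFU | TOY_FLAG_ALLKEYS,
    TOY_ALLKEYS_RANDOM  = (6 << 8) | TOY_FLAG_ALLKEYS,
    TOY_NO_EVICTION     = (7 << 8)
};

typedef enum { TOY_SPACE_NONE, TOY_SPACE_EXPIRES, TOY_SPACE_DICT } toy_space;

/* Which key space supplies eviction candidates for a policy. */
toy_space toy_candidate_space(int policy) {
    assert(policy >= 0 && (policy >> 8) <= 7);  /* policy id must be 0..7 */
    if (policy == TOY_NO_EVICTION) return TOY_SPACE_NONE;
    return (policy & TOY_FLAG_ALLKEYS) ? TOY_SPACE_DICT : TOY_SPACE_EXPIRES;
}

/* Mirrors the first branch above: LRU, LFU and TTL policies rank sampled
 * keys, while the random policies pick a key directly. */
int toy_uses_ranking(int policy) {
    return (policy & (TOY_FLAG_LRU | TOY_FLAG_LFU)) != 0 ||
           policy == TOY_VOLATILE_TTL;
}
```

Encoding the flags in the value lets one mask test cover several policies at once, which is why a single check on the ALLKEYS flag is enough to choose between the dict space and the expires space.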
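3. No eviction (NO_EVICTION)
- noeviction: evict nothing and only return an error on write operations (Don't evict anything, just return an error on write operations.). In the version used here (redis_version:3.2.100), noeviction is the default policy. It can easily lead to OOM errors.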
if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
    goto cant_free; /* We need to free memory, but policy forbids. */