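Original article: bcache usage tutorial
The article may be updated; read the original for the latest version.
Among hybrid storage solutions, flashcache and bcache are two of the better-known open-source projects. An earlier article covered flashcache in detail; this one describes how to install and use bcache.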
bcache-tools source: https://github.com/koverstreet/bcache-tools
System information
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.6 LTS
Release: 16.04
Codename: xenial
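Installing bcache
bcache consists of roughly two parts: a Linux kernel module and the userspace bcache-tools. The kernel module is only available in Linux 3.10 and later, so using bcache requires a kernel of version 3.10 or newer.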
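The version requirement above can be checked before going any further. The following is a minimal sketch, assuming the kernel release string has the usual major.minor.patch form printed by uname -r; kernel_at_least_3_10 is a hypothetical helper name, not part of bcache-tools.

```shell
# Return success (0) if the given kernel release string is 3.10 or newer.
# kernel_at_least_3_10 is a hypothetical helper, not part of bcache-tools.
kernel_at_least_3_10() {
    local release=$1
    local major=${release%%.*}      # text before the first dot
    local rest=${release#*.}        # text after the first dot
    local minor=${rest%%[!0-9]*}    # leading digits of the remainder
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "${minor:-0}" -ge 10 ]; }
}

# Example (not executed here):
#   kernel_at_least_3_10 "$(uname -r)" && echo "kernel is new enough for bcache"
```

The comparison is done on the numeric major and minor fields rather than on the whole string, so a release such as 3.2.0 is correctly treated as older than 3.10.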
Install dependencies
sudo apt install libblkid-dev
sudo apt install pkg-config
Download and build bcache-tools
git clone https://evilpiepirate.org/git/bcache-tools.git
cd bcache-tools
make
Problem: the build may fail with an error
$ make
cc -O2 -Wall -g `pkg-config --cflags uuid blkid` make-bcache.c bcache.o `pkg-config --libs uuid blkid` -o make-bcache
/tmp/cc5vHcs9.o: In function `write_sb':
/home/nvm/bcache-tools/make-bcache.c:277: undefined reference to `crc64'
collect2: error: ld returned 1 exit status
<builtin>: recipe for target 'make-bcache' failed
make: *** [make-bcache] Error 1
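A quick search shows this is a bug in how the function is defined (see https://www.spinics.net/lists/linux-bcache/msg02847.html). Under C99 inline semantics, the default in newer GCC versions, a plain inline definition does not emit an external symbol, so the linker cannot find crc64 when building make-bcache.
Fix: open bcache.c in the build directory and remove the inline keyword from the function header inline uint64_t crc64(const void *_data, size_t len):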
--- a/bcache.c
+++ b/bcache.c
@@ -115,7 +115,7 @@ static const uint64_t crc_table[256] = {
0x9AFCE626CE85B507ULL
};
-inline uint64_t crc64(const void *_data, size_t len)
+uint64_t crc64(const void *_data, size_t len)
{
uint64_t crc = 0xFFFFFFFFFFFFFFFFULL;
const unsigned char *data = _data;
Install bcache-tools
sudo make install
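Load the bcache kernel module
Linux kernels 3.10 and later include the bcache module (provided the distribution builds it), so it only needs to be loaded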
$ sudo modprobe bcache
$ lsmod | grep bcache
bcache 233472 0
If loading fails with the error below, check the kernel version and try again
$ sudo modprobe bcache
modprobe: FATAL: Module bcache not found.
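The lsmod check above can also be scripted. This is a small sketch, assuming the /proc/modules format (one module per line, name first), which is what lsmod reads; module_is_loaded is a hypothetical helper, and its optional second argument exists only so it can be pointed at a sample file.

```shell
# Return success if the named module appears in a /proc/modules-style listing.
# module_is_loaded is a hypothetical helper; the second argument defaults to
# /proc/modules and can be overridden with a sample file.
module_is_loaded() {
    local name=$1 listing=${2:-/proc/modules}
    # Each line starts with the module name followed by a space.
    grep -q "^${name} " "$listing"
}

# Example (not executed here):
#   module_is_loaded bcache || sudo modprobe bcache
```

If bcache is compiled directly into the kernel rather than built as a module, it will not appear in this listing; in that case the presence of /sys/fs/bcache is probably a better indicator.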
Using bcache
Related commands
The bcache-related commands are make-bcache and bcache-super-show
$ make-bcache
Please supply a device
Usage: make-bcache [options] device
-C, --cache Format a cache device
-B, --bdev Format a backing device
-b, --bucket bucket size
-w, --block block size (hard sector size of SSD, often 2k)
-o, --data-offset data offset in sectors
--cset-uuid UUID for the cache set
--writeback enable writeback
--discard enable discards
--cache_replacement_policy=(lru|fifo)
-h, --help display this help and exit
$ bcache-super-show
Usage: bcache-super-show [-f] <device>
View disk information
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme1n1 259:8 0 13.4G 0 disk
sda 8:0 0 223.6G 0 disk
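Here nvme1n1 serves as the cache (accelerator) disk and sda as the backing disk
Creating bcache hybrid storage
Method 1: one-step creation
Create the backing device and the front-end cache device, and set up the mapping between them, in a single command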
$ sudo make-bcache -C /dev/nvme1n1 -B /dev/sda
UUID: bb156414-f6a2-49ed-a91b-90b7b55f4d10
Set UUID: 0b9c134e-cd04-41f1-8f3f-4a783c716707
version: 0
nbuckets: 27472
block_size: 1
bucket_size: 1024
nr_in_set: 1
nr_this_dev: 0
first_bucket: 1
UUID: 8682157d-e75b-441e-9f6e-0b3c00610f55
Set UUID: 0b9c134e-cd04-41f1-8f3f-4a783c716707
version: 1
block_size: 1
data_offset: 16
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
...
nvme1n1 259:8 0 13.4G 0 disk
└─bcache0 252:0 0 223.6G 0 disk
sda 8:0 0 223.6G 0 disk
└─bcache0 252:0 0 223.6G 0 disk
Method 2: step-by-step creation
- Create the backing device
$ sudo make-bcache -B /dev/sda
UUID: 8682157d-e75b-441e-9f6e-0b3c00610f55
Set UUID: 0b9c134e-cd04-41f1-8f3f-4a783c716707
version: 1
block_size: 1
data_offset: 16
$ sudo bcache-super-show /dev/sda
sb.magic ok
sb.first_sector 8 [match]
sb.csum 896121C60502BF51 [match]
sb.version 1 [backing device]
dev.label (empty)
dev.uuid 8682157d-e75b-441e-9f6e-0b3c00610f55
dev.sectors_per_block 1
dev.sectors_per_bucket 1024
dev.data.first_sector 16
dev.data.cache_mode 1 [writeback]
dev.data.cache_state 2 [dirty]
cset.uuid 0b9c134e-cd04-41f1-8f3f-4a783c716707
- Create the cache device
$ sudo make-bcache -C /dev/nvme1n1
UUID: bb156414-f6a2-49ed-a91b-90b7b55f4d10
Set UUID: 0b9c134e-cd04-41f1-8f3f-4a783c716707
version: 0
nbuckets: 102400
block_size: 1
bucket_size: 1024
nr_in_set: 1
nr_this_dev: 0
first_bucket: 1
$ sudo bcache-super-show /dev/nvme1n1
sb.magic ok
sb.first_sector 8 [match]
sb.csum 4C4CD30F3808C062 [match]
sb.version 3 [cache device]
dev.label (empty)
dev.uuid bb156414-f6a2-49ed-a91b-90b7b55f4d10
dev.sectors_per_block 1
dev.sectors_per_bucket 1024
dev.cache.first_sector 1024
dev.cache.cache_sectors 28130304
dev.cache.total_sectors 28131328
dev.cache.ordered yes
dev.cache.discard no
dev.cache.pos 0
dev.cache.replacement 0 [lru]
cset.uuid 0b9c134e-cd04-41f1-8f3f-4a783c716707
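The sector counts reported by bcache-super-show can be related to the sizes shown elsewhere. The sketch below assumes the sector fields are in 512-byte units (an assumption, although it is consistent with the figures here); sectors_to_bytes and bucket_count are hypothetical helper names.

```shell
# Convert a count of sectors to bytes, assuming 512-byte sectors.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Number of buckets = total sectors / sectors per bucket (integer division).
bucket_count() {
    echo $(( $1 / $2 ))
}

# Example with the values from the output above (not executed here):
#   sectors_to_bytes 28131328     # dev.cache.total_sectors
#   bucket_count 28131328 1024    # total_sectors / dev.sectors_per_bucket
```

With dev.cache.total_sectors = 28131328 this gives 14403239936 bytes, about 13.4 GiB, which agrees with the 13.4G that lsblk reports for nvme1n1. Dividing by dev.sectors_per_bucket = 1024 gives 27472 buckets, the nbuckets value printed by the one-step make-bcache run.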
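- Attach the cache device to the backing device
The string written below is the cset.uuid, which is the same for the cache device and the backing device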
# echo "0b9c134e-cd04-41f1-8f3f-4a783c716707" > /sys/block/bcache0/bcache/attach
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
...
nvme1n1 259:8 0 13.4G 0 disk
└─bcache0 252:0 0 223.6G 0 disk
sda 8:0 0 223.6G 0 disk
└─bcache0 252:0 0 223.6G 0 disk
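The attach step can avoid copying the UUID by hand. This is a minimal sketch, assuming the bcache-super-show output format shown above (field name, whitespace, value); extract_cset_uuid is a hypothetical helper name.

```shell
# Print the value of the cset.uuid field from bcache-super-show output
# read on standard input. extract_cset_uuid is a hypothetical helper.
extract_cset_uuid() {
    awk '$1 == "cset.uuid" { print $2 }'
}

# Example (not executed here), using the device names from this article:
#   uuid=$(sudo bcache-super-show /dev/nvme1n1 | extract_cset_uuid)
#   echo "$uuid" | sudo tee /sys/block/bcache0/bcache/attach
```

The example pipes through sudo tee because a plain sudo echo ... > file performs the redirection as the unprivileged user and fails on sysfs files.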
Problems encountered
- Error: the device already has a non-bcache superblock
$ sudo make-bcache -C /dev/nvme1n1 -B /dev/sda
Device /dev/nvme1n1 already has a non-bcache superblock, remove it using wipefs and wipefs -a
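Fix: as the message suggests, wipe the existing superblock signatures (this destroys the partition table on the device)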
$ sudo wipefs -a /dev/nvme1n1
/dev/nvme1n1: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/nvme1n1: 8 bytes were erased at offset 0x35a7ffe00 (gpt): 45 46 49 20 50 41 52 54
/dev/nvme1n1: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/nvme1n1: calling ioctl to re-read partition table: Success
- Error: Device or resource busy
$ sudo make-bcache -C /dev/nvme1n1 -B /dev/sda
Can't open dev /dev/nvme1n1: Device or resource busy
Reboot the system and try again
Usage
bcache0 can be used like any ordinary block device
$ sudo mkfs.ext4 /dev/bcache0
$ sudo mount /dev/bcache0 /test
$ df -T -h
Filesystem Type Size Used Avail Use% Mounted on
...
/dev/bcache0 ext4 220G 60M 209G 1% /test
$ sudo umount /test
Viewing related information
- state
# cat /sys/block/bcache0/bcache/state
clean
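The possible values of state:
- no cache: the backing device has no caching device attached
- clean: everything is fine, and the cache holds no dirty data
- dirty: everything is fine, writeback is enabled, and the cache holds dirty data
- inconsistent: something went wrong, and the backing device is out of sync with the cache device
- Amount of dirty data in the cache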
# cat /sys/block/bcache0/bcache/dirty_data
0.0k
- writeback information (tab completion on the writeback_ prefix lists the available files)
# cat /sys/block/bcache0/bcache/writeback_
writeback_delay writeback_percent writeback_rate_debug writeback_rate_p_term_inverse writeback_running
writeback_metadata writeback_rate writeback_rate_d_term writeback_rate_update_seconds
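The state values listed above lend themselves to a simple health check. The sketch below treats clean and dirty as healthy and anything else as needing attention; bcache_is_healthy is a hypothetical helper that takes the path of a state file so it can be tried on a sample file as well.

```shell
# Return 0 if the state file reports "clean" or "dirty", 1 for any other
# state, and 2 if the file cannot be read.
# bcache_is_healthy is a hypothetical helper, not part of bcache-tools.
bcache_is_healthy() {
    local state
    state=$(cat "$1" 2>/dev/null) || return 2
    case "$state" in
        clean|dirty) return 0 ;;   # cache attached and working
        *)           return 1 ;;   # "no cache", "inconsistent", or unexpected
    esac
}

# Example (not executed here):
#   bcache_is_healthy /sys/block/bcache0/bcache/state || echo "check bcache0"
```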
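Changing the cache mode
bcache supports three cache modes:
- Writeback: writes go to the cache first and the dirty bit is set in the metadata of the corresponding block; the data is not written to the backing device immediately
- Writethrough: writes go to both the cache and the backing device, and a write completes only once the backing device has finished writing
- Writearound: writes bypass the cache and go straight to the backing device, so the cache disk acts only as a read cache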
[Figure: illustration of the differences among the three cache modes]
- View the cache mode
The default cache mode of bcache is writethrough; the active mode is shown in square brackets
$ cat /sys/block/bcache0/bcache/cache_mode
[writethrough] writeback writearound none
- Change the cache mode
The cache mode is flexible and can be changed at any time; the change must be made as root
# echo writearound > /sys/block/bcache0/bcache/cache_mode
$ cat /sys/block/bcache0/bcache/cache_mode
writethrough writeback [writearound] none
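Reading and changing the mode as shown above can be wrapped in two small functions. This is a sketch under the assumption that cache_mode always marks the active mode with square brackets, as in the output above; active_cache_mode and set_cache_mode are hypothetical helper names, and set_cache_mode takes the target file as an argument so it can be tried on an ordinary file.

```shell
# Print the bracketed (active) mode from a cache_mode line read on stdin.
active_cache_mode() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Write a mode to the given file after checking it is one of the four
# values that appear in the cache_mode listing above.
set_cache_mode() {
    local mode=$1 file=$2
    case "$mode" in
        writethrough|writeback|writearound|none)
            printf '%s\n' "$mode" > "$file" ;;
        *)
            echo "unsupported cache mode: $mode" >&2
            return 1 ;;
    esac
}

# Example (not executed here; run as root):
#   set_cache_mode writeback /sys/block/bcache0/bcache/cache_mode
#   active_cache_mode < /sys/block/bcache0/bcache/cache_mode
```

Validating the mode first avoids relying on the kernel's error reporting when the value is mistyped.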
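Detaching and removing
Note: before detaching, move the data off the hybrid device to prevent data loss!
Detaching the cache disk from the backing disk returns the devices to the state they were in before bcache was used
- Remove the mapping between the cache disk and the backing disk
To detach the cache disk from its current backing disk, write the cset.uuid to the detach file of the bcache device
First look up the cset.uuid, which is the same for the cache disk and the backing disk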
$ sudo umount /test
$ sudo bcache-super-show /dev/nvme1n1
sb.magic ok
sb.first_sector 8 [match]
sb.csum 4C4CD30F3808C062 [match]
sb.version 3 [cache device]
dev.label (empty)
dev.uuid bb156414-f6a2-49ed-a91b-90b7b55f4d10
dev.sectors_per_block 1
dev.sectors_per_bucket 1024
dev.cache.first_sector 1024
dev.cache.cache_sectors 28130304
dev.cache.total_sectors 28131328
dev.cache.ordered yes
dev.cache.discard no
dev.cache.pos 0
dev.cache.replacement 0 [lru]
cset.uuid 0b9c134e-cd04-41f1-8f3f-4a783c716707
Then remove the mapping:
# echo "0b9c134e-cd04-41f1-8f3f-4a783c716707" > /sys/block/bcache0/bcache/detach
$ cat /sys/block/sda/bcache/state
no cache
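Note that before unregistering the cache disk, you must confirm that it is no longer used by any backing disk; otherwise there is a risk of data loss.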
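One way to act on this warning is to wait until the backing device reports no cache before unregistering anything. The following is a sketch, assuming that in writeback mode the detach may take a while because dirty data is flushed first; wait_for_no_cache is a hypothetical helper, and the retry count and delay are arbitrary defaults.

```shell
# Poll a bcache state file until it reads "no cache".
# Arguments: state file, number of tries (default 30), delay in seconds
# between tries (default 1). Returns 0 on success, 1 on timeout.
# wait_for_no_cache is a hypothetical helper, not part of bcache-tools.
wait_for_no_cache() {
    local state_file=$1 tries=${2:-30} delay=${3:-1}
    local i=0
    while [ "$i" -lt "$tries" ]; do
        if [ "$(cat "$state_file" 2>/dev/null)" = "no cache" ]; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example (not executed here):
#   wait_for_no_cache /sys/block/sda/bcache/state 60 2 || echo "still attached"
```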
- Remove the cache disk
Using the cache disk's cset.uuid, write 1 to /sys/fs/bcache/<cset.uuid>/unregister to unregister it (the value written does not matter; any value works)
# echo 1 > /sys/fs/bcache/0b9c134e-cd04-41f1-8f3f-4a783c716707/unregister
Then list /sys/fs/bcache/ with ls; if the directory 0b9c134e-cd04-41f1-8f3f-4a783c716707 is no longer there, the cache set was unregistered successfully.
$ ls /sys/fs/bcache/
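The directory check described above can be expressed as a test instead of reading the ls output by eye. This is a minimal sketch; cache_set_registered is a hypothetical helper, and its optional second argument (defaulting to /sys/fs/bcache) exists only so it can be tried against a scratch directory.

```shell
# Return success if a directory named after the cache set UUID exists
# under the bcache sysfs root (default /sys/fs/bcache).
# cache_set_registered is a hypothetical helper, not part of bcache-tools.
cache_set_registered() {
    local uuid=$1 root=${2:-/sys/fs/bcache}
    [ -d "$root/$uuid" ]
}

# Example (not executed here):
#   cache_set_registered 0b9c134e-cd04-41f1-8f3f-4a783c716707 \
#       || echo "cache set is unregistered"
```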
- Remove the backing disk
# echo 1 > /sys/block/sda/bcache/stop
References:
http://www.yangguanjun.com/2018/03/26/lvm-sata-ssd-bcache/
https://ypdai.github.io/2018/07/13/bcache%E9%85%8D%E7%BD%AE%E4%BD%BF%E7%94%A8/