Ceph CRUSH Map Rules Explained

Bucket types

In the CRUSH map, buckets are the internal nodes of the hierarchy: hosts, racks, rows, and so on. The CRUSH map defines a set of types used to describe these nodes. By default, these types include the following (a short example of combining them into a hierarchy follows the list):

  • osd (or device)
  • host
  • chassis
  • rack
  • row
  • pdu
  • pod
  • room
  • datacenter
  • region (e.g. Asia, Europe)
  • root
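
A quick sketch of how these types can be combined into a deeper hierarchy, using the same add-bucket/move commands the rest of this article relies on (the bucket name rack1 is hypothetical and not part of the cluster shown below):

ceph osd crush add-bucket rack1 rack
ceph osd crush move rack1 root=default
ceph osd crush move node1 rack=rack1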

Viewing the local CRUSH hierarchy

ceph osd crush tree
ceph osd crush dump
ceph osd pool get <pool_name> crush_rule
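
If the cluster uses device classes, CRUSH also keeps per-class "shadow" hierarchies; these can be displayed with an extra flag (availability depends on the Ceph release, Luminous or later is assumed):

ceph osd crush tree --show-shadow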

You can list the rules defined for the cluster:

ceph osd crush rule ls

You can view the contents of the rules:

ceph osd crush rule dump
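
The dump subcommand also accepts a single rule name, which is convenient once several rules exist (replicated_rule is the default rule created with the cluster):

ceph osd crush rule dump replicated_rule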

Customizing the CRUSH map with commands

# Create an ssd root bucket
[root@node1 ~]# ceph osd crush add-bucket ssd root

# Create a host bucket for each node
[root@node1 ~]# ceph osd crush add-bucket node1-ssd host
[root@node1 ~]# ceph osd crush add-bucket node2-ssd host
[root@node1 ~]# ceph osd crush add-bucket node3-ssd host

# Move the host buckets under the ssd root bucket
[root@node1 ~]# ceph osd crush move node1-ssd root=ssd
[root@node1 ~]# ceph osd crush move node2-ssd root=ssd
[root@node1 ~]# ceph osd crush move node3-ssd root=ssd

[root@node1 ~]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF
 -9             0 root ssd
-10             0     host node1-ssd
-11             0     host node2-ssd
-12             0     host node3-ssd
 -1       0.14635 root default
 -3       0.04878     host node1
  0   hdd 0.01949         osd.0          up  1.00000 1.00000
  3   hdd 0.02930         osd.3          up  1.00000 1.00000
 -5       0.04878     host node2
  1   hdd 0.01949         osd.1          up  1.00000 1.00000
  4   hdd 0.02930         osd.4          up  1.00000 1.00000
 -7       0.04878     host node3
  2   hdd 0.01949         osd.2          up  1.00000 1.00000
  5   hdd 0.02930         osd.5          up  1.00000 1.00000

# Move existing OSDs into the newly created CRUSH hierarchy
[root@node1 ~]# ceph osd crush move osd.3 host=node1-ssd root=ssd
[root@node1 ~]# ceph osd crush move osd.4 host=node2-ssd root=ssd
[root@node1 ~]# ceph osd crush move osd.5 host=node3-ssd root=ssd
[root@node1 ~]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF
 -9       0.08789 root ssd
-10       0.02930     host node1-ssd
  3   hdd 0.02930         osd.3          up  1.00000 1.00000
-11       0.02930     host node2-ssd
  4   hdd 0.02930         osd.4          up  1.00000 1.00000
-12       0.02930     host node3-ssd
  5   hdd 0.02930         osd.5          up  1.00000 1.00000
 -1       0.05846 root default
 -3       0.01949     host node1
  0   hdd 0.01949         osd.0          up  1.00000 1.00000
 -5       0.01949     host node2
  1   hdd 0.01949         osd.1          up  1.00000 1.00000
 -7       0.01949     host node3
  2   hdd 0.01949         osd.2          up  1.00000 1.00000

# Create a CRUSH rule
[root@node1 ~]# ceph osd crush rule create-replicated
Invalid command: missing required parameter name(<string(goodchars [A-Za-z0-9-_.])>)
osd crush rule create-replicated <name> <root> <type> {<class>} :  create crush rule <name> for replicated pool to start from <root>, replicate across buckets of type <type>, use devices of type <class> (ssd or hdd)
Error EINVAL: invalid command
[root@node1 ~]# ceph osd crush rule create-replicated ssd-demo ssd host hdd
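
Note that the <class> argument must match the device class the OSDs actually report; in this cluster the OSDs under the ssd root are still classified as hdd (see the CLASS column in ceph osd tree), which is why the rule uses hdd even though the root is named ssd. If those devices really were SSDs, their class could be inspected and re-labelled, for example (osd.3 shown for illustration):

ceph osd crush class ls
ceph osd crush rm-device-class osd.3
ceph osd crush set-device-class ssd osd.3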

# View the created CRUSH rules
[root@node1 ~]# ceph osd crush rule dump
[root@node1 ~]# ceph osd crush rule ls
replicated_rule
ssd-demo
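
The next step assumes a replicated pool named pool_demo already exists; if it does not, it can be created first (the PG count of 64 is only an example value):

ceph osd pool create pool_demo 64 64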

# Change the pool's CRUSH rule
[root@node1 ~]# ceph osd pool set pool_demo crush_rule ssd-demo
set pool 9 crush_rule to ssd-demo
[root@node1 ~]# ceph osd pool get pool_demo crush_rule
crush_rule: ssd-demo

# Verify that the placement matches the configured rule
[root@node1 ~]# rbd create pool_demo/demo.img --size 5G
[root@node1 ~]# rbd -p pool_demo ls
demo.img
[root@node1 ~]# ceph osd map pool_demo demo.img
osdmap e183 pool 'pool_demo' (9) object 'demo.img' -> pg 9.c1a6751d (9.d) -> up ([3,4,5], p3) acting ([3,4,5], p3)
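
The up/acting set [3,4,5] matches the three OSDs placed under the ssd root. To double-check every placement group in the pool rather than a single object, the pool's PGs and their acting sets can be listed:

ceph pg ls-by-pool pool_demo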

Notes

  • When editing the CRUSH map by decompiling the binary map, keep the original .bin file so that you can roll back easily if something goes wrong (see the getcrushmap/crushtool workflow at the end of this section).
  • Try to finalize the CRUSH design when the cluster is first built; changing it after data has been written triggers large amounts of data migration and hurts performance.
  • After manual changes, restarting ceph-osd moves the OSDs back to their default locations and the custom CRUSH placement is lost, unless the parameter osd crush update on start = false is set:
[root@node1 my-cluster]# cat ceph.conf   # note the [osd] section
[global]
fsid = 3f5560c6-3af3-4983-89ec-924e8eaa9e06
public_network = 192.168.6.0/24
cluster_network = 172.16.79.0/16
mon_initial_members = node1
mon_host = 192.168.6.160
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
mon_allow_pool_delete = true

[client.rgw.node1]
rgw_frontends = "civetweb port=80"

[osd]
osd crush update on start = false

[root@node1 my-cluster]# ceph-deploy --overwrite-conf config push node1 node2 node3
[root@node1 my-cluster]# systemctl restart ceph-osd.target
[root@node1 my-cluster]# ssh node2 systemctl restart ceph-osd.target
[root@node1 my-cluster]# ssh node3 systemctl restart ceph-osd.target
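
For reference, a typical offline-edit workflow for the CRUSH map (the one mentioned in the first note above) looks like the following; the file names are arbitrary, and keeping the original crushmap.bin makes it easy to roll back:

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt as needed, then recompile and inject it
crushtool -c crushmap.txt -o crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin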