
DAY.1 - Ceph Components, Data Read/Write Flow, Cluster Deployment, and RBD Usage

I. Ceph Components

  1. OSD (Object Storage Daemon)

  Role: Ceph OSDs (object storage daemons, ceph-osd) provide the data storage; each data disk in the system is handled by one OSD daemon. OSDs take care of data replication, recovery and rebalancing in the cluster, and provide monitoring information to the Ceph monitors and managers by checking the heartbeats of other OSD daemons. At least 3 OSDs are required for redundancy and high availability.

  2. Mon (Monitor): the Ceph monitor

  Role: a daemon running on a host that maintains maps of the cluster state, e.g. how many pools the cluster has, how many PGs each pool contains, and the pool-to-PG mapping. A cluster needs at least one Mon (deployed in odd numbers: 1, 3, 5, 7, ...). The critical cluster state that Ceph daemons coordinate with each other includes the monitor map, the manager map, the OSD map, the MDS map and the CRUSH map.

  3. Mgr (Manager)

  Role: a daemon running on a host. The Ceph Manager daemon keeps track of runtime metrics and the current state of the cluster, including storage utilization, performance metrics and system load. It also hosts Python-based modules that manage and expose cluster information, including a web-based Ceph dashboard and a REST API. At least two managers are needed for high availability.

II. Ceph Data Read/Write Flow

  • Compute the file-to-object mapping to obtain the oid (object id) = ino + ono:
    • ino: inode number (INO), the serial number of the file's metadata, i.e. the file's unique ID
    • ono: object number (ONO), the sequence number of one object produced by splitting the file; the default object size is 4M
  • Hash the object to a PG in the corresponding pool:

   Consistent hashing maps the object to a PG; the Object --> PG mapping is hash(oid) & mask --> pgid

  • Use CRUSH to map the PG to its OSDs

   The CRUSH algorithm computes the PG-to-OSD mapping: PG --> OSD, i.e. CRUSH(pgid) -> (osd1, osd2, osd3)

  • The primary OSD of the PG writes the object to its disk
  • The primary OSD replicates the data to the secondary OSDs and waits for their acknowledgements
  • The primary OSD reports the completed write back to the client.
  Notes:

  Pool: a storage pool (partition); the size of a pool is bounded by the underlying storage capacity.

  PG (placement group): a pool contains multiple PGs. Pool and PG are both abstract, logical concepts; the number of PGs in a pool can be derived from a formula.

  OSD (Object Storage Daemon): every data disk is one OSD; a host runs one or more OSDs.

  After the ceph cluster is deployed, a pool must be created before data can be written to ceph. Before a file is stored it goes through the consistent-hash computation, which places it in a specific PG; the file therefore always belongs to exactly one PG of one pool, and is then stored on OSDs through that PG. A data object is written to the primary OSD first and then replicated to the secondary OSDs to achieve high availability.
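  Once the cluster built in the next section is running, this mapping can be observed directly; a minimal sketch using commands that appear later in this post (the pool mypool and object msg1 are the names used there):

rados put msg1 /etc/hosts --pool=mypool    #store a small test object named msg1
ceph osd map mypool msg1                   #prints the PG and the acting OSD set, e.g. pg 2.c833d430 (2.10) -> up ([15,13,0], p15)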

III. Deploying the Ceph Cluster

Server role    OS version     IP (public / cluster)          Spec and disks
ceph-deploy    Ubuntu 18.04   10.0.0.100 / 192.168.0.100     2C2G / 120G
ceph-mon1      Ubuntu 18.04   10.0.0.101 / 192.168.0.101     2C2G / 120G
ceph-mon2      Ubuntu 18.04   10.0.0.102 / 192.168.0.102     2C2G / 120G
ceph-mon3      Ubuntu 18.04   10.0.0.103 / 192.168.0.103     2C2G / 120G
ceph-mgr1      Ubuntu 18.04   10.0.0.104 / 192.168.0.104     2C2G / 120G
ceph-mgr2      Ubuntu 18.04   10.0.0.105 / 192.168.0.105     2C2G / 120G
ceph-node1     Ubuntu 18.04   10.0.0.106 / 192.168.0.106     2C2G / 120G + 5 x 100G
ceph-node2     Ubuntu 18.04   10.0.0.107 / 192.168.0.107     2C2G / 120G + 5 x 100G
ceph-node3     Ubuntu 18.04   10.0.0.108 / 192.168.0.108     2C2G / 120G + 5 x 100G
ceph-node4     Ubuntu 18.04   10.0.0.109 / 192.168.0.109     2C2G / 120G + 5 x 100G

Environment overview:

  1. One server is used to deploy the ceph cluster, i.e. it runs ceph-deploy; it can also be co-located with ceph-mgr and other roles.

10.0.0.100/192.168.0.100

  2. Three servers act as the cluster Mon (monitor) servers; each of them can reach the cluster network.

10.0.0.101/192.168.0.101
10.0.0.102/192.168.0.102
10.0.0.103/192.168.0.103

  3. Two ceph-mgr manager servers, which can reach the cluster network.

10.0.0.104/192.168.0.104
10.0.0.105/192.168.0.105

  4. Four servers act as the OSD storage servers. Each of them has two networks: the public network faces the clients, and the cluster network is used for cluster management and data replication. Each node has 3 or more data disks (5 here).

10.0.0.106/192.168.0.106
10.0.0.107/192.168.0.107
10.0.0.108/192.168.0.108
10.0.0.109/192.168.0.109

#Disk layout on each storage server:
/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf    #100G each

  5. Create a regular user that can run privileged commands via sudo, and configure hostname resolution; each host must get a distinct hostname during the deployment. On CentOS systems the firewall and SELinux also need to be disabled.

  • Ubuntu Server base configuration

  1. Change the hostname

# cat /etc/hostname 
Ubuntu1804
# hostnamectl set-hostname ceph-deploy.example.local  #hostnamectl set-hostname <new-hostname>
# cat /etc/hostname 
ceph-deploy.example.local

  2. Rename the network interfaces to eth*

  Method 1: pass the kernel parameters net.ifnames=0 biosdevname=0 in the installer when installing the Ubuntu system.

  

  Method 2: if the kernel parameters were not passed at install time, the interface names can still be changed to eth* afterwards (a reboot of the Ubuntu system is required), as sketched below:
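  A minimal sketch of the usual GRUB-based approach, assuming the stock Ubuntu defaults (the original screenshot for this step is missing; remember to adjust the netplan configuration to the new interface names afterwards):

$ sudo sed -i 's/^GRUB_CMDLINE_LINUX="/GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0 /' /etc/default/grub    #add the parameters to the kernel command line
$ sudo update-grub    #regenerate the grub configuration
$ sudo reboot         #interfaces come back as eth0, eth1, ...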

  

  3. Allow root login over SSH

  By default Ubuntu does not allow the root user to log in via ssh; set a root password and edit /etc/ssh/sshd_config:

$ sudo vim /etc/ssh/sshd_config
#PermitRootLogin prohibit-password
PermitRootLogin yes         #allow root login
#UseDNS no
UseDNS no                   #disable reverse DNS lookups
$ sudo su - root            #switch to the root user
# passwd                    #set the root password
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully
# systemctl restart sshd    #restart the ssh service

  4. Configure the network interfaces on each node, for example on ceph-deploy:

root@ceph-deploy:~# cat /etc/netplan/01-netcfg.yaml
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      dhcp6: no
      addresses: [10.0.0.100/24]
      gateway4: 10.0.0.2
      nameservers:
              addresses: [10.0.0.2, 114.114.114.114, 8.8.8.8]
    eth1:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.0.100/24]
root@ceph-deploy:~# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.100  netmask 255.255.255.0  broadcast 10.0.0.255
        inet6 fe80::20c:29ff:fe65:a300  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:65:a3:00  txqueuelen 1000  (Ethernet)
        RX packets 2057  bytes 172838 (172.8 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1575  bytes 221983 (221.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.100  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 fe80::20c:29ff:fe65:a30a  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:65:a3:0a  txqueuelen 1000  (Ethernet)
        RX packets 2  bytes 486 (486.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 14  bytes 1076 (1.0 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 182  bytes 14992 (14.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 182  bytes 14992 (14.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@ceph-deploy:~# ping -c 1 -i 1 www.baidu.com 
64 bytes from 220.181.38.149 (220.181.38.149): icmp_seq=1 ttl=128 time=6.67 ms
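
  After editing the netplan file, apply it so the addresses take effect (this step was not captured in the output above):

root@ceph-deploy:~# netplan apply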

  5. Configure the apt repositories

  https://mirrors.aliyun.com/ceph/  #Aliyun mirror

  http://mirrors.163.com/ceph/   #163 (NetEase) mirror

  https://mirrors.tuna.tsinghua.edu.cn/ceph/    #Tsinghua University (TUNA) mirror

$ wget -q -O- 'https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc' | sudo apt-key add -    #import the repository key
OK
$ echo "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main" | sudo tee -a /etc/apt/sources.list
$ cat /etc/apt/sources.list
# Source-code mirrors are commented out by default to speed up apt update; uncomment them if needed
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main     #the line appended by the echo above

$ sudo apt update
  • Deploying the RADOS cluster

  1. Create the cephadmin user:

  It is recommended to deploy and run the ceph cluster as a dedicated regular user; the user only needs to be able to run privileged commands interactively through sudo. Newer versions of ceph-deploy accept any user that can run sudo, including root, but a regular user such as cephuser or cephadmin is still recommended for managing the cluster.

  Create the cephadmin user on the storage nodes, the Mon nodes, the mgr nodes and the ceph-deploy node.

groupadd -r -g 2022 cephadmin && useradd -r -m -s /bin/bash -u 2022 -g 2022 cephadmin && echo cephadmin:123.com | chpasswd

  Allow the cephadmin user to run privileged commands via sudo on every server:

echo "cephadmin ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
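
  With root SSH enabled as in the base configuration above, the two commands can be pushed to all nodes from ceph-deploy in one pass; a sketch (the host list is assumed to match the table above):

for ip in 10.0.0.{100..109}; do
    ssh root@$ip "groupadd -r -g 2022 cephadmin && useradd -r -m -s /bin/bash -u 2022 -g 2022 cephadmin && echo cephadmin:123.com | chpasswd && echo 'cephadmin ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers"
done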

  2. Configure passwordless SSH login:

  Configure the ceph-deploy node so it can log in to each ceph node/mon/mgr node non-interactively, i.e. generate a key pair on the ceph-deploy node and then distribute the public key to every managed node:

cephadmin@ceph-deploy:~$ ssh-keygen         #generate an ssh key pair
Generating public/private rsa key pair.
Enter file in which to save the key (/home/cephadmin/.ssh/id_rsa): 
Created directory '/home/cephadmin/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/cephadmin/.ssh/id_rsa.
Your public key has been saved in /home/cephadmin/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:0+vL5tnFkcEzFiGCmKTzR7G58KHrbUB9qBiaqtYsSi4 cephadmin@ceph-deploy
The key's randomart image is:
+---[RSA 2048]----+
|     ..o... . o. |
|     .o .+ . o . |
|    o ..=.    *  |
|    .o.=o+.  . = |
|   o +o.S..   o  |
|  o . oo . . . . |
| oo   ..  .   o  |
|Eo o . ..o.o .   |
|B..   ...o*..    |
+----[SHA256]-----+
cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.100    #distribute the public key to every managed node (including itself)
cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.101
cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.102
cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.103
cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.104
cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.105
cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.106
cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.107
cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.108
cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.109
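
  The same distribution can be written as a loop (the cephadmin password is prompted once per host):

for ip in 10.0.0.{100..109}; do ssh-copy-id cephadmin@$ip; done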

  3. Configure name resolution on each node:

# cat >> /etc/hosts << EOF
10.0.0.100 ceph-deploy
10.0.0.101 ceph-mon1
10.0.0.102 ceph-mon2
10.0.0.103 ceph-mon3
10.0.0.104 ceph-mgr1
10.0.0.105 ceph-mgr2
10.0.0.106 ceph-node1
10.0.0.107 ceph-node2
10.0.0.108 ceph-node3
10.0.0.109 ceph-node4
EOF
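
  The same entries are needed on every node; a sketch to push the file from ceph-deploy, assuming root SSH is still enabled:

for ip in 10.0.0.{101..109}; do scp /etc/hosts root@$ip:/etc/hosts; done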

  4. Install Python 2 on each node:

# apt -y install python2.7              #install Python 2.7
# ln -sv /usr/bin/python2.7 /usr/bin/python2    #create the python2 symlink expected by ceph-deploy

  5. Install the ceph deployment tool

  Install the ceph-deploy tool on the deployment server:

cephadmin@ceph-deploy:~$ apt-cache madison ceph-deploy 
ceph-deploy |      2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic/main amd64 Packages
ceph-deploy |      2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic/main i386 Packages
ceph-deploy | 1.5.38-0ubuntu1 | https://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe amd64 Packages
ceph-deploy | 1.5.38-0ubuntu1 | https://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe i386 Packages
cephadmin@ceph-deploy:~$ sudo apt -y install ceph-deploy

  6. Initialize the Mon node

  Initialize the Mon node from the ceph-deploy admin node. The Mon node also needs to be on the cluster network, otherwise the initialization reports an error:

cephadmin@ceph-deploy:~$ mkdir ceph-cluster
cephadmin@ceph-deploy:~$ cd ceph-cluster/
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy new --cluster-network 192.168.0.0/24 --public-network 10.0.0.0/24 ceph-mon1

  Verify the initialization:

cephadmin@ceph-deploy:~/ceph-cluster$ ll
total 20
drwxrwxr-x 2 cephadmin cephadmin 4096 Aug 18 15:26 ./
drwxr-xr-x 6 cephadmin cephadmin 4096 Aug 18 15:20 ../
-rw-rw-r-- 1 cephadmin cephadmin  259 Aug 18 15:26 ceph.conf                   #auto-generated configuration file    
-rw-rw-r-- 1 cephadmin cephadmin 3892 Aug 18 15:26 ceph-deploy-ceph.log        #initialization log
-rw------- 1 cephadmin cephadmin   73 Aug 18 15:26 ceph.mon.keyring            #keyring used for authentication between the Mon nodes
cephadmin@ceph-deploy:~/ceph-cluster$ cat ceph.conf 
[global]
fsid = 0d11d338-a480-40da-8520-830423b22c3e                                    #the ceph cluster ID
public_network = 10.0.0.0/24
cluster_network = 192.168.0.0/24
mon_initial_members = ceph-mon1                                                #multiple mon nodes can be listed, separated by commas
mon_host = 10.0.0.101
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
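
  For reference only: if all three monitors were declared at bootstrap time, those two lines would look like the following; in this post the extra monitors are added later (step 12) instead.

mon_initial_members = ceph-mon1, ceph-mon2, ceph-mon3
mon_host = 10.0.0.101,10.0.0.102,10.0.0.103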

  Configure the Mon node and generate the synchronization keys

  Install the ceph-mon package on each Mon node and then initialize the Mon node from ceph-deploy; Mon nodes can be scaled out later:

root@ceph-mon1:~# apt -y install ceph-mon

cephadmin@ceph-deploy:~/ceph-cluster$ pwd
/home/cephadmin/ceph-cluster
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy mon create-initial
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create-initial
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f0903be4fa0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mon at 0x7f0903bc8ad0>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  keyrings                      : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-mon1
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-mon1 ...
[ceph-mon1][DEBUG ] connection detected need for sudo
[ceph-mon1][DEBUG ] connected to host: ceph-mon1 
[ceph-mon1][DEBUG ] detect platform information from remote host
[ceph-mon1][DEBUG ] detect machine type
[ceph-mon1][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 18.04 bionic
[ceph-mon1][DEBUG ] determining if provided host has same hostname in remote
[ceph-mon1][DEBUG ] get remote short hostname
[ceph-mon1][DEBUG ] deploying mon to ceph-mon1
[ceph-mon1][DEBUG ] get remote short hostname
[ceph-mon1][DEBUG ] remote hostname: ceph-mon1
[ceph-mon1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-mon1][DEBUG ] create the mon path if it does not exist
[ceph-mon1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-mon1/done
[ceph-mon1][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-mon1/done
[ceph-mon1][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring
[ceph-mon1][DEBUG ] create the monitor keyring file
[ceph-mon1][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-mon1 --keyring /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring --setuser 64045 --setgroup 64045
[ceph-mon1][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring
[ceph-mon1][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-mon1][DEBUG ] create the init path if it does not exist
[ceph-mon1][INFO  ] Running command: sudo systemctl enable ceph.target
[ceph-mon1][INFO  ] Running command: sudo systemctl enable ceph-mon@ceph-mon1
[ceph-mon1][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/[email protected] → /lib/systemd/system/ceph-mon@.service.
[ceph-mon1][INFO  ] Running command: sudo systemctl start ceph-mon@ceph-mon1
[ceph-mon1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status
[ceph-mon1][DEBUG ] ********************************************************************************
[ceph-mon1][DEBUG ] status for monitor: mon.ceph-mon1
[ceph-mon1][DEBUG ] {
[ceph-mon1][DEBUG ]   "election_epoch": 3, 
[ceph-mon1][DEBUG ]   "extra_probe_peers": [], 
[ceph-mon1][DEBUG ]   "feature_map": {
[ceph-mon1][DEBUG ]     "mon": [
[ceph-mon1][DEBUG ]       {
[ceph-mon1][DEBUG ]         "features": "0x3f01cfb9fffdffff", 
[ceph-mon1][DEBUG ]         "num": 1, 
[ceph-mon1][DEBUG ]         "release": "luminous"
[ceph-mon1][DEBUG ]       }
[ceph-mon1][DEBUG ]     ]
[ceph-mon1][DEBUG ]   }, 
[ceph-mon1][DEBUG ]   "features": {
[ceph-mon1][DEBUG ]     "quorum_con": "4540138297136906239", 
[ceph-mon1][DEBUG ]     "quorum_mon": [
[ceph-mon1][DEBUG ]       "kraken", 
[ceph-mon1][DEBUG ]       "luminous", 
[ceph-mon1][DEBUG ]       "mimic", 
[ceph-mon1][DEBUG ]       "osdmap-prune", 
[ceph-mon1][DEBUG ]       "nautilus", 
[ceph-mon1][DEBUG ]       "octopus", 
[ceph-mon1][DEBUG ]       "pacific", 
[ceph-mon1][DEBUG ]       "elector-pinging"
[ceph-mon1][DEBUG ]     ], 
[ceph-mon1][DEBUG ]     "required_con": "2449958747317026820", 
[ceph-mon1][DEBUG ]     "required_mon": [
[ceph-mon1][DEBUG ]       "kraken", 
[ceph-mon1][DEBUG ]       "luminous", 
[ceph-mon1][DEBUG ]       "mimic", 
[ceph-mon1][DEBUG ]       "osdmap-prune", 
[ceph-mon1][DEBUG ]       "nautilus", 
[ceph-mon1][DEBUG ]       "octopus", 
[ceph-mon1][DEBUG ]       "pacific", 
[ceph-mon1][DEBUG ]       "elector-pinging"
[ceph-mon1][DEBUG ]     ]
[ceph-mon1][DEBUG ]   }, 
[ceph-mon1][DEBUG ]   "monmap": {
[ceph-mon1][DEBUG ]     "created": "2021-08-18T07:55:40.349602Z", 
[ceph-mon1][DEBUG ]     "disallowed_leaders: ": "", 
[ceph-mon1][DEBUG ]     "election_strategy": 1, 
[ceph-mon1][DEBUG ]     "epoch": 1, 
[ceph-mon1][DEBUG ]     "features": {
[ceph-mon1][DEBUG ]       "optional": [], 
[ceph-mon1][DEBUG ]       "persistent": [
[ceph-mon1][DEBUG ]         "kraken", 
[ceph-mon1][DEBUG ]         "luminous", 
[ceph-mon1][DEBUG ]         "mimic", 
[ceph-mon1][DEBUG ]         "osdmap-prune", 
[ceph-mon1][DEBUG ]         "nautilus", 
[ceph-mon1][DEBUG ]         "octopus", 
[ceph-mon1][DEBUG ]         "pacific", 
[ceph-mon1][DEBUG ]         "elector-pinging"
[ceph-mon1][DEBUG ]       ]
[ceph-mon1][DEBUG ]     }, 
[ceph-mon1][DEBUG ]     "fsid": "0d11d338-a480-40da-8520-830423b22c3e", 
[ceph-mon1][DEBUG ]     "min_mon_release": 16, 
[ceph-mon1][DEBUG ]     "min_mon_release_name": "pacific", 
[ceph-mon1][DEBUG ]     "modified": "2021-08-18T07:55:40.349602Z", 
[ceph-mon1][DEBUG ]     "mons": [
[ceph-mon1][DEBUG ]       {
[ceph-mon1][DEBUG ]         "addr": "10.0.0.101:6789/0", 
[ceph-mon1][DEBUG ]         "crush_location": "{}", 
[ceph-mon1][DEBUG ]         "name": "ceph-mon1", 
[ceph-mon1][DEBUG ]         "priority": 0, 
[ceph-mon1][DEBUG ]         "public_addr": "10.0.0.101:6789/0", 
[ceph-mon1][DEBUG ]         "public_addrs": {
[ceph-mon1][DEBUG ]           "addrvec": [
[ceph-mon1][DEBUG ]             {
[ceph-mon1][DEBUG ]               "addr": "10.0.0.101:3300", 
[ceph-mon1][DEBUG ]               "nonce": 0, 
[ceph-mon1][DEBUG ]               "type": "v2"
[ceph-mon1][DEBUG ]             }, 
[ceph-mon1][DEBUG ]             {
[ceph-mon1][DEBUG ]               "addr": "10.0.0.101:6789", 
[ceph-mon1][DEBUG ]               "nonce": 0, 
[ceph-mon1][DEBUG ]               "type": "v1"
[ceph-mon1][DEBUG ]             }
[ceph-mon1][DEBUG ]           ]
[ceph-mon1][DEBUG ]         }, 
[ceph-mon1][DEBUG ]         "rank": 0, 
[ceph-mon1][DEBUG ]         "weight": 0
[ceph-mon1][DEBUG ]       }
[ceph-mon1][DEBUG ]     ], 
[ceph-mon1][DEBUG ]     "stretch_mode": false
[ceph-mon1][DEBUG ]   }, 
[ceph-mon1][DEBUG ]   "name": "ceph-mon1", 
[ceph-mon1][DEBUG ]   "outside_quorum": [], 
[ceph-mon1][DEBUG ]   "quorum": [
[ceph-mon1][DEBUG ]     0
[ceph-mon1][DEBUG ]   ], 
[ceph-mon1][DEBUG ]   "quorum_age": 1, 
[ceph-mon1][DEBUG ]   "rank": 0, 
[ceph-mon1][DEBUG ]   "state": "leader", 
[ceph-mon1][DEBUG ]   "stretch_mode": false, 
[ceph-mon1][DEBUG ]   "sync_provider": []
[ceph-mon1][DEBUG ] }
[ceph-mon1][DEBUG ] ********************************************************************************
[ceph-mon1][INFO  ] monitor: mon.ceph-mon1 is running
[ceph-mon1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status
[ceph_deploy.mon][INFO  ] processing monitor mon.ceph-mon1
[ceph-mon1][DEBUG ] connection detected need for sudo
[ceph-mon1][DEBUG ] connected to host: ceph-mon1 
[ceph-mon1][DEBUG ] detect platform information from remote host
[ceph-mon1][DEBUG ] detect machine type
[ceph-mon1][DEBUG ] find the location of an executable
[ceph-mon1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status
[ceph_deploy.mon][INFO  ] mon.ceph-mon1 monitor has reached quorum!
[ceph_deploy.mon][INFO  ] all initial monitors are running and have formed quorum
[ceph_deploy.mon][INFO  ] Running gatherkeys...
[ceph_deploy.gatherkeys][INFO  ] Storing keys in temp directory /tmp/tmpqCeuN6
[ceph-mon1][DEBUG ] connection detected need for sudo
[ceph-mon1][DEBUG ] connected to host: ceph-mon1 
[ceph-mon1][DEBUG ] detect platform information from remote host
[ceph-mon1][DEBUG ] detect machine type
[ceph-mon1][DEBUG ] get remote short hostname
[ceph-mon1][DEBUG ] fetch remote file
[ceph-mon1][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph-mon1.asok mon_status
[ceph-mon1][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.admin
[ceph-mon1][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-mds
[ceph-mon1][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-mgr
[ceph-mon1][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-osd
[ceph-mon1][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-rgw
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.client.admin.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-mgr.keyring
[ceph_deploy.gatherkeys][INFO  ] keyring 'ceph.mon.keyring' already exists
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-rgw.keyring
[ceph_deploy.gatherkeys][INFO  ] Destroy temp directory /tmp/tmpqCeuN6

  Verify the Mon node

  Verify that the ceph-mon service has been installed and started on the mon node. The initialization also leaves a set of bootstrap keyring files for the mds/mgr/osd/rgw services in the working directory on the ceph-deploy node; these files carry the highest privileges on the cluster and must be kept safe.

root@ceph-mon1:~# ps -ef | grep ceph-mon
ceph       6688      1  0 15:55 ?        00:00:00 /usr/bin/ceph-mon -f --cluster ceph --id ceph-mon1 --setuser ceph --setgroup ceph
root       7252   2514  0 16:00 pts/0    00:00:00 grep --color=auto ceph-mon

  7. Configure the manager node

    Deploy the ceph-mgr node:

   The mgr node needs to read the ceph configuration, i.e. the files in the /etc/ceph directory.

#Initialize the ceph-mgr node:
root@ceph-mgr1:~# apt -y install ceph-mgr

cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mgr create ceph-mgr1
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy mgr create ceph-mgr1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  mgr                           : [('ceph-mgr1', 'ceph-mgr1')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f8d17024c30>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mgr at 0x7f8d17484150>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-mgr1:ceph-mgr1
The authenticity of host 'ceph-mgr1 (10.0.0.104)' can't be established.
ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-mgr1' (ECDSA) to the list of known hosts.
[ceph-mgr1][DEBUG ] connection detected need for sudo
[ceph-mgr1][DEBUG ] connected to host: ceph-mgr1 
[ceph-mgr1][DEBUG ] detect platform information from remote host
[ceph-mgr1][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO  ] Distro info: Ubuntu 18.04 bionic
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-mgr1
[ceph-mgr1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-mgr1][WARNIN] mgr keyring does not exist yet, creating one
[ceph-mgr1][DEBUG ] create a keyring file
[ceph-mgr1][DEBUG ] create path recursively if it doesn't exist
[ceph-mgr1][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-mgr1 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-mgr1/keyring
[ceph-mgr1][INFO  ] Running command: sudo systemctl enable ceph-mgr@ceph-mgr1
[ceph-mgr1][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/[email protected] → /lib/systemd/system/ceph-mgr@.service.
[ceph-mgr1][INFO  ] Running command: sudo systemctl start ceph-mgr@ceph-mgr1
[ceph-mgr1][INFO  ] Running command: sudo systemctl enable ceph.target

  Verify the ceph-mgr node:

root@ceph-mgr1:~# ps -ef | grep ceph-mgr
ceph       8128      1  8 17:09 ?        00:00:03 /usr/bin/ceph-mgr -f --cluster ceph --id ceph-mgr1 --setuser ceph --setgroup ceph
root       8326   2396  0 17:10 pts/0    00:00:00 grep --color=auto ceph-mgr

  8. Distribute the admin key:

  Copy the configuration file and the admin keyring from the ceph-deploy node to every node that needs to run ceph management commands, so that later ceph commands do not have to specify the mon address and the ceph.client.admin.keyring file every time. The ceph-mon nodes also need the cluster configuration and authentication files synced to them.

  Manage the cluster from the ceph-deploy node:

root@ceph-deploy:~# apt -y install ceph-common        #install the ceph common components; installing ceph-common requires root
root@cepn-node1:~# apt -y install ceph-common
root@cepn-node2:~# apt -y install ceph-common
root@cepn-node3:~# apt -y install ceph-common
root@cepn-node4:~# apt -y install ceph-common


cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy admin ceph-deploy ceph-node1 ceph-node2 ceph-node3 ceph-node4    #distribute the admin key
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy admin ceph-deploy
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f4ba41a4190>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  client                        : ['ceph-deploy']
[ceph_deploy.cli][INFO  ]  func                          : <function admin at 0x7f4ba4aa5a50>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-deploy
[ceph-deploy][DEBUG ] connection detected need for sudo
[ceph-deploy][DEBUG ] connected to host: ceph-deploy 
[ceph-deploy][DEBUG ] detect platform information from remote host
[ceph-deploy][DEBUG ] detect machine type
[ceph-deploy][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf

[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy admin ceph-node1 ceph-node2 ceph-node3 ceph-node4
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fc78eac3190>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  client                        : ['ceph-node1', 'ceph-node2', 'ceph-node3', 'ceph-node4']
[ceph_deploy.cli][INFO  ]  func                          : <function admin at 0x7fc78f3c4a50>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node1
The authenticity of host 'ceph-node1 (10.0.0.106)' can't be established.
ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-node1' (ECDSA) to the list of known hosts.
[ceph-node1][DEBUG ] connection detected need for sudo
[ceph-node1][DEBUG ] connected to host: ceph-node1 
[ceph-node1][DEBUG ] detect platform information from remote host
[ceph-node1][DEBUG ] detect machine type
[ceph-node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node2
The authenticity of host 'ceph-node2 (10.0.0.107)' can't be established.
ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-node2' (ECDSA) to the list of known hosts.
[ceph-node2][DEBUG ] connection detected need for sudo
[ceph-node2][DEBUG ] connected to host: ceph-node2 
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node3
The authenticity of host 'ceph-node3 (10.0.0.108)' can't be established.
ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-node3' (ECDSA) to the list of known hosts.
[ceph-node3][DEBUG ] connection detected need for sudo
[ceph-node3][DEBUG ] connected to host: ceph-node3 
[ceph-node3][DEBUG ] detect platform information from remote host
[ceph-node3][DEBUG ] detect machine type
[ceph-node3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node4
The authenticity of host 'ceph-node4 (10.0.0.109)' can't be established.
ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-node4' (ECDSA) to the list of known hosts.
[ceph-node4][DEBUG ] connection detected need for sudo
[ceph-node4][DEBUG ] connected to host: ceph-node4 
[ceph-node4][DEBUG ] detect platform information from remote host
[ceph-node4][DEBUG ] detect machine type
[ceph-node4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf

  Verify the key on the ceph nodes:

  Check the key files on the ceph-node nodes:

root@cepn-node1:~# ll /etc/ceph/
total 20
drwxr-xr-x  2 root root 4096 Aug 18 16:38 ./
drwxr-xr-x 91 root root 4096 Aug 18 16:27 ../
-rw-------  1 root root  151 Aug 18 16:38 ceph.client.admin.keyring
-rw-r--r--  1 root root  259 Aug 18 16:38 ceph.conf
-rw-r--r--  1 root root   92 Jul  8 22:17 rbdmap
-rw-------  1 root root    0 Aug 18 16:38 tmp4MDGPp

  For security, the owner and group of the keyring file default to root; to let the cephadmin user run ceph commands as well, grant it access:

cephadmin@ceph-deploy:~/ceph-cluster$ sudo setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
root@cepn-node1:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
root@cepn-node2:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
root@cepn-node3:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
root@cepn-node4:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring

  Test the ceph command:

cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     0d11d338-a480-40da-8520-830423b22c3e
    health: HEALTH_WARN
            mon is allowing insecure global_id reclaim    #insecure global_id reclaim needs to be disabled
            OSD count 0 < osd_pool_default_size 3         #the cluster has fewer OSDs than osd_pool_default_size (3)
 
  services:
    mon: 1 daemons, quorum ceph-mon1 (age 98m)
    mgr: ceph-mgr1(active, since 24m)
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     
 
cephadmin@ceph-deploy:~/ceph-cluster$ ceph config set mon auth_allow_insecure_global_id_reclaim false #disable insecure global_id reclaim
cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     0d11d338-a480-40da-8520-830423b22c3e
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3
 
  services:
    mon: 1 daemons, quorum ceph-mon1 (age 100m)
    mgr: ceph-mgr1(active, since 26m)
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     
 
cephadmin@ceph-deploy:~/ceph-cluster$

  9. Initialize the node servers

  Before adding OSDs, install the basic environment on the node servers:

cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy install --no-adjust-repos --nogpgcheck ceph-node1 ceph-node2 ceph-node3 ceph-node4
# --no-adjust-repos    install packages without modifying source repos
# --nogpgcheck         install packages without gpgcheck

  Wipe the disks

cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy disk list ceph-node1 ceph-node2 ceph-node3 ceph-node4 #list the data disks of the remote storage nodes

  Use ceph-deploy disk zap to wipe the data disks of each ceph node:

cephadmin@ceph-deploy:~/ceph-cluster$ cat EraseDisk.sh 
#!/bin/bash
#

for i in {1..4}; do
    for d in {b..f}; do
        ceph-deploy disk zap ceph-node$i /dev/sd$d
    done
done
cephadmin@ceph-deploy:~/ceph-cluster$ bash -n EraseDisk.sh 
cephadmin@ceph-deploy:~/ceph-cluster$ bash EraseDisk.sh

  Add the hosts' disks as OSDs:

  The data on each OSD is stored in separate components:

  • Data: the object data stored by ceph
  • block-db: the RocksDB database, i.e. the metadata
  • block-wal: the database's write-ahead log (see the sketch below)
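
  With only one kind of device per node, as in this lab, a single --data device per OSD is enough. When faster devices are available, the metadata components can be split out; a hedged example of the full form (the NVMe partitions are hypothetical, flags as provided by ceph-deploy 2.0.1):

ceph-deploy osd create ceph-node1 --data /dev/sdb --block-db /dev/nvme0n1p1 --block-wal /dev/nvme0n1p2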

  Add the OSDs (OSD IDs are assigned sequentially starting from 0):

cephadmin@ceph-deploy:~/ceph-cluster$ cat CreateDisk.sh 
#!/bin/bash
#
for i in {1..4}; do
    for d in {b..f}; do
        ceph-deploy osd create ceph-node$i --data /dev/sd$d
    done
done
cephadmin@ceph-deploy:~/ceph-cluster$ bash -n CreateDisk.sh 
cephadmin@ceph-deploy:~/ceph-cluster$ bash CreateDisk.sh
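
  Each storage node should now run one ceph-osd daemon per data disk (five here); this can be spot-checked on any node, for example (output omitted):

root@ceph-node1:~# ps -ef | grep ceph-osd          #one process per OSD
root@ceph-node1:~# systemctl status ceph-osd@0     #status of a single OSD service (osd.0 is assumed to live on this node)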

  10. Verify the ceph cluster:

cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     0d11d338-a480-40da-8520-830423b22c3e
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum ceph-mon1 (age 2h)
    mgr: ceph-mgr1(active, since 88m)
    osd: 20 osds: 20 up (since 55s), 20 in (since 63s)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   150 MiB used, 2.0 TiB / 2.0 TiB avail
    pgs:     1 active+clean

  11. Test uploading and downloading data:

  To access data, a client must first connect to a pool in the RADOS cluster; the object is then located by the relevant CRUSH rules based on the object name. To exercise the cluster's data path, first create a test pool named mypool with 32 PGs.

$ ceph -h      #cluster management command
$ rados -h     #a lower-level command for working with objects in a pool directly
  Create the pool:
[ceph@ceph-deploy ceph-cluster]$ ceph osd pool create mypool 32 32                             #32 PGs and 32 PGPs
pool 'mypool' created
cephadmin@ceph-deploy:~/ceph-cluster$ ceph pg ls-by-pool mypool | awk '{print $1,$2,$15}'     #verify the PG and PGP combinations
PG OBJECTS ACTING
2.0 0 [8,10,3]p8
2.1 0 [15,0,13]p15
2.2 0 [5,1,15]p5
2.3 0 [17,5,14]p17
2.4 0 [1,12,18]p1
2.5 0 [12,4,8]p12
2.6 0 [1,13,19]p1
2.7 0 [6,17,2]p6
2.8 0 [16,13,0]p16
2.9 0 [4,9,19]p4
2.a 0 [11,4,18]p11
2.b 0 [13,7,17]p13
2.c 0 [12,0,5]p12
2.d 0 [12,19,3]p12
2.e 0 [2,13,19]p2
2.f 0 [11,17,8]p11
2.10 0 [15,13,0]p15
2.11 0 [16,6,1]p16
2.12 0 [10,3,9]p10
2.13 0 [17,6,3]p17
2.14 0 [8,13,17]p8
2.15 0 [19,1,11]p19
2.16 0 [8,12,17]p8
2.17 0 [6,14,2]p6
2.18 0 [18,9,12]p18
2.19 0 [3,6,13]p3
2.1a 0 [6,14,2]p6
2.1b 0 [11,7,17]p11
2.1c 0 [10,7,1]p10
2.1d 0 [15,10,7]p15
2.1e 0 [3,13,15]p3
2.1f 0 [4,7,14]p4
  
* NOTE: afterwards
cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd tree          #show how the OSDs map to the storage servers
ID  CLASS  WEIGHT   TYPE NAME            STATUS  REWEIGHT  PRI-AFF
-1         1.95374  root default                                  
-5         0.48843      host ceph-node2                           
 5    hdd  0.09769          osd.5            up   1.00000  1.00000
 6    hdd  0.09769          osd.6            up   1.00000  1.00000
 7    hdd  0.09769          osd.7            up   1.00000  1.00000
 8    hdd  0.09769          osd.8            up   1.00000  1.00000
 9    hdd  0.09769          osd.9            up   1.00000  1.00000
-7         0.48843      host ceph-node3                           
10    hdd  0.09769          osd.10           up   1.00000  1.00000
11    hdd  0.09769          osd.11           up   1.00000  1.00000
12    hdd  0.09769          osd.12           up   1.00000  1.00000
13    hdd  0.09769          osd.13           up   1.00000  1.00000
14    hdd  0.09769          osd.14           up   1.00000  1.00000
-9         0.48843      host ceph-node4                           
15    hdd  0.09769          osd.15           up   1.00000  1.00000
16    hdd  0.09769          osd.16           up   1.00000  1.00000
17    hdd  0.09769          osd.17           up   1.00000  1.00000
18    hdd  0.09769          osd.18           up   1.00000  1.00000
19    hdd  0.09769          osd.19           up   1.00000  1.00000
-3         0.48843      host cepn-node1                           
 0    hdd  0.09769          osd.0            up   1.00000  1.00000
 1    hdd  0.09769          osd.1            up   1.00000  1.00000
 2    hdd  0.09769          osd.2            up   1.00000  1.00000
 3    hdd  0.09769          osd.3            up   1.00000  1.00000
 4    hdd  0.09769          osd.4            up   1.00000  1.00000
cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd pool ls    #list the current pools
device_health_metrics
mypool
cephadmin@ceph-deploy:~/ceph-cluster$ rados lspools       #list the current pools
device_health_metrics
mypool

  The current environment has no block-storage or CephFS clients yet, and no object-storage clients either, but the rados command can already access ceph's object store directly:

  Upload a file:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados put msg1 /var/log/syslog --pool=mypool               #upload the file to mypool under the object ID msg1
  List the objects:
cephadmin@ceph-deploy:~/ceph-cluster$ rados ls --pool=mypool
msg1
  Object location:

  The ceph osd map command shows exactly where a data object in a pool is placed:

cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd map mypool msg1
osdmap e131 pool 'mypool' (2) object 'msg1' -> pg 2.c833d430 (2.10) -> up ([15,13,0], p15) acting ([15,13,0], p15)

  2.c833d430: the object was placed on PG c833d430 of the pool whose ID is 2

  2.10: the data is stored in PG 2.10, i.e. PG 0x10 of pool ID 2

  [15,13,0],p15: the OSD numbers; the primary OSD is 15 and the acting set is OSDs 15, 13 and 0. Three OSDs means the data is kept as three replicas; which OSDs hold a PG is computed by ceph's CRUSH algorithm.

  Download the object:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados get msg1 --pool=mypool /opt/my.txt
cephadmin@ceph-deploy:~/ceph-cluster$ ll /opt/
total 1840
drwxr-xr-x  2 root root    4096 Aug 20 15:23 ./
drwxr-xr-x 23 root root    4096 Aug 18 18:32 ../
-rw-r--r--  1 root root 1873597 Aug 20 15:23 my.txt

#Verify the downloaded file:
cephadmin@ceph-deploy:~/ceph-cluster$ head /opt/my.txt
Aug 18 18:33:40 ceph-deploy systemd-modules-load[484]: Inserted module 'iscsi_tcp'
Aug 18 18:33:40 ceph-deploy systemd-modules-load[484]: Inserted module 'ib_iser'
Aug 18 18:33:40 ceph-deploy systemd[1]: Starting Flush Journal to Persistent Storage...
Aug 18 18:33:40 ceph-deploy systemd[1]: Started Load/Save Random Seed.
Aug 18 18:33:40 ceph-deploy systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
Aug 18 18:33:40 ceph-deploy systemd[1]: Started udev Kernel Device Manager.
Aug 18 18:33:40 ceph-deploy systemd[1]: Started Set the console keyboard layout.
Aug 18 18:33:40 ceph-deploy systemd[1]: Reached target Local File Systems (Pre).
Aug 18 18:33:40 ceph-deploy systemd[1]: Reached target Local File Systems.
Aug 18 18:33:40 ceph-deploy systemd[1]: Starting Tell Plymouth To Write Out Runtime Data...
  Overwrite the object:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados put msg1 /etc/passwd --pool=mypool
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados get msg1 --pool=mypool /opt/my1.txt

#Verify the overwritten object:
cephadmin@ceph-deploy:~/ceph-cluster$ tail /opt/my1.txt
_apt:x:104:65534::/nonexistent:/usr/sbin/nologin
lxd:x:105:65534::/var/lib/lxd/:/bin/false
uuidd:x:106:110::/run/uuidd:/usr/sbin/nologin
dnsmasq:x:107:65534:dnsmasq,,,:/var/lib/misc:/usr/sbin/nologin
landscape:x:108:112::/var/lib/landscape:/usr/sbin/nologin
sshd:x:109:65534::/run/sshd:/usr/sbin/nologin
pollinate:x:110:1::/var/cache/pollinate:/bin/false
wang:x:1000:1000:wang,,,:/home/wang:/bin/bash
cephadmin:x:2022:2022::/home/cephadmin:/bin/bash
ceph:x:64045:64045:Ceph storage service:/var/lib/ceph:/usr/sbin/nologin
  Delete the object:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados rm msg1 --pool=mypool
cephadmin@ceph-deploy:~/ceph-cluster$ rados ls --pool=mypool
cephadmin@ceph-deploy:~/ceph-cluster$ 

  12. Extend the ceph cluster for high availability:

  This mainly means adding mon nodes and mgr nodes so that those services are highly available.

  Add ceph-mon nodes:

  Ceph-mon has built-in leader election for high availability; the number of monitors is normally odd.

root@ceph-mon2:~# apt -y install ceph-mon
root@ceph-mon3:~# apt -y install ceph-mon

cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mon add ceph-mon2
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mon add ceph-mon3
  Verify the ceph-mon status:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph quorum_status --format json-pretty

{
    "election_epoch": 14,
    "quorum": [
        0,
        1,
        2
    ],
    "quorum_names": [
        "ceph-mon1",
        "ceph-mon2",
        "ceph-mon3"
    ],
    "quorum_leader_name": "ceph-mon1",              #the current leader
    "quorum_age": 304,
    "features": {
        "quorum_con": "4540138297136906239",
        "quorum_mon": [
            "kraken",
            "luminous",
            "mimic",
            "osdmap-prune",
            "nautilus",
            "octopus",
            "pacific",
            "elector-pinging"
        ]
    },
    "monmap": {
        "epoch": 3,
        "fsid": "0d11d338-a480-40da-8520-830423b22c3e",
        "modified": "2021-08-20T07:39:56.803507Z",
        "created": "2021-08-18T07:55:40.349602Z",
        "min_mon_release": 16,
        "min_mon_release_name": "pacific",
        "election_strategy": 1,
        "disallowed_leaders: ": "",
        "stretch_mode": false,
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune",
                "nautilus",
                "octopus",
                "pacific",
                "elector-pinging"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,                         #rank of this node
                "name": "ceph-mon1",                   #name of this node
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.0.0.101:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.0.0.101:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.0.0.101:6789/0",                     #listening address
                "public_addr": "10.0.0.101:6789/0",            #listening address
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            },
            {
                "rank": 1,
                "name": "ceph-mon2",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.0.0.102:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.0.0.102:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.0.0.102:6789/0",
                "public_addr": "10.0.0.102:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            },
            {
                "rank": 2,
                "name": "ceph-mon3",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.0.0.103:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.0.0.103:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.0.0.103:6789/0",
                "public_addr": "10.0.0.103:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            }
        ]
    }
}
cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     0d11d338-a480-40da-8520-830423b22c3e
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 28m)
    mgr: ceph-mgr1(active, since 6h)
    osd: 20 osds: 20 up (since 6h), 20 in (since 45h)
 
  data:
    pools:   2 pools, 33 pgs
    objects: 0 objects, 0 B
    usage:   165 MiB used, 2.0 TiB / 2.0 TiB avail
    pgs:     33 active+clean
 

  Add an mgr node

root@ceph-mgr2:~# apt -y install ceph-mgr

cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mgr create ceph-mgr2

  Verify the mgr node status

cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     0d11d338-a480-40da-8520-830423b22c3e
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 33m)
    mgr: ceph-mgr1(active, since 6h), standbys: ceph-mgr2
    osd: 20 osds: 20 up (since 6h), 20 in (since 45h)
 
  data:
    pools:   2 pools, 33 pgs
    objects: 0 objects, 0 B
    usage:   165 MiB used, 2.0 TiB / 2.0 TiB avail
    pgs:     33 active+clean
 

IV. RBD Block Devices

  RBD (RADOS Block Device) is the block-storage interface. RBD talks to the OSDs through the librbd library and provides a high-performance, infinitely scalable storage backend for virtualization stacks such as KVM and cloud platforms such as OpenStack and CloudStack; those systems integrate with RBD via libvirt and QEMU. Any client built on librbd can use the RADOS cluster as a block device. A pool that is used for rbd must first have the rbd application enabled and then be initialized.

  • Create an RBD pool

$ ceph osd pool create <pool> [<pg_num:int>] [<pgp_num:int>] [replicated|erasure]      #pool creation syntax

cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd pool create myrbd1 64 64            #create the pool and specify the pg and pgp counts; pgp governs how PG data is combined onto OSDs and normally equals pg
pool 'myrbd1' created
cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd pool application enable myrbd1 rbd      #enable the RBD application on the pool
enabled application 'rbd' on pool 'myrbd1'
cephadmin@ceph-deploy:~/ceph-cluster$ rbd pool init -p myrbd1                  #initialize the pool with the rbd command
  • Create and verify images:

  An rbd pool is not used as a block device directly; images must first be created in it, and an image is what gets used as a block device. The rbd command creates, lists and deletes the images backing block devices, and also handles cloning images, creating snapshots, rolling an image back to a snapshot, listing snapshots, and other management operations.

cephadmin@ceph-deploy:~/ceph-cluster$ rbd create myimg1 --size 5G --pool myrbd1
cephadmin@ceph-deploy:~/ceph-cluster$ rbd create myimg2 --size 3G --pool myrbd1 --image-format 2 --image-feature layering                  #CentOS clients run an older kernel that cannot map images with the default feature set, so only layering is enabled here;
                                                                                                                                           #features other than layering require a newer kernel

cephadmin@ceph-deploy:~/ceph-cluster$ rbd ls --pool myrbd1              #list all images in the specified pool
myimg1
myimg2
cephadmin@ceph-deploy:~/ceph-cluster$ rbd --image myimg1 --pool myrbd1 info     #show the details of an image
rbd image 'myimg1':
    size 5 GiB in 1280 objects
    order 22 (4 MiB objects)
    snapshot_count: 0
    id: 38ee810c4674
    block_name_prefix: rbd_data.38ee810c4674
    format: 2
    features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
    op_features: 
    flags: 
    create_timestamp: Fri Aug 20 18:08:52 2021
    access_timestamp: Fri Aug 20 18:08:52 2021
    modify_timestamp: Fri Aug 20 18:08:52 2021
cephadmin@ceph-deploy:~/ceph-cluster$ rbd --image myimg2 --pool myrbd1 info
rbd image 'myimg2':
    size 3 GiB in 768 objects
    order 22 (4 MiB objects)
    snapshot_count: 0
    id: 38f7fea54b29
    block_name_prefix: rbd_data.38f7fea54b29
    format: 2
    features: layering
    op_features: 
    flags: 
    create_timestamp: Fri Aug 20 18:09:51 2021
    access_timestamp: Fri Aug 20 18:09:51 2021
    modify_timestamp: Fri Aug 20 18:09:51 2021
  • Using the block storage from a client:

  1. Check the current ceph status:

cephadmin@ceph-deploy:~/ceph-cluster$ ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    2.0 TiB  2.0 TiB  169 MiB   169 MiB          0
TOTAL  2.0 TiB  2.0 TiB  169 MiB   169 MiB          0
 
--- POOLS ---
POOL                   ID  PGS  STORED  OBJECTS    USED  %USED  MAX AVAIL
device_health_metrics   1    1     0 B        0     0 B      0    633 GiB
mypool                  2   32     0 B        0     0 B      0    633 GiB
myrbd1                  3   64   405 B        7  48 KiB      0    633 GiB

  2. Install ceph-common on the client:

[root@ceph-client1 ~]# yum install epel-release                         #configure the yum repos
[root@ceph-client1 ~]# yum -y install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm
[root@ceph-client1 ~]# yum -y install ceph-common

#Sync the configuration file and admin keyring from the deployment server
cephadmin@ceph-deploy:~/ceph-cluster$ scp ceph.conf ceph.client.admin.keyring 10.0.0.71:/etc/ceph

  3. Map the images on the client:

[root@ceph-client1 ~]# rbd -p myrbd1 map myimg2
/dev/rbd0
[root@ceph-client1 ~]# rbd -p myrbd1 map myimg1
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable myrbd1/myimg1 object-map fast-diff deep-flatten".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address
[root@ceph-client1 ~]# rbd feature disable myrbd1/myimg1 object-map fast-diff deep-flatten
[root@ceph-client1 ~]# rbd -p myrbd1 map myimg1
/dev/rbd1

  4. Verify the RBD devices on the client:

[root@ceph-client1 ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  200G  0 disk 
├─sda1   8:1    0    1G  0 part /boot
├─sda2   8:2    0  100G  0 part /
├─sda3   8:3    0   50G  0 part /data
├─sda4   8:4    0    1K  0 part 
└─sda5   8:5    0    4G  0 part [SWAP]
sr0     11:0    1 1024M  0 rom  
rbd0   253:0    0    3G  0 disk 
rbd1   253:16   0    5G  0 disk 

  5. Format the devices and mount them on the client:

[root@ceph-client1 ~]# mkfs.ext4 /dev/rbd0
mke2fs 1.42.9 (28-Dec-2013)
Discarding device blocks: done                            
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=1024 blocks, Stripe width=1024 blocks
196608 inodes, 786432 blocks
39321 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=805306368
24 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
    32768, 98304, 163840, 229376, 294912

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done 

[root@ceph-client1 ~]# mkfs.xfs /dev/rbd1
Discarding blocks...Done.
meta-data=/dev/rbd1              isize=512    agcount=8, agsize=163840 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=1310720, imaxpct=25
         =                       sunit=1024   swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@ceph-client1 ~]# mount /dev/rbd0 /mnt/
[root@ceph-client1 ~]# df -TH
Filesystem     Type      Size  Used Avail Use% Mounted on
devtmpfs       devtmpfs  943M     0  943M   0% /dev
tmpfs          tmpfs     954M     0  954M   0% /dev/shm
tmpfs          tmpfs     954M   11M  944M   2% /run
tmpfs          tmpfs     954M     0  954M   0% /sys/fs/cgroup
/dev/sda2      xfs       108G  5.2G  103G   5% /
/dev/sda3      xfs        54G  175M   54G   1% /data
/dev/sda1      xfs       1.1G  150M  915M  15% /boot
tmpfs          tmpfs     191M     0  191M   0% /run/user/0
/dev/rbd0      ext4      3.2G  9.5M  3.0G   1% /mnt
[root@ceph-client1 ~]# mkdir /data
[root@ceph-client1 ~]# mount /dev/rbd1  /data/
[root@ceph-client1 ~]# df -TH
Filesystem     Type      Size  Used Avail Use% Mounted on
devtmpfs       devtmpfs  943M     0  943M   0% /dev
tmpfs          tmpfs     954M     0  954M   0% /dev/shm
tmpfs          tmpfs     954M   10M  944M   2% /run
tmpfs          tmpfs     954M     0  954M   0% /sys/fs/cgroup
/dev/sda2      xfs       108G  5.2G  103G   5% /
/dev/rbd1      xfs       5.4G   35M  5.4G   1% /data
/dev/sda1      xfs       1.1G  150M  915M  15% /boot
tmpfs          tmpfs     191M     0  191M   0% /run/user/0
/dev/rbd0      ext4      3.2G  9.5M  3.0G   1% /mnt
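
  Note that kernel RBD mappings and the mounts above do not survive a reboot by themselves. One common approach, sketched here and not part of the original setup, is the rbdmap service shipped with ceph-common: list the images in /etc/ceph/rbdmap, enable the service, and add the corresponding /etc/fstab entries with the _netdev option.

[root@ceph-client1 ~]# cat >> /etc/ceph/rbdmap << EOF
myrbd1/myimg1 id=admin,keyring=/etc/ceph/ceph.client.admin.keyring
myrbd1/myimg2 id=admin,keyring=/etc/ceph/ceph.client.admin.keyring
EOF
[root@ceph-client1 ~]# systemctl enable rbdmap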

  6. Write test data from the client:

[root@ceph-client1 data]# dd if=/dev/zero of=/data/ceph-test-file bs=1MB count=300
300+0 records in
300+0 records out
300000000 bytes (300 MB) copied, 1.61684 s, 186 MB/s
[root@ceph-client1 data]# file /data/ceph-test-file 
/data/ceph-test-file: data
[root@ceph-client1 data]# ll -h /data/ceph-test-file 
-rw-r--r--. 1 root root 287M Aug 23 12:27 /data/ceph-test-file

  7. Verify the data usage on the ceph side:

cephadmin@cepn-node1:~$ ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    2.0 TiB  2.0 TiB  2.3 GiB   2.3 GiB       0.11
TOTAL  2.0 TiB  2.0 TiB  2.3 GiB   2.3 GiB       0.11
 
--- POOLS ---
POOL                   ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics   1    1      0 B        0      0 B      0    632 GiB
myrbd1                  2   64  363 MiB      115  1.1 GiB   0.06    632 GiB