DAY 1: Ceph Components, the Data Read/Write Flow, Cluster Deployment, and RBD Usage
I. Ceph Components
1. OSD (Object Storage Daemon)
Function: Ceph OSDs (the ceph-osd daemon) provide data storage; each disk in the operating system is served by one OSD daemon. OSDs handle data replication, recovery, and rebalancing for the cluster, and provide some monitoring information to the Ceph Monitors and Managers by checking the heartbeats of other OSD daemons. At least 3 Ceph OSDs are required for redundancy and high availability.
2. Mon (Monitor): the Ceph monitor
Function: a daemon running on a host that maintains maps of the cluster state, such as how many pools the cluster has, how many PGs each pool has, and the pool-to-PG mappings. A cluster needs at least one Mon, deployed in odd numbers (1, 3, 5, 7, ...). The critical cluster state that Ceph daemons coordinate through consists of the monitor map, the manager map, the OSD map, the MDS map, and the CRUSH map.
3. Mgr (Manager)
Function: a daemon running on a host. The Ceph Manager daemon is responsible for tracking runtime metrics and the current state of the cluster, including storage utilization, current performance metrics, and system load. It also hosts Python-based modules that manage and expose cluster information, including the web-based Ceph Dashboard and a REST API. At least two managers are required for high availability.
II. The Ceph Data Read/Write Flow
- Compute the file-to-object mapping to get the oid (object id) = ino + ono:
- ino: inode number (INO), the serial number of the file's metadata and the file's unique id
- ono: object number (ONO), the sequence number of one object produced by striping the file, with a default stripe size of 4M
- Hash the object to a PG in the corresponding pool:
  a stable hash maps the object to a PG; the Object --> PG mapping is hash(oid) & mask --> pgid
- Map the object's PG to OSDs with CRUSH:
  the PG --> OSD mapping is computed by the CRUSH algorithm: [CRUSH(pgid) -> (osd1, osd2, osd3)]
- The primary OSD of the PG writes the object to disk
- The primary OSD replicates the data to the backup OSDs and waits for their acknowledgements
- The primary OSD returns the write completion to the client.
Notes:
Pool: a storage pool, i.e. a partition of the cluster; its capacity is bounded by the underlying storage space.
PG (placement group): a pool contains multiple PGs. Pool and PG are both abstract, logical concepts; the number of PGs a pool should have can be computed with a formula.
OSD (Object Storage Daemon): each data disk is one OSD, and a host consists of one or more OSDs.
After the cluster is deployed, a pool must be created before data can be written to Ceph. A file is hashed before being stored, which places it in a specific PG; a file therefore always belongs to exactly one PG of one pool, and through that PG it is stored on OSDs. A data object is written to the primary OSD first and then replicated to the secondary OSDs for high availability.
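The addressing steps above can be sketched as a toy script. This is an illustration only: cksum stands in for Ceph's real hash function, the inode number is made up, and no real CRUSH computation is performed; only the oid = ino + ono and hash(oid) & mask -> pgid structure matches the flow described here.

```shell
#!/bin/bash
# Toy illustration of Ceph object addressing (NOT real CRUSH).
ino="10000004137"                       # the file's inode number (made-up value)
ono=2                                   # third 4M stripe of the file
oid="${ino}.$(printf '%08x' "$ono")"    # oid = ino + ono
pg_num=32                               # PGs in the pool (power of two)
mask=$((pg_num - 1))                    # the mask trick needs a power-of-two pg_num
# hash(oid) & mask -> pgid; cksum is only a stand-in for Ceph's hash
h=$(printf '%s' "$oid" | cksum | awk '{print $1}')
pgid=$((h & mask))
echo "object ${oid} maps to pg ${pgid} of ${pg_num}"
```

Because the mask is pg_num - 1, the pgid is always in the range 0 .. pg_num - 1, which is why Ceph prefers power-of-two PG counts.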
III. Deploying the Ceph Cluster
Server role | OS version | IP addresses (public/cluster) | Specs and disks
ceph-deploy | Ubuntu 18.04 | 10.0.0.100/192.168.0.100 | 2C2G / 120G
ceph-mon1 | Ubuntu 18.04 | 10.0.0.101/192.168.0.101 | 2C2G / 120G
ceph-mon2 | Ubuntu 18.04 | 10.0.0.102/192.168.0.102 | 2C2G / 120G
ceph-mon3 | Ubuntu 18.04 | 10.0.0.103/192.168.0.103 | 2C2G / 120G
ceph-mgr1 | Ubuntu 18.04 | 10.0.0.104/192.168.0.104 | 2C2G / 120G
ceph-mgr2 | Ubuntu 18.04 | 10.0.0.105/192.168.0.105 | 2C2G / 120G
ceph-node1 | Ubuntu 18.04 | 10.0.0.106/192.168.0.106 | 2C2G / 120G + 5 x 100G
ceph-node2 | Ubuntu 18.04 | 10.0.0.107/192.168.0.107 | 2C2G / 120G + 5 x 100G
ceph-node3 | Ubuntu 18.04 | 10.0.0.108/192.168.0.108 | 2C2G / 120G + 5 x 100G
ceph-node4 | Ubuntu 18.04 | 10.0.0.109/192.168.0.109 | 2C2G / 120G + 5 x 100G
Environment overview:
1. One server is used to deploy the ceph cluster, i.e. it runs ceph-deploy; this role can also be co-located with ceph-mgr and similar roles.
10.0.0.100/192.168.0.100
2. Three servers act as the cluster's Mon monitoring servers; each can communicate over the cluster network.
10.0.0.101/192.168.0.101
10.0.0.102/192.168.0.102
10.0.0.103/192.168.0.103
3. Two ceph-mgr managers, both able to communicate over the cluster network.
10.0.0.104/192.168.0.104
10.0.0.105/192.168.0.105
4. Four servers act as the cluster's OSD storage servers. Each supports two networks: the public network serves clients, and the cluster network is used for cluster management and data replication. Each node has 3 or more data disks.
10.0.0.106/192.168.0.106
10.0.0.107/192.168.0.107
10.0.0.108/192.168.0.108
10.0.0.109/192.168.0.109
# Data-disk layout on each storage server:
/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf #100G
5. Create a regular user that can run privileged commands through sudo, and configure hostname resolution; each host must be given a distinct hostname during cluster deployment. On CentOS systems, additionally disable the firewall and SELinux on every server.
-
Ubuntu Server base configuration
1. Change the hostname
# cat /etc/hostname
Ubuntu1804
# hostnamectl set-hostname ceph-deploy.example.local    # hostnamectl set-hostname <new hostname>
# cat /etc/hostname
ceph-deploy.example.local
2. Rename the NICs to eth*
Method 1: pass kernel parameters while installing Ubuntu: net.ifnames=0 biosdevname=0
Method 2: if the kernel parameters were not passed before installation, the NIC names can still be changed to eth* afterwards (a reboot of the Ubuntu system is required):
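A common way to do this after installation is to add the same kernel parameters to the GRUB default command line and regenerate the GRUB config; this is a sketch under the assumption of a stock Ubuntu 18.04 GRUB setup, so verify the paths on your system, and remember to update any netplan files that still reference the old interface names before rebooting:

```shell
# Append net.ifnames=0 biosdevname=0 to the default kernel command line
sudo sed -i 's/^GRUB_CMDLINE_LINUX="\(.*\)"/GRUB_CMDLINE_LINUX="\1 net.ifnames=0 biosdevname=0"/' /etc/default/grub
sudo update-grub    # regenerate /boot/grub/grub.cfg
# Edit /etc/netplan/*.yaml so interfaces are named eth0/eth1, then:
sudo reboot
```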
3. Allow root login over SSH
By default, Ubuntu does not allow the root user to log in over ssh; set a root password and edit /etc/ssh/sshd_config:
$ sudo vim /etc/ssh/sshd_config
32 #PermitRootLogin prohibit-password
33 PermitRootLogin yes      # allow root login
101 #UseDNS no
102 UseDNS no               # disable DNS lookups
$ sudo su - root            # switch to the root user
# passwd                    # set the root password
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
# systemctl restart sshd    # restart the ssh service
4. Configure the NICs on each node; for example, on ceph-deploy:
root@ceph-deploy:~# cat /etc/netplan/01-netcfg.yaml
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      dhcp6: no
      addresses: [10.0.0.100/24]
      gateway4: 10.0.0.2
      nameservers:
        addresses: [10.0.0.2, 114.114.114.114, 8.8.8.8]
    eth1:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.0.100/24]
root@ceph-deploy:~# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.100  netmask 255.255.255.0  broadcast 10.0.0.255
        inet6 fe80::20c:29ff:fe65:a300  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:65:a3:00  txqueuelen 1000  (Ethernet)
        RX packets 2057  bytes 172838 (172.8 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1575  bytes 221983 (221.9 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.100  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 fe80::20c:29ff:fe65:a30a  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:65:a3:0a  txqueuelen 1000  (Ethernet)
        RX packets 2  bytes 486 (486.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 14  bytes 1076 (1.0 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 182  bytes 14992 (14.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 182  bytes 14992 (14.9 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
root@ceph-deploy:~# ping -c 1 -i 1 www.baidu.com
64 bytes from 220.181.38.149 (220.181.38.149): icmp_seq=1 ttl=128 time=6.67 ms
5. Configure the apt repositories
https://mirrors.aliyun.com/ceph/                # Alibaba Cloud mirror
http://mirrors.163.com/ceph/                    # NetEase mirror
https://mirrors.tuna.tsinghua.edu.cn/ceph/      # Tsinghua University mirror
$ wget -q -O- 'https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc' | sudo apt-key add -    # import the release key
OK
$ echo "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main" | sudo tee -a /etc/apt/sources.list
$ cat /etc/apt/sources.list
# Source-package mirrors are commented out by default to speed up apt update; uncomment them if needed
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main    # the line appended above
$ sudo apt update
-
Deploying the RADOS Cluster
1. Create the cephadmin user:
It is recommended to deploy and run the ceph cluster as a dedicated regular user; the user only needs to be able to run privileged commands interactively through sudo. Newer versions of ceph-deploy accept any sudo-capable user, including root, but a regular user such as cephuser or cephadmin is still recommended for managing the cluster.
Create the cephadmin user on the storage nodes, the Mon nodes, the mgr nodes, and the ceph-deploy node.
groupadd -r -g 2022 cephadmin && useradd -r -m -s /bin/bash -u 2022 -g 2022 cephadmin && echo cephadmin:123.com | chpasswd
Allow the cephadmin user to run privileged commands through sudo on every server:
echo "cephadmin ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
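A slightly safer variant (my suggestion, not part of the original procedure) is to use a drop-in file under /etc/sudoers.d and validate its syntax with visudo, so a typo cannot lock sudo out cluster-wide:

```shell
# Grant cephadmin passwordless sudo via a validated drop-in file
echo "cephadmin ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/cephadmin
chmod 0440 /etc/sudoers.d/cephadmin
visudo -cf /etc/sudoers.d/cephadmin    # syntax check; remove the file if this fails
```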
2. Configure passwordless SSH login:
Allow the ceph-deploy node to log in to each ceph node/mon/mgr node non-interactively: generate a key pair on the ceph-deploy node, then distribute the public key to each managed node:
cephadmin@ceph-deploy:~$ ssh-keygen    # generate the ssh key pair
Generating public/private rsa key pair.
Enter file in which to save the key (/home/cephadmin/.ssh/id_rsa):
Created directory '/home/cephadmin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/cephadmin/.ssh/id_rsa.
Your public key has been saved in /home/cephadmin/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:0+vL5tnFkcEzFiGCmKTzR7G58KHrbUB9qBiaqtYsSi4 cephadmin@ceph-deploy
The key's randomart image is:
+---[RSA 2048]----+
| ..o... . o. |
| .o .+ . o . |
| o ..=. * |
| .o.=o+. . = |
| o +o.S.. o |
| o . oo . . . . |
| oo .. . o |
|Eo o . ..o.o . |
|B.. ...o*.. |
+----[SHA256]-----+
cephadmin@ceph-deploy:~$ ssh-copy-id [email protected]    # distribute the public key to every managed node (including itself)
cephadmin@ceph-deploy:~$ ssh-copy-id [email protected]
cephadmin@ceph-deploy:~$ ssh-copy-id [email protected]
cephadmin@ceph-deploy:~$ ssh-copy-id [email protected]
cephadmin@ceph-deploy:~$ ssh-copy-id [email protected]
cephadmin@ceph-deploy:~$ ssh-copy-id [email protected]
cephadmin@ceph-deploy:~$ ssh-copy-id [email protected]
cephadmin@ceph-deploy:~$ ssh-copy-id [email protected]
cephadmin@ceph-deploy:~$ ssh-copy-id [email protected]
cephadmin@ceph-deploy:~$ ssh-copy-id [email protected]
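The ten ssh-copy-id invocations above can be scripted with a loop; using the node IPs 10.0.0.100 through 10.0.0.109 from the environment table is an assumption on my part, since the transcript's targets are redacted:

```shell
# Push the cephadmin public key to every node, including ceph-deploy itself
for host in 10.0.0.10{0..9}; do
  ssh-copy-id "cephadmin@${host}"
done
```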
3. Configure name resolution on each node:
# cat >> /etc/hosts << EOF
10.0.0.100 ceph-deploy
10.0.0.101 ceph-mon1
10.0.0.102 ceph-mon2
10.0.0.103 ceph-mon3
10.0.0.104 ceph-mgr1
10.0.0.105 ceph-mgr2
10.0.0.106 ceph-node1
10.0.0.107 ceph-node2
10.0.0.108 ceph-node3
10.0.0.109 ceph-node4
EOF
4. Install Python 2 on each node:
# apt -y install python2.7                      # install Python 2.7
# ln -sv /usr/bin/python2.7 /usr/bin/python2    # create the symlink ceph-deploy expects
5. Install the deployment tool
Install ceph-deploy on the deployment server:
cephadmin@ceph-deploy:~$ apt-cache madison ceph-deploy
ceph-deploy | 2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic/main amd64 Packages
ceph-deploy | 2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic/main i386 Packages
ceph-deploy | 1.5.38-0ubuntu1 | https://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe amd64 Packages
ceph-deploy | 1.5.38-0ubuntu1 | https://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe i386 Packages
cephadmin@ceph-deploy:~$ sudo apt -y install ceph-deploy
6. Initialize the Mon node
Initialize the Mon node from the ceph-deploy management node. The Mon node must also have the cluster network configured, otherwise initialization reports an error:
cephadmin@ceph-deploy:~$ mkdir ceph-cluster
cephadmin@ceph-deploy:~$ cd ceph-cluster/
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy new --cluster-network 192.168.0.0/24 --public-network 10.0.0.0/24 ceph-mon1
Verify the initialization:
cephadmin@ceph-deploy:~/ceph-cluster$ ll
total 20
drwxrwxr-x 2 cephadmin cephadmin 4096 Aug 18 15:26 ./
drwxr-xr-x 6 cephadmin cephadmin 4096 Aug 18 15:20 ../
-rw-rw-r-- 1 cephadmin cephadmin 259 Aug 18 15:26 ceph.conf                # auto-generated configuration file
-rw-rw-r-- 1 cephadmin cephadmin 3892 Aug 18 15:26 ceph-deploy-ceph.log    # initialization log
-rw------- 1 cephadmin cephadmin 73 Aug 18 15:26 ceph.mon.keyring          # keyring used for internal authentication between Mon nodes
cephadmin@ceph-deploy:~/ceph-cluster$ cat ceph.conf
[global]
fsid = 0d11d338-a480-40da-8520-830423b22c3e    # the cluster ID
public_network = 10.0.0.0/24
cluster_network = 192.168.0.0/24
mon_initial_members = ceph-mon1                # multiple mon nodes can be added, comma-separated
mon_host = 10.0.0.101
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
Configure the Mon node and generate the bootstrap keys
Install the ceph-mon package on each Mon node, then initialize the Mon node; Mon nodes can be scaled out later:
root@ceph-mon1:~# apt -y install ceph-mon cephadmin@ceph-deploy:~/ceph-cluster$ pwd /home/cephadmin/ceph-cluster cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mon create-initial [ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon create-initial [ceph_deploy.cli][INFO ] ceph-deploy options: [ceph_deploy.cli][INFO ] username : None [ceph_deploy.cli][INFO ] verbose : False [ceph_deploy.cli][INFO ] overwrite_conf : False [ceph_deploy.cli][INFO ] subcommand : create-initial [ceph_deploy.cli][INFO ] quiet : False [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f0903be4fa0> [ceph_deploy.cli][INFO ] cluster : ceph [ceph_deploy.cli][INFO ] func : <function mon at 0x7f0903bc8ad0> [ceph_deploy.cli][INFO ] ceph_conf : None [ceph_deploy.cli][INFO ] keyrings : None [ceph_deploy.cli][INFO ] default_release : False [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-mon1 [ceph_deploy.mon][DEBUG ] detecting platform for host ceph-mon1 ... 
[ceph-mon1][DEBUG ] connection detected need for sudo [ceph-mon1][DEBUG ] connected to host: ceph-mon1 [ceph-mon1][DEBUG ] detect platform information from remote host [ceph-mon1][DEBUG ] detect machine type [ceph-mon1][DEBUG ] find the location of an executable [ceph_deploy.mon][INFO ] distro info: Ubuntu 18.04 bionic [ceph-mon1][DEBUG ] determining if provided host has same hostname in remote [ceph-mon1][DEBUG ] get remote short hostname [ceph-mon1][DEBUG ] deploying mon to ceph-mon1 [ceph-mon1][DEBUG ] get remote short hostname [ceph-mon1][DEBUG ] remote hostname: ceph-mon1 [ceph-mon1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph-mon1][DEBUG ] create the mon path if it does not exist [ceph-mon1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-mon1/done [ceph-mon1][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-mon1/done [ceph-mon1][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring [ceph-mon1][DEBUG ] create the monitor keyring file [ceph-mon1][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-mon1 --keyring /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring --setuser 64045 --setgroup 64045 [ceph-mon1][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring [ceph-mon1][DEBUG ] create a done file to avoid re-doing the mon deployment [ceph-mon1][DEBUG ] create the init path if it does not exist [ceph-mon1][INFO ] Running command: sudo systemctl enable ceph.target [ceph-mon1][INFO ] Running command: sudo systemctl enable ceph-mon@ceph-mon1 [ceph-mon1][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/[email protected] → /lib/systemd/system/ceph-mon@.service. 
[ceph-mon1][INFO ] Running command: sudo systemctl start ceph-mon@ceph-mon1 [ceph-mon1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status [ceph-mon1][DEBUG ] ******************************************************************************** [ceph-mon1][DEBUG ] status for monitor: mon.ceph-mon1 [ceph-mon1][DEBUG ] { [ceph-mon1][DEBUG ] "election_epoch": 3, [ceph-mon1][DEBUG ] "extra_probe_peers": [], [ceph-mon1][DEBUG ] "feature_map": { [ceph-mon1][DEBUG ] "mon": [ [ceph-mon1][DEBUG ] { [ceph-mon1][DEBUG ] "features": "0x3f01cfb9fffdffff", [ceph-mon1][DEBUG ] "num": 1, [ceph-mon1][DEBUG ] "release": "luminous" [ceph-mon1][DEBUG ] } [ceph-mon1][DEBUG ] ] [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] "features": { [ceph-mon1][DEBUG ] "quorum_con": "4540138297136906239", [ceph-mon1][DEBUG ] "quorum_mon": [ [ceph-mon1][DEBUG ] "kraken", [ceph-mon1][DEBUG ] "luminous", [ceph-mon1][DEBUG ] "mimic", [ceph-mon1][DEBUG ] "osdmap-prune", [ceph-mon1][DEBUG ] "nautilus", [ceph-mon1][DEBUG ] "octopus", [ceph-mon1][DEBUG ] "pacific", [ceph-mon1][DEBUG ] "elector-pinging" [ceph-mon1][DEBUG ] ], [ceph-mon1][DEBUG ] "required_con": "2449958747317026820", [ceph-mon1][DEBUG ] "required_mon": [ [ceph-mon1][DEBUG ] "kraken", [ceph-mon1][DEBUG ] "luminous", [ceph-mon1][DEBUG ] "mimic", [ceph-mon1][DEBUG ] "osdmap-prune", [ceph-mon1][DEBUG ] "nautilus", [ceph-mon1][DEBUG ] "octopus", [ceph-mon1][DEBUG ] "pacific", [ceph-mon1][DEBUG ] "elector-pinging" [ceph-mon1][DEBUG ] ] [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] "monmap": { [ceph-mon1][DEBUG ] "created": "2021-08-18T07:55:40.349602Z", [ceph-mon1][DEBUG ] "disallowed_leaders: ": "", [ceph-mon1][DEBUG ] "election_strategy": 1, [ceph-mon1][DEBUG ] "epoch": 1, [ceph-mon1][DEBUG ] "features": { [ceph-mon1][DEBUG ] "optional": [], [ceph-mon1][DEBUG ] "persistent": [ [ceph-mon1][DEBUG ] "kraken", [ceph-mon1][DEBUG ] "luminous", [ceph-mon1][DEBUG ] "mimic", [ceph-mon1][DEBUG ] 
"osdmap-prune", [ceph-mon1][DEBUG ] "nautilus", [ceph-mon1][DEBUG ] "octopus", [ceph-mon1][DEBUG ] "pacific", [ceph-mon1][DEBUG ] "elector-pinging" [ceph-mon1][DEBUG ] ] [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] "fsid": "0d11d338-a480-40da-8520-830423b22c3e", [ceph-mon1][DEBUG ] "min_mon_release": 16, [ceph-mon1][DEBUG ] "min_mon_release_name": "pacific", [ceph-mon1][DEBUG ] "modified": "2021-08-18T07:55:40.349602Z", [ceph-mon1][DEBUG ] "mons": [ [ceph-mon1][DEBUG ] { [ceph-mon1][DEBUG ] "addr": "10.0.0.101:6789/0", [ceph-mon1][DEBUG ] "crush_location": "{}", [ceph-mon1][DEBUG ] "name": "ceph-mon1", [ceph-mon1][DEBUG ] "priority": 0, [ceph-mon1][DEBUG ] "public_addr": "10.0.0.101:6789/0", [ceph-mon1][DEBUG ] "public_addrs": { [ceph-mon1][DEBUG ] "addrvec": [ [ceph-mon1][DEBUG ] { [ceph-mon1][DEBUG ] "addr": "10.0.0.101:3300", [ceph-mon1][DEBUG ] "nonce": 0, [ceph-mon1][DEBUG ] "type": "v2" [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] { [ceph-mon1][DEBUG ] "addr": "10.0.0.101:6789", [ceph-mon1][DEBUG ] "nonce": 0, [ceph-mon1][DEBUG ] "type": "v1" [ceph-mon1][DEBUG ] } [ceph-mon1][DEBUG ] ] [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] "rank": 0, [ceph-mon1][DEBUG ] "weight": 0 [ceph-mon1][DEBUG ] } [ceph-mon1][DEBUG ] ], [ceph-mon1][DEBUG ] "stretch_mode": false [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] "name": "ceph-mon1", [ceph-mon1][DEBUG ] "outside_quorum": [], [ceph-mon1][DEBUG ] "quorum": [ [ceph-mon1][DEBUG ] 0 [ceph-mon1][DEBUG ] ], [ceph-mon1][DEBUG ] "quorum_age": 1, [ceph-mon1][DEBUG ] "rank": 0, [ceph-mon1][DEBUG ] "state": "leader", [ceph-mon1][DEBUG ] "stretch_mode": false, [ceph-mon1][DEBUG ] "sync_provider": [] [ceph-mon1][DEBUG ] } [ceph-mon1][DEBUG ] ******************************************************************************** [ceph-mon1][INFO ] monitor: mon.ceph-mon1 is running [ceph-mon1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status [ceph_deploy.mon][INFO ] processing monitor 
mon.ceph-mon1 [ceph-mon1][DEBUG ] connection detected need for sudo [ceph-mon1][DEBUG ] connected to host: ceph-mon1 [ceph-mon1][DEBUG ] detect platform information from remote host [ceph-mon1][DEBUG ] detect machine type [ceph-mon1][DEBUG ] find the location of an executable [ceph-mon1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status [ceph_deploy.mon][INFO ] mon.ceph-mon1 monitor has reached quorum! [ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum [ceph_deploy.mon][INFO ] Running gatherkeys... [ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmpqCeuN6 [ceph-mon1][DEBUG ] connection detected need for sudo [ceph-mon1][DEBUG ] connected to host: ceph-mon1 [ceph-mon1][DEBUG ] detect platform information from remote host [ceph-mon1][DEBUG ] detect machine type [ceph-mon1][DEBUG ] get remote short hostname [ceph-mon1][DEBUG ] fetch remote file [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph-mon1.asok mon_status [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.admin [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-mds [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-mgr [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-osd [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. 
--keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-rgw [ceph_deploy.gatherkeys][INFO ] Storing ceph.client.admin.keyring [ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mds.keyring [ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mgr.keyring [ceph_deploy.gatherkeys][INFO ] keyring 'ceph.mon.keyring' already exists [ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-osd.keyring [ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-rgw.keyring [ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmpqCeuN6
Verify the Mon node
Verify that the ceph-mon service has been installed and started on the mon node. The initialization directory on the ceph-deploy node now also contains bootstrap keyring files for the mds/mgr/osd/rgw services; these files carry the highest privileges over the ceph cluster, so keep them safe.
root@ceph-mon1:~# ps -ef | grep ceph-mon
ceph 6688 1 0 15:55 ? 00:00:00 /usr/bin/ceph-mon -f --cluster ceph --id ceph-mon1 --setuser ceph --setgroup ceph
root 7252 2514 0 16:00 pts/0 00:00:00 grep --color=auto ceph-mon
7. Configure the manager node
Deploy the ceph-mgr node:
The mgr node needs to read the ceph configuration, i.e. the files under /etc/ceph:
# Initialize the ceph-mgr node:
root@ceph-mgr1:~# apt -y install ceph-mgr
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mgr create ceph-mgr1
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mgr create ceph-mgr1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] mgr : [('ceph-mgr1', 'ceph-mgr1')]
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f8d17024c30>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mgr at 0x7f8d17484150>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-mgr1:ceph-mgr1
The authenticity of host 'ceph-mgr1 (10.0.0.104)' can't be established.
ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-mgr1' (ECDSA) to the list of known hosts.
[ceph-mgr1][DEBUG ] connection detected need for sudo
[ceph-mgr1][DEBUG ] connected to host: ceph-mgr1
[ceph-mgr1][DEBUG ] detect platform information from remote host
[ceph-mgr1][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO ] Distro info: Ubuntu 18.04 bionic
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-mgr1
[ceph-mgr1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-mgr1][WARNIN] mgr keyring does not exist yet, creating one
[ceph-mgr1][DEBUG ] create a keyring file
[ceph-mgr1][DEBUG ] create path recursively if it doesn't exist
[ceph-mgr1][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-mgr1 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-mgr1/keyring
[ceph-mgr1][INFO ] Running command: sudo systemctl enable ceph-mgr@ceph-mgr1
[ceph-mgr1][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/[email protected] → /lib/systemd/system/ceph-mgr@.service.
[ceph-mgr1][INFO ] Running command: sudo systemctl start ceph-mgr@ceph-mgr1
[ceph-mgr1][INFO ] Running command: sudo systemctl enable ceph.target
Verify the ceph-mgr node:
root@ceph-mgr1:~# ps -ef | grep ceph-mgr
ceph 8128 1 8 17:09 ? 00:00:03 /usr/bin/ceph-mgr -f --cluster ceph --id ceph-mgr1 --setuser ceph --setgroup ceph
root 8326 2396 0 17:10 pts/0 00:00:00 grep --color=auto ceph-mgr
8. Distribute the admin key:
From the ceph-deploy node, copy the configuration file and the admin keyring to every node in the cluster that needs to run ceph management commands. This avoids having to specify the ceph-mon address and the ceph.client.admin.keyring file on every ceph command later. The ceph-mon nodes also need the cluster configuration and authentication files synchronized to them.
Manage the cluster from the ceph-deploy node:
root@ceph-deploy:~# apt -y install ceph-common    # install the ceph common components; installing ceph-common requires root
root@cepn-node1:~# apt -y install ceph-common
root@cepn-node2:~# apt -y install ceph-common
root@cepn-node3:~# apt -y install ceph-common
root@cepn-node4:~# apt -y install ceph-common
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy admin ceph-deploy ceph-node1 ceph-node2 ceph-node3 ceph-node4    # distribute the admin key
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy admin ceph-deploy
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f4ba41a4190>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] client : ['ceph-deploy']
[ceph_deploy.cli][INFO ] func : <function admin at 0x7f4ba4aa5a50>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-deploy
[ceph-deploy][DEBUG ] connection detected need for sudo
[ceph-deploy][DEBUG ] connected to host: ceph-deploy
[ceph-deploy][DEBUG ] detect platform information from remote host
[ceph-deploy][DEBUG ] detect machine type
[ceph-deploy][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy admin ceph-node1 ceph-node2 ceph-node3 ceph-node4
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fc78eac3190>
[ceph_deploy.cli][INFO ] cluster : ceph [ceph_deploy.cli][INFO ] client : ['ceph-node1', 'ceph-node2', 'ceph-node3', 'ceph-node4'] [ceph_deploy.cli][INFO ] func : <function admin at 0x7fc78f3c4a50> [ceph_deploy.cli][INFO ] ceph_conf : None [ceph_deploy.cli][INFO ] default_release : False [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node1 The authenticity of host 'ceph-node1 (10.0.0.106)' can't be established. ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'ceph-node1' (ECDSA) to the list of known hosts. [ceph-node1][DEBUG ] connection detected need for sudo [ceph-node1][DEBUG ] connected to host: ceph-node1 [ceph-node1][DEBUG ] detect platform information from remote host [ceph-node1][DEBUG ] detect machine type [ceph-node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node2 The authenticity of host 'ceph-node2 (10.0.0.107)' can't be established. ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'ceph-node2' (ECDSA) to the list of known hosts. [ceph-node2][DEBUG ] connection detected need for sudo [ceph-node2][DEBUG ] connected to host: ceph-node2 [ceph-node2][DEBUG ] detect platform information from remote host [ceph-node2][DEBUG ] detect machine type [ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node3 The authenticity of host 'ceph-node3 (10.0.0.108)' can't be established. ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'ceph-node3' (ECDSA) to the list of known hosts. 
[ceph-node3][DEBUG ] connection detected need for sudo [ceph-node3][DEBUG ] connected to host: ceph-node3 [ceph-node3][DEBUG ] detect platform information from remote host [ceph-node3][DEBUG ] detect machine type [ceph-node3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node4 The authenticity of host 'ceph-node4 (10.0.0.109)' can't be established. ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'ceph-node4' (ECDSA) to the list of known hosts. [ceph-node4][DEBUG ] connection detected need for sudo [ceph-node4][DEBUG ] connected to host: ceph-node4 [ceph-node4][DEBUG ] detect platform information from remote host [ceph-node4][DEBUG ] detect machine type [ceph-node4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
Verify the keys on the ceph nodes:
Check the key files on each ceph-node:
root@cepn-node1:~# ll /etc/ceph/
total 20
drwxr-xr-x 2 root root 4096 Aug 18 16:38 ./
drwxr-xr-x 91 root root 4096 Aug 18 16:27 ../
-rw------- 1 root root 151 Aug 18 16:38 ceph.client.admin.keyring
-rw-r--r-- 1 root root 259 Aug 18 16:38 ceph.conf
-rw-r--r-- 1 root root 92 Jul 8 22:17 rbdmap
-rw------- 1 root root 0 Aug 18 16:38 tmp4MDGPp
For safety, the owner and group of the authentication file default to root; if the cephadmin user should also be able to run ceph commands, it must be granted access:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
root@cepn-node1:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
root@cepn-node2:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
root@cepn-node3:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
root@cepn-node4:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
Test the ceph command:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id: 0d11d338-a480-40da-8520-830423b22c3e
    health: HEALTH_WARN
            mon is allowing insecure global_id reclaim    # insecure-mode communication needs to be disabled
            OSD count 0 < osd_pool_default_size 3         # the cluster has fewer than 3 OSDs
  services:
    mon: 1 daemons, quorum ceph-mon1 (age 98m)
    mgr: ceph-mgr1(active, since 24m)
    osd: 0 osds: 0 up, 0 in
  data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage: 0 B used, 0 B / 0 B avail
    pgs:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph config set mon auth_allow_insecure_global_id_reclaim false    # disable insecure-mode communication
cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id: 0d11d338-a480-40da-8520-830423b22c3e
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3
  services:
    mon: 1 daemons, quorum ceph-mon1 (age 100m)
    mgr: ceph-mgr1(active, since 26m)
    osd: 0 osds: 0 up, 0 in
  data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage: 0 B used, 0 B / 0 B avail
    pgs:
cephadmin@ceph-deploy:~/ceph-cluster$
9. Initialize the storage nodes
Before adding OSDs, install the base environment on each node:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy install --no-adjust-repos --nogpgcheck ceph-node1 ceph-node2 ceph-node3 ceph-node4
--no-adjust-repos install packages without modifying source repos
--nogpgcheck install packages without gpgcheck
Wipe the data disks
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy disk list ceph-node1 ceph-node2 ceph-node3 ceph-node4    # list the data disks on the remote storage nodes
Use ceph-deploy disk zap to wipe the data disks of each ceph node:
cephadmin@ceph-deploy:~/ceph-cluster$ cat EraseDisk.sh
#!/bin/bash
#
for i in {1..4}; do
  for d in {b..f}; do
    ceph-deploy disk zap ceph-node$i /dev/sd$d
  done
done
cephadmin@ceph-deploy:~/ceph-cluster$ bash -n EraseDisk.sh
cephadmin@ceph-deploy:~/ceph-cluster$ bash EraseDisk.sh
Add the hosts' disks as OSDs:
An OSD stores three classes of data:
- data: the object data stored by ceph
- block-db: the RocksDB data, i.e. the metadata
- block-wal: the RocksDB write-ahead log
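In this deployment all three classes live on the same data disk, but ceph-deploy can also place the RocksDB metadata and WAL on faster devices. A dry-run sketch (the NVMe partition names are hypothetical, and `run` only prints the commands instead of executing them):

```shell
# Dry-run sketch: print the ceph-deploy commands rather than executing them.
run() { echo "+ $*"; }
# All-in-one OSD (what CreateDisk.sh below does):
run ceph-deploy osd create ceph-node1 --data /dev/sdb
# RocksDB metadata and WAL on a faster device (hypothetical NVMe partitions):
run ceph-deploy osd create ceph-node1 --data /dev/sdb \
    --block-db /dev/nvme0n1p1 --block-wal /dev/nvme0n1p2
```

Drop the `run` prefix to execute the commands for real once the device names match your hardware.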
Add the OSDs (OSD IDs are assigned sequentially starting from 0):
cephadmin@ceph-deploy:~/ceph-cluster$ cat CreateDisk.sh
#!/bin/bash
#
for i in {1..4}; do
  for d in {b..f}; do
    ceph-deploy osd create ceph-node$i --data /dev/sd$d
  done
done
cephadmin@ceph-deploy:~/ceph-cluster$ bash -n CreateDisk.sh
cephadmin@ceph-deploy:~/ceph-cluster$ bash CreateDisk.sh
10. Verify the ceph cluster:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id: 0d11d338-a480-40da-8520-830423b22c3e
    health: HEALTH_OK
  services:
    mon: 1 daemons, quorum ceph-mon1 (age 2h)
    mgr: ceph-mgr1(active, since 88m)
    osd: 20 osds: 20 up (since 55s), 20 in (since 63s)
  data:
    pools: 1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage: 150 MiB used, 2.0 TiB / 2.0 TiB avail
    pgs: 1 active+clean
11. Test uploading and downloading data:
To access data, a client must first connect to a pool in the RADOS cluster; the data object is then located by the relevant CRUSH rule based on the object name. So, to test the cluster's data access, first create a test pool named mypool with 32 PGs.
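The PG count of 32 used below is deliberately small for a test pool. A common rule of thumb (an approximation from general Ceph sizing practice, not from this document) budgets roughly 100 PGs per OSD across all pools, divided by the replica count and rounded down to a power of two:

```shell
# Rule-of-thumb total PG budget: osds * 100 / replicas, rounded down to
# a power of two; the budget is then split across the cluster's pools.
osds=20; replicas=3                      # this lab cluster: 20 OSDs, 3 replicas
target=$(( osds * 100 / replicas ))      # 666
pg=1
while [ $(( pg * 2 )) -le "$target" ]; do pg=$(( pg * 2 )); done
echo "total PG budget: $pg"              # prints 512
```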
$ ceph -h     # the high-level admin command
$ rados -h    # a lower-level command
建立pool
[ceph@ceph-deploy ceph-cluster]$ ceph osd pool create mypool 32 32 #32個PG和32種PGD組合 pool 'mypool' created cephadmin@ceph-deploy:~/ceph-cluster$ ceph pg ls-by-pool mypool | awk '{print $1,$2,$15}' #驗證PG與PGP組合 PG OBJECTS ACTING 2.0 0 [8,10,3]p8 2.1 0 [15,0,13]p15 2.2 0 [5,1,15]p5 2.3 0 [17,5,14]p17 2.4 0 [1,12,18]p1 2.5 0 [12,4,8]p12 2.6 0 [1,13,19]p1 2.7 0 [6,17,2]p6 2.8 0 [16,13,0]p16 2.9 0 [4,9,19]p4 2.a 0 [11,4,18]p11 2.b 0 [13,7,17]p13 2.c 0 [12,0,5]p12 2.d 0 [12,19,3]p12 2.e 0 [2,13,19]p2 2.f 0 [11,17,8]p11 2.10 0 [15,13,0]p15 2.11 0 [16,6,1]p16 2.12 0 [10,3,9]p10 2.13 0 [17,6,3]p17 2.14 0 [8,13,17]p8 2.15 0 [19,1,11]p19 2.16 0 [8,12,17]p8 2.17 0 [6,14,2]p6 2.18 0 [18,9,12]p18 2.19 0 [3,6,13]p3 2.1a 0 [6,14,2]p6 2.1b 0 [11,7,17]p11 2.1c 0 [10,7,1]p10 2.1d 0 [15,10,7]p15 2.1e 0 [3,13,15]p3 2.1f 0 [4,7,14]p4 * NOTE: afterwards cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd tree #檢視osd與儲存伺服器的對應關係 ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF -1 1.95374 root default -5 0.48843 host ceph-node2 5 hdd 0.09769 osd.5 up 1.00000 1.00000 6 hdd 0.09769 osd.6 up 1.00000 1.00000 7 hdd 0.09769 osd.7 up 1.00000 1.00000 8 hdd 0.09769 osd.8 up 1.00000 1.00000 9 hdd 0.09769 osd.9 up 1.00000 1.00000 -7 0.48843 host ceph-node3 10 hdd 0.09769 osd.10 up 1.00000 1.00000 11 hdd 0.09769 osd.11 up 1.00000 1.00000 12 hdd 0.09769 osd.12 up 1.00000 1.00000 13 hdd 0.09769 osd.13 up 1.00000 1.00000 14 hdd 0.09769 osd.14 up 1.00000 1.00000 -9 0.48843 host ceph-node4 15 hdd 0.09769 osd.15 up 1.00000 1.00000 16 hdd 0.09769 osd.16 up 1.00000 1.00000 17 hdd 0.09769 osd.17 up 1.00000 1.00000 18 hdd 0.09769 osd.18 up 1.00000 1.00000 19 hdd 0.09769 osd.19 up 1.00000 1.00000 -3 0.48843 host cepn-node1 0 hdd 0.09769 osd.0 up 1.00000 1.00000 1 hdd 0.09769 osd.1 up 1.00000 1.00000 2 hdd 0.09769 osd.2 up 1.00000 1.00000 3 hdd 0.09769 osd.3 up 1.00000 1.00000 4 hdd 0.09769 osd.4 up 1.00000 1.00000 cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd pool ls #檢視當前儲存池 device_health_metrics mypool 
cephadmin@ceph-deploy:~/ceph-cluster$ rados lspools   # list the current pools
device_health_metrics
mypool
The cluster does not yet have any clients consuming Ceph as block storage, as a filesystem, or through the object-storage gateway, but the rados command can already access Ceph object storage directly:
Upload a file:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados put msg1 /var/log/syslog --pool=mypool   # upload the file to mypool with object ID msg1
List objects:
cephadmin@ceph-deploy:~/ceph-cluster$ rados ls --pool=mypool
msg1
Object information:
The ceph osd map command shows where a data object in a pool is actually placed:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd map mypool msg1
osdmap e131 pool 'mypool' (2) object 'msg1' -> pg 2.c833d430 (2.10) -> up ([15,13,0], p15) acting ([15,13,0], p15)
2.c833d430: the object's hash value is 0xc833d430 and it belongs to the pool with ID 2
2.10: the object is stored in PG 0x10 of the pool with ID 2 (the hash masked down to one of the pool's 32 PGs)
[15,13,0], p15: the OSD set. The primary OSD is 15 and the acting OSDs are 15, 13 and 0; three OSDs means the data is kept as 3 replicas. Ceph's CRUSH algorithm computes which OSDs hold the three copies for this PG
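The hash-to-PG step can be sketched in a few lines of Python (a toy illustration, not Ceph's actual rjenkins hash code): when pg_num is a power of two, taking the object hash modulo pg_num reduces to a bitmask, which is exactly how 0xc833d430 collapses to PG 2.10 above.

```python
pg_num = 32                      # mypool was created with 32 PGs
obj_hash = 0xc833d430            # hash value reported by `ceph osd map` for msg1
pgid = obj_hash & (pg_num - 1)   # hash(oid) & mask -> pgid, as in the read/write flow
print(f"2.{pgid:x}")             # prints 2.10, matching the PG shown above
```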
Download the object:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados get msg1 --pool=mypool /opt/my.txt
cephadmin@ceph-deploy:~/ceph-cluster$ ll /opt/
total 1840
drwxr-xr-x  2 root root    4096 Aug 20 15:23 ./
drwxr-xr-x 23 root root    4096 Aug 18 18:32 ../
-rw-r--r--  1 root root 1873597 Aug 20 15:23 my.txt
# verify the downloaded file:
cephadmin@ceph-deploy:~/ceph-cluster$ head /opt/my.txt
Aug 18 18:33:40 ceph-deploy systemd-modules-load[484]: Inserted module 'iscsi_tcp'
Aug 18 18:33:40 ceph-deploy systemd-modules-load[484]: Inserted module 'ib_iser'
Aug 18 18:33:40 ceph-deploy systemd[1]: Starting Flush Journal to Persistent Storage...
Aug 18 18:33:40 ceph-deploy systemd[1]: Started Load/Save Random Seed.
Aug 18 18:33:40 ceph-deploy systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
Aug 18 18:33:40 ceph-deploy systemd[1]: Started udev Kernel Device Manager.
Aug 18 18:33:40 ceph-deploy systemd[1]: Started Set the console keyboard layout.
Aug 18 18:33:40 ceph-deploy systemd[1]: Reached target Local File Systems (Pre).
Aug 18 18:33:40 ceph-deploy systemd[1]: Reached target Local File Systems.
Aug 18 18:33:40 ceph-deploy systemd[1]: Starting Tell Plymouth To Write Out Runtime Data...
Overwrite the object:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados put msg1 /etc/passwd --pool=mypool
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados get msg1 --pool=mypool /opt/my1.txt
# verify the overwritten object
cephadmin@ceph-deploy:~/ceph-cluster$ tail /opt/my1.txt
_apt:x:104:65534::/nonexistent:/usr/sbin/nologin
lxd:x:105:65534::/var/lib/lxd/:/bin/false
uuidd:x:106:110::/run/uuidd:/usr/sbin/nologin
dnsmasq:x:107:65534:dnsmasq,,,:/var/lib/misc:/usr/sbin/nologin
landscape:x:108:112::/var/lib/landscape:/usr/sbin/nologin
sshd:x:109:65534::/run/sshd:/usr/sbin/nologin
pollinate:x:110:1::/var/cache/pollinate:/bin/false
wang:x:1000:1000:wang,,,:/home/wang:/bin/bash
cephadmin:x:2022:2022::/home/cephadmin:/bin/bash
ceph:x:64045:64045:Ceph storage service:/var/lib/ceph:/usr/sbin/nologin
Delete the object:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados rm msg1 --pool=mypool
cephadmin@ceph-deploy:~/ceph-cluster$ rados ls --pool=mypool
cephadmin@ceph-deploy:~/ceph-cluster$
12. Scaling out the Ceph cluster for high availability:
This mainly means adding mon and mgr nodes so the cluster remains available when a node fails.
Scale out the ceph-mon nodes:
ceph-mon has built-in leader election for high availability; the number of mon nodes should normally be odd (1, 3, 5, 7, ...).
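The odd-count rule can be made concrete with a small sketch (an illustration, not Ceph code): a monitor quorum needs a strict majority, so an even member adds cost without extra failure tolerance.

```python
def tolerated_failures(mons: int) -> int:
    """Monitors that may fail while a strict majority (quorum) remains."""
    return (mons - 1) // 2

for n in (1, 2, 3, 4, 5):
    print(f"{n} mons -> survives {tolerated_failures(n)} failure(s)")
# 3 mons and 4 mons both survive only one failure, which is why mon
# counts are kept odd
```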
root@ceph-mon2:~# apt -y install ceph-mon
root@ceph-mon3:~# apt -y install ceph-mon
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mon add ceph-mon2
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mon add ceph-mon3
Verify the ceph-mon status:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph quorum_status --format json-pretty
{
    "election_epoch": 14,
    "quorum": [0, 1, 2],
    "quorum_names": ["ceph-mon1", "ceph-mon2", "ceph-mon3"],
    "quorum_leader_name": "ceph-mon1",        # current leader
    "quorum_age": 304,
    "features": {
        "quorum_con": "4540138297136906239",
        "quorum_mon": ["kraken", "luminous", "mimic", "osdmap-prune", "nautilus", "octopus", "pacific", "elector-pinging"]
    },
    "monmap": {
        "epoch": 3,
        "fsid": "0d11d338-a480-40da-8520-830423b22c3e",
        "modified": "2021-08-20T07:39:56.803507Z",
        "created": "2021-08-18T07:55:40.349602Z",
        "min_mon_release": 16,
        "min_mon_release_name": "pacific",
        "election_strategy": 1,
        "disallowed_leaders: ": "",
        "stretch_mode": false,
        "features": {
            "persistent": ["kraken", "luminous", "mimic", "osdmap-prune", "nautilus", "octopus", "pacific", "elector-pinging"],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,                # rank of this node
                "name": "ceph-mon1",      # name of this node
                "public_addrs": {
                    "addrvec": [
                        {"type": "v2", "addr": "10.0.0.101:3300", "nonce": 0},
                        {"type": "v1", "addr": "10.0.0.101:6789", "nonce": 0}
                    ]
                },
                "addr": "10.0.0.101:6789/0",          # listen address
                "public_addr": "10.0.0.101:6789/0",   # listen address
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            },
            {
                "rank": 1,
                "name": "ceph-mon2",
                "public_addrs": {
                    "addrvec": [
                        {"type": "v2", "addr": "10.0.0.102:3300", "nonce": 0},
                        {"type": "v1", "addr": "10.0.0.102:6789", "nonce": 0}
                    ]
                },
                "addr": "10.0.0.102:6789/0",
                "public_addr": "10.0.0.102:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            },
            {
                "rank": 2,
                "name": "ceph-mon3",
                "public_addrs": {
                    "addrvec": [
                        {"type": "v2", "addr": "10.0.0.103:3300", "nonce": 0},
                        {"type": "v1", "addr": "10.0.0.103:6789", "nonce": 0}
                    ]
                },
                "addr": "10.0.0.103:6789/0",
                "public_addr": "10.0.0.103:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            }
        ]
    }
}
cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     0d11d338-a480-40da-8520-830423b22c3e
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 28m)
    mgr: ceph-mgr1(active, since 6h)
    osd: 20 osds: 20 up (since 6h), 20 in (since 45h)

  data:
    pools:   2 pools, 33 pgs
    objects: 0 objects, 0 B
    usage:   165 MiB used, 2.0 TiB / 2.0 TiB avail
    pgs:     33 active+clean
Scale out the mgr nodes:
root@ceph-mgr2:~# apt -y install ceph-mgr
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mgr create ceph-mgr2
Verify the mgr node status:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s
  cluster:
    id:     0d11d338-a480-40da-8520-830423b22c3e
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 33m)
    mgr: ceph-mgr1(active, since 6h), standbys: ceph-mgr2
    osd: 20 osds: 20 up (since 6h), 20 in (since 45h)

  data:
    pools:   2 pools, 33 pgs
    objects: 0 objects, 0 B
    usage:   165 MiB used, 2.0 TiB / 2.0 TiB avail
    pgs:     33 active+clean
IV. RBD block devices
RBD (RADOS Block Device) is Ceph's block-storage interface. RBD talks to the OSDs through the librbd library and provides a high-performance, virtually unlimited-scalability storage backend for virtualization platforms such as KVM and cloud stacks such as OpenStack and CloudStack; those systems integrate with RBD via libvirt and QEMU. Any client built on librbd can use the RADOS cluster as a block device. A pool intended for RBD must first have the rbd application enabled and then be initialized.
Create an RBD pool:
$ ceph osd pool create <pool> [<pg_num:int>] [<pgp_num:int>] [replicated|erasure]   # command format for creating a pool
cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd pool create myrbd1 64 64   # create the pool with 64 PGs and 64 PGPs; PGP governs how PG data is combined for placement on OSDs and is normally equal to pg_num
pool 'myrbd1' created
cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd pool application enable myrbd1 rbd   # enable the rbd application on the pool
enabled application 'rbd' on pool 'myrbd1'
cephadmin@ceph-deploy:~/ceph-cluster$ rbd pool init -p myrbd1   # initialize the pool with the rbd command
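A widely used rule of thumb for choosing pg_num (an assumption on my part, not something this guide prescribes) is roughly 100 PGs per OSD divided by the replica count, rounded to a power of two. A sketch of that heuristic:

```python
def suggested_pg_total(osds: int, replicas: int = 3, per_osd: int = 100) -> int:
    """Rule-of-thumb total PG count across all pools, rounded to a power of two."""
    raw = osds * per_osd / replicas
    power = 1
    while power * 2 <= raw:
        power *= 2
    # round to the nearest power of two
    return power * 2 if raw - power > power * 2 - raw else power

# This lab has 20 OSDs with 3-way replication:
print(suggested_pg_total(20))   # prints 512 (to be split across the pools)
```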
Create and verify images:
An RBD pool cannot be used as a block device directly; images must first be created in it on demand, and the image file is what gets used as the block device. The rbd command creates, lists and deletes the images backing block devices, and also handles management operations such as cloning images, creating snapshots, rolling an image back to a snapshot, and listing snapshots.
cephadmin@ceph-deploy:~/ceph-cluster$ rbd create myimg1 --size 5G --pool myrbd1
cephadmin@ceph-deploy:~/ceph-cluster$ rbd create myimg2 --size 3G --pool myrbd1 --image-format 2 --image-feature layering
# the CentOS client's kernel is too old to map an image with the default feature set, so only some features are enabled;
# features other than layering require a newer kernel
cephadmin@ceph-deploy:~/ceph-cluster$ rbd ls --pool myrbd1   # list all images in the given pool
myimg1
myimg2
cephadmin@ceph-deploy:~/ceph-cluster$ rbd --image myimg1 --pool myrbd1 info   # show the details of an image
rbd image 'myimg1':
	size 5 GiB in 1280 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 38ee810c4674
	block_name_prefix: rbd_data.38ee810c4674
	format: 2
	features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
	op_features:
	flags:
	create_timestamp: Fri Aug 20 18:08:52 2021
	access_timestamp: Fri Aug 20 18:08:52 2021
	modify_timestamp: Fri Aug 20 18:08:52 2021
cephadmin@ceph-deploy:~/ceph-cluster$ rbd --image myimg2 --pool myrbd1 info
rbd image 'myimg2':
	size 3 GiB in 768 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 38f7fea54b29
	block_name_prefix: rbd_data.38f7fea54b29
	format: 2
	features: layering
	op_features:
	flags:
	create_timestamp: Fri Aug 20 18:09:51 2021
	access_timestamp: Fri Aug 20 18:09:51 2021
	modify_timestamp: Fri Aug 20 18:09:51 2021
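The `rbd info` output is easy to verify arithmetically: `order 22` means each backing RADOS object is 2^22 bytes (4 MiB), so the object counts follow directly from the image sizes.

```python
order = 22
obj_size = 2 ** order           # size of each backing RADOS object, in bytes
print(obj_size // 2**20)        # prints 4 (MiB), matching "order 22 (4 MiB objects)"
print(5 * 2**30 // obj_size)    # myimg1: prints 1280 objects for a 5 GiB image
print(3 * 2**30 // obj_size)    # myimg2: prints 768 objects for a 3 GiB image
```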
Using the block storage from a client:
1. Check the current Ceph status:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph df
--- RAW STORAGE ---
CLASS    SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    2.0 TiB  2.0 TiB  169 MiB   169 MiB          0
TOTAL  2.0 TiB  2.0 TiB  169 MiB   169 MiB          0

--- POOLS ---
POOL                   ID  PGS  STORED  OBJECTS  USED    %USED  MAX AVAIL
device_health_metrics   1    1     0 B        0     0 B      0    633 GiB
mypool                  2   32     0 B        0     0 B      0    633 GiB
myrbd1                  3   64   405 B        7  48 KiB      0    633 GiB
2. Install ceph-common on the client:
[root@ceph-client1 ~]# yum install epel-release   # configure the yum repos
[root@ceph-client1 ~]# yum -y install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm
[root@ceph-client1 ~]# yum -y install ceph-common
# sync the authentication files from the deploy node
cephadmin@ceph-deploy:~/ceph-cluster$ scp ceph.conf ceph.client.admin.keyring 10.0.0.71:/etc/ceph
3. Map the images on the client:
[root@ceph-client1 ~]# rbd -p myrbd1 map myimg2
/dev/rbd0
[root@ceph-client1 ~]# rbd -p myrbd1 map myimg1
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable myrbd1/myimg1 object-map fast-diff deep-flatten".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address
[root@ceph-client1 ~]# rbd feature disable myrbd1/myimg1 object-map fast-diff deep-flatten
[root@ceph-client1 ~]# rbd -p myrbd1 map myimg1
/dev/rbd1
4. Verify the RBD devices on the client:
[root@ceph-client1 ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  200G  0 disk
├─sda1   8:1    0    1G  0 part /boot
├─sda2   8:2    0  100G  0 part /
├─sda3   8:3    0   50G  0 part /data
├─sda4   8:4    0    1K  0 part
└─sda5   8:5    0    4G  0 part [SWAP]
sr0     11:0    1 1024M  0 rom
rbd0   253:0    0    3G  0 disk
rbd1   253:16   0    5G  0 disk
5. Format the devices and mount them on the client:
[root@ceph-client1 ~]# mkfs.ext4 /dev/rbd0
mke2fs 1.42.9 (28-Dec-2013)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=1024 blocks, Stripe width=1024 blocks
196608 inodes, 786432 blocks
39321 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=805306368
24 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

[root@ceph-client1 ~]# mkfs.xfs /dev/rbd1
Discarding blocks...Done.
meta-data=/dev/rbd1              isize=512    agcount=8, agsize=163840 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=1310720, imaxpct=25
         =                       sunit=1024   swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@ceph-client1 ~]# mount /dev/rbd0 /mnt/
[root@ceph-client1 ~]# df -TH
Filesystem     Type      Size  Used Avail Use% Mounted on
devtmpfs       devtmpfs  943M     0  943M   0% /dev
tmpfs          tmpfs     954M     0  954M   0% /dev/shm
tmpfs          tmpfs     954M   11M  944M   2% /run
tmpfs          tmpfs     954M     0  954M   0% /sys/fs/cgroup
/dev/sda2      xfs       108G  5.2G  103G   5% /
/dev/sda3      xfs        54G  175M   54G   1% /data
/dev/sda1      xfs       1.1G  150M  915M  15% /boot
tmpfs          tmpfs     191M     0  191M   0% /run/user/0
/dev/rbd0      ext4      3.2G  9.5M  3.0G   1% /mnt
[root@ceph-client1 ~]# mkdir /data
[root@ceph-client1 ~]# mount /dev/rbd1 /data/
[root@ceph-client1 ~]# df -TH
Filesystem     Type      Size  Used Avail Use% Mounted on
devtmpfs       devtmpfs  943M     0  943M   0% /dev
tmpfs          tmpfs     954M     0  954M   0% /dev/shm
tmpfs          tmpfs     954M   10M  944M   2% /run
tmpfs          tmpfs     954M     0  954M   0% /sys/fs/cgroup
/dev/sda2      xfs       108G  5.2G  103G   5% /
/dev/rbd1      xfs       5.4G   35M  5.4G   1% /data
/dev/sda1      xfs       1.1G  150M  915M  15% /boot
tmpfs          tmpfs     191M     0  191M   0% /run/user/0
/dev/rbd0      ext4      3.2G  9.5M  3.0G   1% /mnt
6. Write test data from the client:
[root@ceph-client1 data]# dd if=/dev/zero of=/data/ceph-test-file bs=1MB count=300
300+0 records in
300+0 records out
300000000 bytes (300 MB) copied, 1.61684 s, 186 MB/s
[root@ceph-client1 data]# file /data/ceph-test-file
/data/ceph-test-file: data
[root@ceph-client1 data]# ll -h /data/ceph-test-file
-rw-r--r--. 1 root root 287M Aug 23 12:27 /data/ceph-test-file
7. Verify the data on the Ceph side:
cephadmin@cepn-node1:~$ ceph df
--- RAW STORAGE ---
CLASS    SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    2.0 TiB  2.0 TiB  2.3 GiB   2.3 GiB       0.11
TOTAL  2.0 TiB  2.0 TiB  2.3 GiB   2.3 GiB       0.11

--- POOLS ---
POOL                   ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1    1      0 B        0      0 B      0    632 GiB
myrbd1                  2   64  363 MiB      115  1.1 GiB   0.06    632 GiB
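These numbers are consistent with 3-way replication: the raw USED figure is roughly STORED times the replica count (plus metadata overhead), which is why 363 MiB stored appears as about 1.1 GiB used.

```python
stored_mib = 363                        # STORED for myrbd1 in `ceph df`
replicas = 3                            # default size of a replicated pool
raw_gib = stored_mib * replicas / 1024  # raw space consumed across the 3 OSD copies
print(round(raw_gib, 2))                # prints 1.06, in line with USED = 1.1 GiB
```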