We need storage inside k8s, and files should survive after a container is deleted, so we want a distributed storage system; the most popular one at the moment is Ceph.
This post is a running log of deploying Ceph.
Tool selection
Reference: https://docs.ceph.com/en/latest/install/
The methods recommended on that page:
cephadm - uses containers
- the automated deployment pulls packages from docker.io; even if you import the images offline first it still reaches out to fetch the latest commit id, so it is unusable offline
rook - uses containers
ceph-ansible - uses Ansible
- which means the server side needs Ansible installed
- does not integrate the orchestrator API added in the Nautilus and Octopus releases, so the new management features and the dashboard integration are unavailable
ceph-deploy
- a tool for quickly deploying Ceph
- no longer actively maintained; not tested upstream on releases after Nautilus; does not support RHEL 8, CentOS 8, or newer operating systems
Our machines run CentOS 7. After much hesitation, compared with installing by hand I went with ceph-deploy, which supports up to Nautilus.
Host
10.101.235.84 ceph-1
- sda 500G
- data 3T raid0 x 12 ( sdb - sdm)
- mem 128G
- cpu 24c
10.101.235.217 ceph-2
- sda 446G
- data 6T raid0 x 12 ( sdb - sdm)
- mem 256G
- cpu 32c
10.101.235.252 ceph-3
- sda 414G
- sdb 60T 12*6T raid5 (RAID not redone yet)
- mem 256G
- cpu 32c
Preparation
- host
hostnamectl set-hostname ceph-1
hostnamectl set-hostname ceph-2
hostnamectl set-hostname ceph-3
Write them into /etc/hosts:
10.101.235.84 ceph-1
10.101.235.217 ceph-2
10.101.235.38 ceph-3
- passwordless SSH
ssh-keygen
ssh-copy-id ceph-1
ssh-copy-id ceph-2
ssh-copy-id ceph-3
- security settings
Disable SELinux and the firewall:
for i in {1..3};do echo $i;ssh ceph-$i "systemctl disable --now firewalld";done
for i in {1..3};do echo $i;ssh ceph-$i "setenforce 0";done
for i in {1..3};do echo $i;ssh ceph-$i "sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config";done
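To double-check that these took effect on every node, a quick verification loop in the same style can be run (my addition, not part of the original steps):
for i in {1..3};do
  echo ceph-$i
  # expect Permissive (until reboot), firewalld disabled and inactive
  ssh ceph-$i "getenforce; systemctl is-enabled firewalld; systemctl is-active firewalld"
done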
- ntp
On ceph-1:
vi /etc/ntp.conf
server 10.110.38.240 minpoll 4 maxpoll 5
Check the sync status with ntpq -pn:
[root@ceph-1 ceph-cluster]# ntpq -pn
remote refid st t when poll reach delay offset jitter
==============================================================================
*10.110.38.240 10.108.84.45 3 u 10 16 377 0.901 -283.35 0.407
On ceph-2 and ceph-3, point to ceph-1's address 10.101.235.84:
vi /etc/ntp.conf
server 10.101.235.84 iburst
ntpq -pn
Check the sync status; once synchronization completes, a * appears in front of the server IP.
In production it is best to configure more than one NTP server.
for i in {1..3};do echo $i;ssh ceph-$i "date";done
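Beyond eyeballing date, a sketch for confirming each node is actually syncing (assumes ntpd is already running on all three nodes; not part of the original log):
for i in {1..3};do
  echo ceph-$i
  # the peer line prefixed with * is the one currently selected for synchronization
  ssh ceph-$i "ntpq -pn | tail -n +3"
done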
- yum repo
ceph-deploy version 2.0.0+
http://download.ceph.com/rpm-nautilus/el7/noarch/
ceph.repo:
[ceph]
name=ceph
baseurl=http://10.110.38.20/ceph-nautilus/
enabled=1
gpgcheck=0
yum makecache
- install ceph-deploy
On ceph-1:
yum install python-setuptools ceph-deploy
ceph-deploy --version
deploy
On ceph-1:
mkdir my-cluster
cd my-cluster
This step generates ceph.conf and the keyring:
ceph-deploy new --cluster-network=10.101.235.1/24 ceph-1
[root@ceph-1 ceph]# ceph-deploy new --cluster-network=10.101.235.1/24 ceph-1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy new --cluster-network=10.101.235.1/24 ceph-1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] func : <function new at 0x1bb8c80>
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1c19f80>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] ssh_copykey : True
[ceph_deploy.cli][INFO ] mon : ['ceph-1']
[ceph_deploy.cli][INFO ] public_network : None
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster_network : 10.101.235.1/24
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] fsid : None
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO ] making sure passwordless SSH succeeds
[ceph-1][DEBUG ] connected to host: ceph-1
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] find the location of an executable
[ceph-1][INFO ] Running command: /usr/sbin/ip link show
[ceph-1][INFO ] Running command: /usr/sbin/ip addr show
[ceph-1][DEBUG ] IP addresses found: [u'10.101.235.84']
[ceph_deploy.new][DEBUG ] Resolving host ceph-1
[ceph_deploy.new][DEBUG ] Monitor ceph-1 at 10.101.235.84
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-1']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['10.101.235.84']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
ceph-deploy install {ceph-node} [...]
would automatically rewrite the yum repos, so here the packages are installed manually instead:
yum install -y ceph ceph-mon ceph-mgr ceph-radosgw ceph-mds
Error: Package: librdkafka-0.11.5-1.el7.x86_64 (ceph)
On a machine with Internet access:
yumdownloader --resolve libsemanage.x86_64
yumdownloader --resolve libsepol.x86_64
yumdownloader --resolve libselinux-utils
yumdownloader --resolve libselinux
yumdownloader --resolve lz4
liblz4 https://serverfault.com/questions/917688/unable-to-update-centos-7-yum-update-broken
Put the RPMs into one directory:
yum install *
If you yum install them one at a time it keeps complaining that this or that dependency is wrong; it is not that smart, they have to be installed together.
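An alternative sketch: the downloaded RPMs can be served as a small local repo so yum resolves the dependencies itself (createrepo and the web-root path below are my assumptions, not part of the original setup):
# on the Internet-connected machine, after the yumdownloader runs above
mkdir -p /var/www/html/ceph-deps
cp *.rpm /var/www/html/ceph-deps/
createrepo /var/www/html/ceph-deps/   # writes repodata/ so the directory can be used as a yum baseurl
# then point a .repo file on the ceph nodes at that URL and run yum makecache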
Install on nodes 1, 2 and 3 (or push it from ceph-1 with the loop sketched after the command):
yum install -y ceph ceph-mon ceph-mgr ceph-radosgw ceph-mds
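A convenience sketch in the same ssh-loop style used earlier (the original simply ran yum on each node):
for i in {1..3};do
  echo ceph-$i
  ssh ceph-$i "yum install -y ceph ceph-mon ceph-mgr ceph-radosgw ceph-mds"
done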
On node 1:
cd my-cluster
ceph-deploy mon create-initial # initialize the mon on node 1
[root@ceph-1 ceph]# ceph-deploy mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy mon create-initial
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create-initial
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1784248>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mon at 0x177b2a8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] keyrings : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-1
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-1 ...
[ceph-1][DEBUG ] connected to host: ceph-1
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS Linux 7.4.1708 Core
[ceph-1][DEBUG ] determining if provided host has same hostname in remote
[ceph-1][DEBUG ] get remote short hostname
[ceph-1][DEBUG ] deploying mon to ceph-1
[ceph-1][DEBUG ] get remote short hostname
[ceph-1][DEBUG ] remote hostname: ceph-1
[ceph-1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-1][DEBUG ] create the mon path if it does not exist
[ceph-1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-1/done
[ceph-1][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-1/done
[ceph-1][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-1.mon.keyring
[ceph-1][DEBUG ] create the monitor keyring file
[ceph-1][INFO ] Running command: ceph-mon --cluster ceph --mkfs -i ceph-1 --keyring /var/lib/ceph/tmp/ceph-ceph-1.mon.keyring --setuser 167 --setgroup 167
[ceph-1][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-1.mon.keyring
[ceph-1][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-1][DEBUG ] create the init path if it does not exist
[ceph-1][INFO ] Running command: systemctl enable ceph.target
[ceph-1][INFO ] Running command: systemctl enable ceph-mon@ceph-1
[ceph-1][WARNIN] Created symlink from /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-1.service to /usr/lib/systemd/system/ceph-mon@.service
[ceph-1][INFO ] Running command: systemctl start ceph-mon@ceph-1
[ceph-1][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-1.asok mon_status
[ceph-1][DEBUG ] ********************************************************************************
[ceph-1][DEBUG ] status for monitor: mon.ceph-1
[ceph-1][DEBUG ] {
[ceph-1][DEBUG ] "election_epoch": 3,
[ceph-1][DEBUG ] "extra_probe_peers": [],
[ceph-1][DEBUG ] "feature_map": {
[ceph-1][DEBUG ] "mon": [
[ceph-1][DEBUG ] {
[ceph-1][DEBUG ] "features": "0x3ffddff8ffecffff",
[ceph-1][DEBUG ] "num": 1,
[ceph-1][DEBUG ] "release": "luminous"
[ceph-1][DEBUG ] }
[ceph-1][DEBUG ] ]
[ceph-1][DEBUG ] },
[ceph-1][DEBUG ] "features": {
[ceph-1][DEBUG ] "quorum_con": "4611087854035861503",
[ceph-1][DEBUG ] "quorum_mon": [
[ceph-1][DEBUG ] "kraken",
[ceph-1][DEBUG ] "luminous",
[ceph-1][DEBUG ] "mimic",
[ceph-1][DEBUG ] "osdmap-prune",
[ceph-1][DEBUG ] "nautilus"
[ceph-1][DEBUG ] ],
[ceph-1][DEBUG ] "required_con": "2449958747315912708",
[ceph-1][DEBUG ] "required_mon": [
[ceph-1][DEBUG ] "kraken",
[ceph-1][DEBUG ] "luminous",
[ceph-1][DEBUG ] "mimic",
[ceph-1][DEBUG ] "osdmap-prune",
[ceph-1][DEBUG ] "nautilus"
[ceph-1][DEBUG ] ]
[ceph-1][DEBUG ] },
[ceph-1][DEBUG ] "monmap": {
[ceph-1][DEBUG ] "created": "2021-07-21 17:31:24.761362",
[ceph-1][DEBUG ] "epoch": 1,
[ceph-1][DEBUG ] "features": {
[ceph-1][DEBUG ] "optional": [],
[ceph-1][DEBUG ] "persistent": [
[ceph-1][DEBUG ] "kraken",
[ceph-1][DEBUG ] "luminous",
[ceph-1][DEBUG ] "mimic",
[ceph-1][DEBUG ] "osdmap-prune",
[ceph-1][DEBUG ] "nautilus"
[ceph-1][DEBUG ] ]
[ceph-1][DEBUG ] },
[ceph-1][DEBUG ] "fsid": "d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb",
[ceph-1][DEBUG ] "min_mon_release": 14,
[ceph-1][DEBUG ] "min_mon_release_name": "nautilus",
[ceph-1][DEBUG ] "modified": "2021-07-21 17:31:24.761362",
[ceph-1][DEBUG ] "mons": [
[ceph-1][DEBUG ] {
[ceph-1][DEBUG ] "addr": "10.101.235.84:6789/0",
[ceph-1][DEBUG ] "name": "ceph-1",
[ceph-1][DEBUG ] "public_addr": "10.101.235.84:6789/0",
[ceph-1][DEBUG ] "public_addrs": {
[ceph-1][DEBUG ] "addrvec": [
[ceph-1][DEBUG ] {
[ceph-1][DEBUG ] "addr": "10.101.235.84:3300",
[ceph-1][DEBUG ] "nonce": 0,
[ceph-1][DEBUG ] "type": "v2"
[ceph-1][DEBUG ] },
[ceph-1][DEBUG ] {
[ceph-1][DEBUG ] "addr": "10.101.235.84:6789",
[ceph-1][DEBUG ] "nonce": 0,
[ceph-1][DEBUG ] "type": "v1"
[ceph-1][DEBUG ] }
[ceph-1][DEBUG ] ]
[ceph-1][DEBUG ] },
[ceph-1][DEBUG ] "rank": 0
[ceph-1][DEBUG ] }
[ceph-1][DEBUG ] ]
[ceph-1][DEBUG ] },
[ceph-1][DEBUG ] "name": "ceph-1",
[ceph-1][DEBUG ] "outside_quorum": [],
[ceph-1][DEBUG ] "quorum": [
[ceph-1][DEBUG ] 0
[ceph-1][DEBUG ] ],
[ceph-1][DEBUG ] "quorum_age": 2,
[ceph-1][DEBUG ] "rank": 0,
[ceph-1][DEBUG ] "state": "leader",
[ceph-1][DEBUG ] "sync_provider": []
[ceph-1][DEBUG ] }
[ceph-1][DEBUG ] ********************************************************************************
[ceph-1][INFO ] monitor: mon.ceph-1 is running
[ceph-1][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-1.asok mon_status
[ceph_deploy.mon][INFO ] processing monitor mon.ceph-1
[ceph-1][DEBUG ] connected to host: ceph-1
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] find the location of an executable
[ceph-1][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-1.asok mon_status
[ceph_deploy.mon][INFO ] mon.ceph-1 monitor has reached quorum!
[ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum
[ceph_deploy.mon][INFO ] Running gatherkeys...
[ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmplLWg_z
[ceph-1][DEBUG ] connected to host: ceph-1
[ceph-1][DEBUG ] detect platform information from remote host
[ceph-1][DEBUG ] detect machine type
[ceph-1][DEBUG ] get remote short hostname
[ceph-1][DEBUG ] fetch remote file
[ceph-1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph-1.asok mon_status
[ceph-1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-1/keyring auth get client.admin
[ceph-1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-1/keyring auth get client.bootstrap-mds
[ceph-1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-1/keyring auth get client.bootstrap-mgr
[ceph-1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-1/keyring auth get client.bootstrap-osd
[ceph-1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-1/keyring auth get client.bootstrap-rgw
[ceph_deploy.gatherkeys][INFO ] Storing ceph.client.admin.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mgr.keyring
[ceph_deploy.gatherkeys][INFO ] keyring 'ceph.mon.keyring' already exists
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-rgw.keyring
[ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmplLWg_z
A bunch of keyrings were generated.
Push the admin keyring and config to all nodes:
ceph-deploy admin ceph-1 ceph-2 ceph-3
[root@ceph-1 ceph-cluster]# ceph-deploy admin ceph-1 ceph-2 ceph-3
Run ceph -s:
[root@ceph-1 ceph-cluster]# ceph -s
cluster:
id: d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
health: HEALTH_WARN
mon is allowing insecure global_id reclaim
services:
mon: 1 daemons, quorum ceph-1 (age 3h)
mgr: no daemons active
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
It warns: mon is allowing insecure global_id reclaim.
Fix:
ceph config set mon auth_allow_insecure_global_id_reclaim false
ceph -s
[root@ceph-1 ceph-cluster]# ceph -s
cluster:
id: d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
health: HEALTH_OK
services:
mon: 1 daemons, quorum ceph-1 (age 3h)
mgr: no daemons active
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
There is one monitor, ceph-1; no manager yet.
Contents of the cluster directory at this point:
[root@ceph-1 ceph-cluster]# ll
total 48
-rw------- 1 root root 113 Jul 21 17:31 ceph.bootstrap-mds.keyring
-rw------- 1 root root 113 Jul 21 17:31 ceph.bootstrap-mgr.keyring
-rw------- 1 root root 113 Jul 21 17:31 ceph.bootstrap-osd.keyring
-rw------- 1 root root 113 Jul 21 17:31 ceph.bootstrap-rgw.keyring
-rw------- 1 root root 151 Jul 21 17:31 ceph.client.admin.keyring
-rw-r--r-- 1 root root 231 Jul 21 15:34 ceph.conf
-rw-r--r-- 1 root root 17956 Jul 21 21:09 ceph-deploy-ceph.log
-rw------- 1 root root 73 Jul 21 15:34 ceph.mon.keyring
Make ceph-1 the mgr:
ceph-deploy mgr create ceph-1
[root@ceph-1 ceph-cluster]# ceph-deploy mgr create ceph-1
ceph -s
[root@ceph-1 ceph-cluster]# ceph -s
cluster:
id: d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
health: HEALTH_OK
services:
mon: 1 daemons, quorum ceph-1 (age 4h)
mgr: ceph-1(active, since 2s)
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
Prepare OSDs
Delete the old partitions first:
umount the filesystem
remove its entry from /etc/fstab
fdisk /dev/sdb
d
w
ceph-deploy osd create ceph-1 --data /dev/sdb
ceph-deploy osd create ceph-2 --data /dev/sdb
ceph-deploy osd create ceph-3 --data /dev/sdb
[root@ceph-1 ceph-cluster]# ceph-deploy osd create ceph-1 --data /dev/sdb
[ceph-1][WARNIN] ceph-volume lvm create: error: GPT headers found, they must be removed on: /dev/sdb
https://www.jianshu.com/p/d7fcf1cb5a48
The key part of the error is GPT headers found, they must be removed. It probably happens because the disk was partitioned before: the partitions were deleted, but the GPT data structures are still there, and sgdisk can wipe them:
#sgdisk --zap-all /dev/sdX
sgdisk is not in our yum repo, though.
https://zhuanlan.zhihu.com/p/73479251
As you can see, a GPT partition table was found on the newly added disk, so adding it failed; the partition table has to be cleaned off manually (if the disk were brand new this would simply have succeeded).
Here we just blow the partition table away directly rather than bothering with PVs and VGs.
Be careful: check again and again that the target disk is the one you intend; run this against the wrong disk and its partition table is gone.
# dd if=/dev/zero of=/dev/sde bs=512K count=1
1+0 records in
1+0 records out
524288 bytes (524 kB) copied, 0.00109677 s, 478 MB/s
This uses dd to fill the first 512K of the disk with zeros, wiping the partition information directly.
Judging from the comments, wipefs -a /dev/sde also works and is faster, and partprobe makes the kernel re-read the partition table without a reboot.
I have not tried those (a sketch follows the dd command below).
dd if=/dev/zero of=/dev/sdb bs=512K count=1
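For reference, the untried alternative from those comments would look roughly like this (a sketch only, using /dev/sdb as in this deployment):
wipefs -a /dev/sdb      # erase filesystem / RAID / partition-table signatures from the disk
partprobe /dev/sdb      # have the kernel re-read the partition table without rebooting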
Continuing:
ceph-deploy osd create ceph-1 --data /dev/sdb
[root@ceph-1 ceph-cluster]# ceph-deploy osd create ceph-1 --data /dev/sdb
Now ceph -s:
[root@ceph-1 ceph-cluster]# ceph -s
ceph-deploy osd create ceph-2 --data /dev/sdb
ceph-deploy osd create ceph-3 --data /dev/sdb
[root@ceph-1 ceph-cluster]# ceph -s
At this point the following warning is no longer shown:
HEALTH_WARN
OSD count 1 < osd_pool_default_size 3
Check OSD status:
[root@ceph-1 ceph-cluster]# ceph osd status
+----+--------+-------+-------+--------+---------+--------+---------+-----------+
| id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
+----+--------+-------+-------+--------+---------+--------+---------+-----------+
| 0 | ceph-1 | 1025M | 30.0T | 0 | 0 | 0 | 0 | exists,up |
| 1 | ceph-2 | 1025M | 60.0T | 0 | 0 | 0 | 0 | exists,up |
| 2 | ceph-3 | 1025M | 3904G | 0 | 0 | 0 | 0 | exists,up |
+----+--------+-------+-------+--------+---------+--------+---------+-----------+
Expanding mon and mgr
Reference: https://docs.ceph.com/en/octopus/install/ceph-deploy/quick-ceph-deploy/#expanding-your-cluster
The mons need to be highly available: if the mons go down, the whole cluster goes down.
They run Paxos, so deploy an odd number of them.
ceph-deploy mon add ceph-2 ceph-3
It errors out:
[ceph-2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph-2][WARNIN] ceph-2 is not defined in `mon initial members`
[ceph-2][WARNIN] monitor ceph-2 does not exist in monmap
[ceph-2][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[ceph-2][WARNIN] monitors may not be able to form quorum
[ceph-2][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-2.asok mon_status
[ceph-2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph-2][WARNIN] monitor: mon.ceph-2, might not be running yet
Cause: the ceph.conf file is missing a public network setting; the addition is sketched below.
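A minimal sketch of the fix, appending the setting to the generated ceph.conf in the cluster directory (the /24 subnet below is my assumption based on the node addresses):
cat >> ceph.conf <<'EOF'
public network = 10.101.235.0/24
EOF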
After adding it, push the config to every node:
ceph-deploy --overwrite-conf config push ceph-1 ceph-2 ceph-3
ceph-deploy mon add ceph-2
[root@ceph-1 ceph-cluster]# ceph-deploy mon add ceph-2
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy mon add ceph-2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : add
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x22553b0>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] mon : ['ceph-2']
[ceph_deploy.cli][INFO ] func : <function mon at 0x224c2a8>
[ceph_deploy.cli][INFO ] address : None
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: ceph-2
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-2
[ceph-2][DEBUG ] connected to host: ceph-2
[ceph-2][DEBUG ] detect platform information from remote host
[ceph-2][DEBUG ] detect machine type
[ceph-2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host ceph-2
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 10.101.235.217
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-2 ...
[ceph-2][DEBUG ] connected to host: ceph-2
[ceph-2][DEBUG ] detect platform information from remote host
[ceph-2][DEBUG ] detect machine type
[ceph-2][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS Linux 7.4.1708 Core
[ceph-2][DEBUG ] determining if provided host has same hostname in remote
[ceph-2][DEBUG ] get remote short hostname
[ceph-2][DEBUG ] adding mon to ceph-2
[ceph-2][DEBUG ] get remote short hostname
[ceph-2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-2][DEBUG ] create the mon path if it does not exist
[ceph-2][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-2/done
[ceph-2][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-2][DEBUG ] create the init path if it does not exist
[ceph-2][INFO ] Running command: systemctl enable ceph.target
[ceph-2][INFO ] Running command: systemctl enable ceph-mon@ceph-2
[ceph-2][INFO ] Running command: systemctl start ceph-mon@ceph-2
[ceph-2][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-2.asok mon_status
[ceph-2][WARNIN] ceph-2 is not defined in `mon initial members`
[ceph-2][WARNIN] monitor ceph-2 does not exist in monmap
[ceph-2][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-2.asok mon_status
[ceph-2][DEBUG ] ********************************************************************************
[ceph-2][DEBUG ] status for monitor: mon.ceph-2
[ceph-2][DEBUG ] {
[ceph-2][DEBUG ] "election_epoch": 0,
[ceph-2][DEBUG ] "extra_probe_peers": [],
[ceph-2][DEBUG ] "feature_map": {
[ceph-2][DEBUG ] "mon": [
[ceph-2][DEBUG ] {
[ceph-2][DEBUG ] "features": "0x3ffddff8ffecffff",
[ceph-2][DEBUG ] "num": 1,
[ceph-2][DEBUG ] "release": "luminous"
[ceph-2][DEBUG ] }
[ceph-2][DEBUG ] ]
[ceph-2][DEBUG ] },
[ceph-2][DEBUG ] "features": {
[ceph-2][DEBUG ] "quorum_con": "0",
[ceph-2][DEBUG ] "quorum_mon": [],
[ceph-2][DEBUG ] "required_con": "2449958197560098820",
[ceph-2][DEBUG ] "required_mon": [
[ceph-2][DEBUG ] "kraken",
[ceph-2][DEBUG ] "luminous",
[ceph-2][DEBUG ] "mimic",
[ceph-2][DEBUG ] "osdmap-prune",
[ceph-2][DEBUG ] "nautilus"
[ceph-2][DEBUG ] ]
[ceph-2][DEBUG ] },
[ceph-2][DEBUG ] "monmap": {
[ceph-2][DEBUG ] "created": "2021-07-21 17:31:24.761362",
[ceph-2][DEBUG ] "epoch": 1,
[ceph-2][DEBUG ] "features": {
[ceph-2][DEBUG ] "optional": [],
[ceph-2][DEBUG ] "persistent": [
[ceph-2][DEBUG ] "kraken",
[ceph-2][DEBUG ] "luminous",
[ceph-2][DEBUG ] "mimic",
[ceph-2][DEBUG ] "osdmap-prune",
[ceph-2][DEBUG ] "nautilus"
[ceph-2][DEBUG ] ]
[ceph-2][DEBUG ] },
[ceph-2][DEBUG ] "fsid": "d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb",
[ceph-2][DEBUG ] "min_mon_release": 14,
[ceph-2][DEBUG ] "min_mon_release_name": "nautilus",
[ceph-2][DEBUG ] "modified": "2021-07-21 17:31:24.761362",
[ceph-2][DEBUG ] "mons": [
[ceph-2][DEBUG ] {
[ceph-2][DEBUG ] "addr": "10.101.235.84:6789/0",
[ceph-2][DEBUG ] "name": "ceph-1",
[ceph-2][DEBUG ] "public_addr": "10.101.235.84:6789/0",
[ceph-2][DEBUG ] "public_addrs": {
[ceph-2][DEBUG ] "addrvec": [
[ceph-2][DEBUG ] {
[ceph-2][DEBUG ] "addr": "10.101.235.84:3300",
[ceph-2][DEBUG ] "nonce": 0,
[ceph-2][DEBUG ] "type": "v2"
[ceph-2][DEBUG ] },
[ceph-2][DEBUG ] {
[ceph-2][DEBUG ] "addr": "10.101.235.84:6789",
[ceph-2][DEBUG ] "nonce": 0,
[ceph-2][DEBUG ] "type": "v1"
[ceph-2][DEBUG ] }
[ceph-2][DEBUG ] ]
[ceph-2][DEBUG ] },
[ceph-2][DEBUG ] "rank": 0
[ceph-2][DEBUG ] }
[ceph-2][DEBUG ] ]
[ceph-2][DEBUG ] },
[ceph-2][DEBUG ] "name": "ceph-2",
[ceph-2][DEBUG ] "outside_quorum": [],
[ceph-2][DEBUG ] "quorum": [],
[ceph-2][DEBUG ] "rank": -1,
[ceph-2][DEBUG ] "state": "probing",
[ceph-2][DEBUG ] "sync_provider": []
[ceph-2][DEBUG ] }
[ceph-2][DEBUG ] ********************************************************************************
[ceph-2][INFO ] monitor: mon.ceph-2 is currently at the state of probing
ceph-deploy mon add ceph-3
ceph -s
cluster:
id: d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
health: HEALTH_WARN
clock skew detected on mon.ceph-2, mon.ceph-3
services:
mon: 3 daemons, quorum ceph-1,ceph-2,ceph-3 (age 14s)
mgr: ceph-1(active, since 9m)
osd: 3 osds: 3 up (since 27m), 3 in (since 27m)
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 94 TiB / 94 TiB avail
pgs:
ceph quorum_status --format json-pretty
[root@ceph-1 ceph-cluster]# ceph quorum_status --format json-pretty
{
"election_epoch": 12,
"quorum": [
0,
1,
2
],
"quorum_names": [
"ceph-1",
"ceph-2",
"ceph-3"
],
"quorum_leader_name": "ceph-1",
"quorum_age": 63,
"monmap": {
"epoch": 3,
"fsid": "d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb",
"modified": "2021-07-21 23:12:49.336700",
"created": "2021-07-21 17:31:24.761362",
"min_mon_release": 14,
"min_mon_release_name": "nautilus",
"features": {
"persistent": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus"
],
"optional": []
},
"mons": [
{
"rank": 0,
"name": "ceph-1",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "10.101.235.84:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "10.101.235.84:6789",
"nonce": 0
}
]
},
"addr": "10.101.235.84:6789/0",
"public_addr": "10.101.235.84:6789/0"
},
{
"rank": 1,
"name": "ceph-2",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "10.101.235.217:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "10.101.235.217:6789",
"nonce": 0
}
]
},
"addr": "10.101.235.217:6789/0",
"public_addr": "10.101.235.217:6789/0"
},
{
"rank": 2,
"name": "ceph-3",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "10.101.235.38:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "10.101.235.38:6789",
"nonce": 0
}
]
},
"addr": "10.101.235.38:6789/0",
"public_addr": "10.101.235.38:6789/0"
}
]
}
}
ceph mon stat
[root@ceph-1 ceph-cluster]# ceph mon stat
e3: 3 mons at {ceph-1=[v2:10.101.235.84:3300/0,v1:10.101.235.84:6789/0],ceph-2=[v2:10.101.235.217:3300/0,v1:10.101.235.217:6789/0],ceph-3=[v2:10.101.235.38:3300/0,v1:10.101.235.38:6789/0]}, election epoch 12, leader 0 ceph-1, quorum 0,1,2 ceph-1,ceph-2,ceph-3
ceph mon dump
[root@ceph-1 ceph-cluster]# ceph mon dump
epoch 3
fsid d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
last_changed 2021-07-21 23:12:49.336700
created 2021-07-21 17:31:24.761362
min_mon_release 14 (nautilus)
0: [v2:10.101.235.84:3300/0,v1:10.101.235.84:6789/0] mon.ceph-1
1: [v2:10.101.235.217:3300/0,v1:10.101.235.217:6789/0] mon.ceph-2
2: [v2:10.101.235.38:3300/0,v1:10.101.235.38:6789/0] mon.ceph-3
dumped monmap epoch 3
Create additional mgrs:
ceph-deploy mgr create ceph-2 ceph-3
[root@ceph-1 ceph-cluster]# ceph-deploy mgr create ceph-2 ceph-3
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy mgr create ceph-2 ceph-3
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] mgr : [('ceph-2', 'ceph-2'), ('ceph-3', 'ceph-3')]
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0xef1c20>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mgr at 0xe83f50>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-2:ceph-2 ceph-3:ceph-3
[ceph-2][DEBUG ] connected to host: ceph-2
[ceph-2][DEBUG ] detect platform information from remote host
[ceph-2][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-2
[ceph-2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-2][WARNIN] mgr keyring does not exist yet, creating one
[ceph-2][DEBUG ] create a keyring file
[ceph-2][DEBUG ] create path recursively if it doesn't exist
[ceph-2][INFO ] Running command: ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-2 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-2/keyring
[ceph-2][INFO ] Running command: systemctl enable ceph-mgr@ceph-2
[ceph-2][WARNIN] Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-2.service to /usr/lib/systemd/system/ceph-mgr@.service
[ceph-2][INFO ] Running command: systemctl start ceph-mgr@ceph-2
[ceph-2][INFO ] Running command: systemctl enable ceph.target
[ceph-3][DEBUG ] connected to host: ceph-3
[ceph-3][DEBUG ] detect platform information from remote host
[ceph-3][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-3
[ceph-3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-3][WARNIN] mgr keyring does not exist yet, creating one
[ceph-3][DEBUG ] create a keyring file
[ceph-3][DEBUG ] create path recursively if it doesn't exist
[ceph-3][INFO ] Running command: ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-3 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-3/keyring
[ceph-3][INFO ] Running command: systemctl enable ceph-mgr@ceph-3
[ceph-3][WARNIN] Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-3.service to /usr/lib/systemd/system/ceph-mgr@.service
[ceph-3][INFO ] Running command: systemctl start ceph-mgr@ceph-3
[ceph-3][INFO ] Running command: systemctl enable ceph.target
ceph -s
[root@ceph-1 ceph-cluster]# ceph -s
cluster:
id: d26fd4cc-7ba1-4744-91d5-f5ccf291c5eb
health: HEALTH_WARN
clock skew detected on mon.ceph-2, mon.ceph-3
services:
mon: 3 daemons, quorum ceph-1,ceph-2,ceph-3 (age 7m)
mgr: ceph-1(active, since 16m), standbys: ceph-2, ceph-3
osd: 3 osds: 3 up (since 34m), 3 in (since 34m)
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 94 TiB / 94 TiB avail
pgs:
Dashboard
Provided in versions 12.x (Luminous) and later.
https://docs.ceph.com/en/nautilus/mgr/dashboard/
WebUI based on Angular/TypeScript
Features
https://docs.ceph.com/en/nautilus/mgr/dashboard/#feature-overview
- multiple users and roles
- SSO (single sign-on) support
- SSL/TLS support: self-signed certificates or CA-issued certificates
- auditing: command audit log
- i18n (internationalization)
Management and monitoring features:
- overall cluster health
- embedded Grafana dashboards
- cluster logs
- host management
- performance counters
- monitors
- configuration settings
yum install ceph-mgr-dashboard
It pulls in ceph-grafana-dashboards automatically; that package carries the Grafana dashboard JSON files.
for i in {1..3};do echo $i;ssh ceph-$i "yum install -y ceph-mgr-dashboard";done
Before enabling the dashboard, remember to install the dashboard package on every mgr node, otherwise enabling it fails.
Even if you ignore the error with --force, once the current mgr dies and fails over to another standby mgr node, your dashboard will be unusable.
ceph mgr module enable dashboard
SSL
ceph dashboard create-self-signed-cert
[root@ceph-1 ceph-cluster]# ceph dashboard create-self-signed-cert
Self-signed certificate created
openssl req -new -nodes -x509 -subj "/O=IT/CN=ceph-mgr-dashboard" -days 3650 -keyout dashboard.key -out dashboard.crt -extensions v3_ca
[root@ceph-1 ceph-cluster]# openssl req -new -nodes -x509 -subj "/O=IT/CN=ceph-mgr-dashboard" -days 3650 -keyout dashboard.key -out dashboard.crt -extensions v3_ca
ceph config-key set mgr mgr/dashboard/crt -i dashboard.crt
ceph config-key set mgr mgr/dashboard/key -i dashboard.key
ceph config-key set mgr/dashboard/ceph1/crt -i dashboard.crt
[root@ceph-1 ceph-cluster]# ceph config-key set mgr mgr/dashboard/crt -i dashboard.crt
set mgr
[root@ceph-1 ceph-cluster]# ceph config-key set mgr mgr/dashboard/key -i dashboard.key
set mgr
[root@ceph-1 ceph-cluster]# ceph config-key set mgr/dashboard/ceph1/crt -i dashboard.crt
WARNING: it looks like you might be trying to set a ceph-mgr module configuration key. Since Ceph 13.0.0 (Mimic), mgr module configuration is done with `config set`, and new values set using `config-key set` will be ignored.
set mgr/dashboard/ceph1/crt
ceph dashboard set-ssl-certificate -i dashboard.crt
ceph dashboard set-ssl-certificate-key -i dashboard.key
ceph dashboard set-ssl-certificate ceph-1 -i dashboard.crt
ceph dashboard set-ssl-certificate-key ceph-1 -i dashboard.key
[root@ceph-1 ceph-cluster]# ceph dashboard set-ssl-certificate -i dashboard.crt
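After replacing certificates, the dashboard module usually has to be reloaded before the new cert is served; a quick sketch (plain module disable/enable, nothing cluster-specific), after which ceph mgr services below should still report the URL:
ceph mgr module disable dashboard
ceph mgr module enable dashboard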
ceph mgr services
[root@ceph-1 ceph-cluster]# ceph mgr services
{
"dashboard": "https://ceph-1:8443/"
}
Create a user
In the Nautilus release the password has to be written to a file and passed in.
Template:
ceph dashboard ac-user-create admin -i password.txt administrator
Not sure I fully get this; presumably it keeps the password out of shell history.
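A minimal usage sketch (the password and file name are placeholders); reading from a file also keeps the password out of ps output, not just history:
echo -n 'Admin@123' > password.txt   # -n keeps a trailing newline out of the password
ceph dashboard ac-user-create admin -i password.txt administrator
rm -f password.txt                   # clean up once the user exists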
Deploy mds
ceph-deploy mds create ceph-1 ceph-2 ceph-3
[root@ceph-1 ceph-cluster]# ceph-deploy mds create ceph-1 ceph-2 ceph-3
ceph -s
[root@ceph-1 ceph-cluster]# ceph -s
All 3 MDS daemons show up:standby,
because there is no filesystem yet.
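They will stay standby until a CephFS exists. When one is needed, a rough sketch (pool names and PG counts below are my assumptions, not from this deployment):
ceph osd pool create cephfs_data 64        # data pool; size the PG count for the cluster
ceph osd pool create cephfs_metadata 32    # metadata pool
ceph fs new cephfs cephfs_metadata cephfs_data
ceph mds stat                              # one MDS should go active, the others remain standby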
Redoing the disks as single-disk RAID0
https://medium.com/@george.shuklin/how-to-remove-osd-from-ceph-cluster-b4c37cc0ec87
Removing an invalid OSD:
- ceph osd out osd.11
- If you see "osd.11 is already out", that's ok.
- ceph osd down osd.11
- Remove it: ceph osd rm osd.11. If it says 'Error EBUSY: osd.11 is still up; must be down before removal.' the OSD is not dead yet: go to the host it resides on, kill it (systemctl stop ceph-osd@11), and repeat the rm operation.
- It will now show in ceph osd tree with 'DNE' status (DNE = does not exist). To clean that up, remove it from the CRUSH map: ceph osd crush rm osd.11
- Last step: remove its authorization (this prevents problems with 'couldn't add new osd with same number'): ceph auth del osd.11
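Putting those steps together, a removal sketch for a single OSD (osd.11 as in the article; run the systemctl stop on whichever host actually owns that OSD):
ceph osd out osd.11                       # stop mapping data to it; "already out" is fine
ceph osd down osd.11
ssh ceph-2 "systemctl stop ceph-osd@11"   # only if 'ceph osd rm' says the OSD is still up; ceph-2 is just an example host
ceph osd rm osd.11
ceph osd crush rm osd.11                  # remove it from the CRUSH map so it no longer shows as DNE
ceph auth del osd.11                      # delete its key so the id can be reused cleanly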