I want to replace the 1 TB drives with 2 TB drives in my small Ceph cluster.
The pool is configured with 3x replication.
I added a new drive, marked 2 OSDs (osd.0 and osd.1) out, ran ceph osd reweight-by-utilization (which kicked off some rebalancing), and then marked the same 2 OSDs down.
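For reference, the commands were roughly the following (I am writing them from memory, so the exact form may differ slightly):

ceph osd out osd.0 osd.1
ceph osd reweight-by-utilization
ceph osd down osd.0 osd.1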
But in Kubernetes they change their status from down back to up after about 5 minutes and start receiving data again.
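I assume the cluster is managed by Rook (the host names in the osd tree below look like pod name hashes). If that is right, I guess I need to stop the OSD pods myself before the cluster will keep them down, something along these lines (the namespace and deployment names are the Rook defaults, which I am assuming here):

# scale the operator down first so it does not immediately reconcile the OSDs back up
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
kubectl -n rook-ceph scale deployment rook-ceph-osd-0 --replicas=0
kubectl -n rook-ceph scale deployment rook-ceph-osd-1 --replicas=0

Is that the intended way to keep them down, or is there a cleaner approach?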
I have a few questions and would appreciate some help and advice:
- Can I delete osd.0 and osd.1 without losing data? (My planned removal commands are sketched below, after this list.)
- Why do these 2 OSDs still hold 313 PGs after I marked them out? Is that expected? My understanding was that once I take OSDs out, their PG count should drop to 0.
- Because I took these 2 OSDs out, exactly 33.34% of my objects now have a degraded status (ceph -s below shows 33.333% misplaced).
- If I delete these 2 OSDs, will the cluster simply remap again and everything will be fine?
- Why is there so much data on osd.0, the very first OSD, and why is it not being rebalanced to the other disks at all? Originally there were 3 OSDs of 1 TB each, and now we are replacing them all with 2 TB drives.
- Why is there such a big difference in balance between the OSDs: between the first and the second, and between the first two and the rest?
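For the removal itself, this is the sequence I was planning to run once the data has moved off these two OSDs. I put it together from the docs, so the exact steps are my assumption; please correct me if something is wrong or missing:

# confirm the cluster no longer needs these OSDs for data safety
ceph osd safe-to-destroy osd.0 osd.1
# stop the OSD daemons (in Rook: scale the rook-ceph-osd-0/1 deployments to 0),
# then remove each OSD from the CRUSH map, auth entries and OSD map in one step
ceph osd purge 0 --yes-i-really-mean-it
ceph osd purge 1 --yes-i-really-mean-it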
Ceph Status:
ceph -s
  cluster:
    id:     995ea7a6-9287-4e97-862e-64cf4e21213f
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum c,e,f (age 4d)
    mgr: b(active, since 4d), standbys: a
    mds: 1/1 daemons up, 1 hot standby
    osd: 6 osds: 6 up (since 2d), 4 in (since 2d); 313 remapped pgs
    rgw: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   13 pools, 313 pgs
    objects: 255.11k objects, 976 GiB
    usage:   1.9 TiB used, 5.9 TiB / 7.8 TiB avail
    pgs:     255112/765336 objects misplaced (33.333%)
             313 active+clean+remapped

  io:
    client: 2.7 KiB/s rd, 101 KiB/s wr, 3 op/s rd, 9 op/s wr
ceph balancer status
{
    "active": true,
    "last_optimize_duration": "0:00:00.001511",
    "last_optimize_started": "Tue Aug 15 12:23:18 2023",
    "mode": "upmap",
    "optimize_result": "Unable to find further optimization, or pool(s) pg_num is decreasing, or distribution is already perfect",
    "plans": []
}
ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME                        STATUS  REWEIGHT  PRI-AFF
 -1         9.58896  root default
 -5         9.58896      region n1
 -4         9.58896          zone n1-d3
 -3         0.79999              host 6d855885b8-z8bj2
  0    ssd  0.79999                  osd.0                up         0  1.00000
-11         3.90619              host 7b5fb4c8b8-cc9kp
  2    ssd  1.95309                  osd.2                up   1.00000  1.00000
  3    ssd  1.95309                  osd.3                up   1.00000  1.00000
 -9         4.88278              host 7b5fb4c8b8-sqx5g
  1    ssd  0.97659                  osd.1                up         0  1.00000
  4    ssd  1.95309                  osd.4                up   1.00000  1.00000
  5    ssd  1.95309                  osd.5                up   1.00000  1.00000
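If it helps with the balance questions, I can also post per-OSD utilization and the PG counts for the two out OSDs; these are the commands I would run for that (I have not included the output here, and the ls-by-osd count is approximate since the output includes a header line):

# actual per-OSD usage and variance (the tree above only shows CRUSH weights)
ceph osd df tree
# roughly count the PGs still mapped to the out OSDs
ceph pg ls-by-osd osd.0 | wc -l
ceph pg ls-by-osd osd.1 | wc -l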
Thanks!