
We have a Proxmox cluster with 3 nodes. Each node has 4 SSDs and 12 HDDs. My plan is to create 2 CRUSH rules (one for the SSD devices and another one for the HDD devices). With these 2 rules I will create 2 pools: one SSD pool and one HDD pool.

But in the Ceph documentation I found this: https://docs.ceph.com/en/latest/rados/operations/crush-map/#custom-crush-rules. I am trying to understand this rule. Would this rule be more useful for my hardware? Can somebody explain, in simple words, what this rule is doing?

Thank you so much.

Osti

1 Answer


The easiest way to use SSDs or HDDs in your CRUSH rules would be rules like these, assuming you're using replicated pools:

rule rule_ssd {
        id 1
        type replicated
        min_size 1
        max_size 10
        step take default class ssd
        step chooseleaf firstn 0 type host
        step emit
}
rule rule_hdd {
        id 2
        type replicated
        min_size 1
        max_size 10
        step take default class hdd
        step chooseleaf firstn 0 type host
        step emit
}

These rules select the desired device class (ssd or hdd) and then choose hosts within that selection; depending on your pool size it will choose that many hosts (don't use size=2 except for testing purposes). So in this case the failure domain is "host". The rule you refer to in the docs has its purpose in its name: "mixed_replicated_rule". It spreads the replicas across different device classes (by the way, the autoscaler doesn't work well with mixed device classes). I wouldn't really recommend it unless you have a good reason to. Stick to the simple rulesets and just use device classes, which are usually detected automatically when the drives are added.
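If you'd rather not edit the CRUSH map by hand, equivalent rules can also be created from the CLI. A rough sketch (the pool names and PG counts below are just placeholders, adjust them to your cluster):

# create one replicated rule per device class: name, root, failure domain, class
ceph osd crush rule create-replicated rule_ssd default host ssd
ceph osd crush rule create-replicated rule_hdd default host hdd

# create a pool on top of each rule (names and PG numbers are examples only)
ceph osd pool create pool_ssd 128 128 replicated rule_ssd
ceph osd pool create pool_hdd 512 512 replicated rule_hdd

# or attach a rule to an already existing pool
ceph osd pool set pool_hdd crush_rule rule_hdd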

eblock
  • OK, but in this case the SSDs are not working like a cache. With this method I have two rules, and I can create Ceph pools with either the SSD rule or the HDD rule. But what I am looking for is a solution where the SSDs act as a cache and the data is stored on the HDDs. I think Ceph cache tiering will do this, but if I understand correctly it is not supported by Proxmox? Thank you. – Osti Oct 17 '22 at 12:10
  • Your question does not mention anything about a cache, only how to create two pools for SSD and HDD. Please clarify your question then. I'm not familiar with Proxmox, but in general the Ceph developers are trying to get rid of [cache tiering](https://docs.ceph.com/en/latest/rados/operations/cache-tiering/), although it works quite well (for us). We have been using it for years now as an RBD cache. Depending on the workload you might not even need a cache tier. Instead you could use the SSDs to separate the RocksDB from the main data device (HDD) and improve performance. – eblock Oct 17 '22 at 12:16
  • Oh sorry, that's why I posted the link. The documentation says: „…mixing SSDs and HDDs in the same replicated pool…". So I was asking how to set up one rule for SSD and HDD (two device classes inside the same rule – would this be the same as cache tiering?). – Osti Oct 17 '22 at 17:39
  • No, it is not the same. A cache tier is a different pool (SSD only, with its own ruleset) that is used as an "overlay" for an HDD pool (another ruleset); a rough sketch of the commands is below the comments. That's not a mix, that's two separate pools. The mixed ruleset you refer to from the docs would allow some chunks to be stored on SSD and some on HDD, which can result in bad performance and is also not a good idea from an autoscaler point of view. – eblock Oct 18 '22 at 06:12
  • Ok, so for my hardware you would recommend the two rules like you described in your first comment (so no cache tiering). I was thinking that there might be another solution with an SSD/HDD mix, like Microsoft uses with their S2D. – Osti Oct 18 '22 at 15:38
  • As I already mentioned, depending on the workload you expect, you could use the SSDs to separate the RocksDB from the main data device (HDD); see the sketch below the comments. That speeds up some things. You should test different scenarios to see if your performance requirements are met and get familiar with Ceph and how it works. – eblock Oct 18 '22 at 16:02
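For completeness, a rough sketch of the cache-tiering setup discussed in the comments, assuming an existing HDD-backed pool pool_hdd and an SSD-backed pool cache_ssd (both names are placeholders, and keep in mind the Ceph developers discourage new cache-tier deployments):

# attach the SSD pool as a writeback cache tier in front of the HDD pool
ceph osd tier add pool_hdd cache_ssd
ceph osd tier cache-mode cache_ssd writeback
ceph osd tier set-overlay pool_hdd cache_ssd

# the cache tier needs a hit set and a size target so it knows when to flush/evict
ceph osd pool set cache_ssd hit_set_type bloom
ceph osd pool set cache_ssd target_max_bytes 1000000000000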
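And a minimal sketch of the RocksDB/WAL separation eblock recommends, assuming /dev/sdb is one of the HDDs and /dev/nvme0n1p1 is a partition (or LV) on an SSD; the device names are placeholders:

# create an OSD with the data on the HDD and its RocksDB/WAL (block.db) on the SSD
ceph-volume lvm create --data /dev/sdb --block.db /dev/nvme0n1p1

# on Proxmox, I believe the equivalent wrapper is (unverified):
# pveceph osd create /dev/sdb --db_dev /dev/nvme0n1p1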