For an ownCloud (or Nextcloud) project we need to add a large amount of storage. I've been checking options such as Ceph, OpenStack Swift/Cinder, GlusterFS, SDFS and Tahoe-LAFS.
With this service we expect users to upload many identical files, which is why deduplication is quite important for us. So far the only solutions I have found that deduplicate clustered storage data are SDFS and Tahoe-LAFS. However, our concern is that these two are written in Java and Python respectively and will hurt CPU too much. (*Yes, deduplication will likely mean more RAM and CPU usage as well.)
Perhaps one of you has a better solution? *A deduplicating filesystem (e.g. ZFS) will not work, as the data is stored on multiple machines (HA cluster).