The answer from metdos helps with the underlying problem (slow data migrations), but it looks like you still want a definitive answer to the original question of "Does Citus expose the hash function it uses?"
The answer to this question is "No, not directly, but it does expose the cached information about each distributed table and you can use that to discover the hash function, which you'd just need to call". What follows is a sketch of how to do that…
The function DistributedTableCacheEntry takes a table's identifier as its input and returns a struct populated with the hash function which would be used for that table.
It's a public function, and exposed by the headers installed by Citus, so you should be able to link against it to write a C-level PostgreSQL function to hash a partition value given the table it belongs in. See FastShardPruning for how to use it.
The signature would probably look like: CREATE FUNCTION citus_hash(distrel regclass, partitionval anyelement) RETURNS integer. Pseudocode:
- Call DistributedTableCacheEntry with distrel as argument
- Ensure the table is hash-partitioned
- Get the hash function from the cache entry
- Ensure partitionval is of the expected type
- Call the hash function on partitionval and return the result
See PostgreSQL's own documentation to learn about writing such a function.
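To make the control flow of the pseudocode concrete, here is a minimal, self-contained C sketch. It does not use the real Citus or PostgreSQL headers: CacheEntry, lookup_cache_entry, PartitionValue, placeholder_hash and the status codes are all hypothetical stand-ins invented for illustration, and the placeholder hash is not the hash PostgreSQL or Citus would compute. In a real extension the lookup would be the call to DistributedTableCacheEntry and the failures would most likely be reported with ereport rather than return codes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real Citus/PostgreSQL types. */
typedef uint32_t TableId;
typedef enum { VALUE_INT32, VALUE_INT64 } ValueType;
typedef enum { PARTITION_HASH, PARTITION_RANGE } PartitionMethod;
typedef int32_t (*HashFn)(int64_t value);

typedef struct {
    ValueType type;
    int64_t value;
} PartitionValue;

typedef struct {
    TableId tableId;
    PartitionMethod method;
    ValueType keyType;    /* type of the partition column */
    HashFn hashFunction;  /* NULL unless the table is hash-partitioned */
} CacheEntry;

typedef enum {
    HASH_OK,
    ERR_NOT_DISTRIBUTED,
    ERR_NOT_HASH_PARTITIONED,
    ERR_TYPE_MISMATCH
} HashStatus;

/* Placeholder only: NOT the hash function PostgreSQL or Citus uses. */
static int32_t placeholder_hash(int64_t value)
{
    uint32_t h = (uint32_t) value * 2654435761u;
    return (int32_t) (h ^ (h >> 16));
}

/* Stand-in for the metadata cache: table 1 is hash-partitioned, table 2 is not. */
static const CacheEntry cache[] = {
    { 1, PARTITION_HASH, VALUE_INT32, placeholder_hash },
    { 2, PARTITION_RANGE, VALUE_INT32, NULL },
};

/* Plays the role of DistributedTableCacheEntry in this sketch. */
static const CacheEntry *lookup_cache_entry(TableId distrel)
{
    for (size_t i = 0; i < sizeof(cache) / sizeof(cache[0]); i++) {
        if (cache[i].tableId == distrel)
            return &cache[i];
    }
    return NULL;
}

HashStatus citus_hash_sketch(TableId distrel, PartitionValue partitionval,
                             int32_t *result)
{
    /* Step 1: fetch the cache entry for the table. */
    const CacheEntry *entry = lookup_cache_entry(distrel);
    if (entry == NULL)
        return ERR_NOT_DISTRIBUTED;

    /* Step 2: ensure the table is hash-partitioned. */
    if (entry->method != PARTITION_HASH || entry->hashFunction == NULL)
        return ERR_NOT_HASH_PARTITIONED;

    /* Step 3: take the hash function from the cache entry. */
    HashFn hash = entry->hashFunction;

    /* Step 4: ensure the value has the partition column's type. */
    if (partitionval.type != entry->keyType)
        return ERR_TYPE_MISMATCH;

    /* Step 5: hash the value and hand the result back. */
    *result = hash(partitionval.value);
    return HASH_OK;
}
```

The type check comes before the call because a hash function chosen for one column type is not meaningful for a value of another type. The real version would face the same issue with an anyelement argument.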