
I'm trying to find the best solution to the following problem:

I have an entity (let's call it Collateral) which consists of several fields. The uniqueness of this entity is defined by the combination of 4 fields (let's call them: user_id (bigint), device_id (varchar(9)), key_id (varchar(2732)), application_id (varchar(255))).

The table is generated by Hibernate. I have tried both moving these 4 fields into a separate entity (CollateralEmbeddedEntity) to use as an embedded id, and creating a unique constraint on those 4 fields directly in the Collateral entity:

@Table(
    name="COLLATERAL",
    uniqueConstraints=
    @UniqueConstraint(name = "comp_key", columnNames={"device_id", "application_id", "key_id", "user_id"}))
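
For reference, the embedded-id variant I tried looked roughly like this (a sketch, not my exact code; getters/setters omitted):

    import java.io.Serializable;
    import javax.persistence.*;

    @Embeddable
    public class CollateralEmbeddedEntity implements Serializable {
        @Column(name = "user_id")
        private long userId;
        @Column(name = "device_id", length = 9)
        private String deviceId;
        @Column(name = "key_id", length = 2732)
        private String keyId;
        @Column(name = "application_id", length = 255)
        private String applicationId;
        // equals() and hashCode() over all 4 fields are required for an id class
    }

    // (in a separate file)
    @Entity
    @Table(name = "COLLATERAL")
    public class Collateral {
        @EmbeddedId
        private CollateralEmbeddedEntity id;
        // remaining fields...
    }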

The problem is that in both cases the combined fields exceed the maximum allowed length of a MariaDB key:

java.sql.SQLException: Specified key was too long; max key length is 3072 bytes

Changing the database charset/collation, or shrinking the varchar ranges of the fields themselves, is not an option.

What I have thought about is generating and storing a hash of those 4 fields and putting the unique constraint on that hash (search and update will always be based on all 4 fields together anyway). However, I'm not sure whether such a solution is appropriate, since it introduces redundant information and thereby violates database normalization. Roughly what I mean is sketched below.
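
A minimal sketch, assuming SHA-256 and a url-safe base64 encoding of the digest (both just illustrative choices, not settled decisions):

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.Base64;

    public final class CollateralHash {

        // Collapse the 4 identifying fields into one fixed-length value.
        // The NUL separator prevents ambiguous concatenations
        // (e.g. "ab" + "c" vs "a" + "bc").
        public static String hashOf(long userId, String deviceId,
                                    String keyId, String applicationId) {
            try {
                MessageDigest md = MessageDigest.getInstance("SHA-256");
                String joined = userId + "\u0000" + deviceId + "\u0000"
                        + keyId + "\u0000" + applicationId;
                byte[] digest = md.digest(joined.getBytes(StandardCharsets.UTF_8));
                // 43 characters of url-safe base64 -- far below the key limit
                return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException("SHA-256 not available", e);
            }
        }
    }

The unique constraint would then sit on this single short column instead of the 4 original ones.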

Is the hash solution actually a good one? If not, what are better alternatives for this problem?

  • What is `key_id`? Why is it so long? Would it make sense to 'normalize' it and use a surrogate `INT` in its place? Is the question really about adding the `UNIQUEness` constraint? Or about having what seems to be the optimal `INDEX`? Let's see the `SELECT` that benefits from it. – Rick James Jun 24 '19 at 15:07
  • Let's say that key_id is a base64UrlEncoded certificate key. Therefore the only reasonable option is to store it as a String in Java, which means varchar on the DB side. Uniqueness is the most important part, because only those four fields combined can determine whether it's a new entry or not. Therefore every operation (select, delete, update) will be based on this quadruple. That is why I'm asking for the "best" solution. – Przemysław Gęsieniec Jun 24 '19 at 16:33
  • So, the cert key is up to 2049 bytes? – Rick James Jun 24 '19 at 16:35
  • No, it is exactly up to 2732. The certificate can be up to 2049 bytes, therefore the base64 of it can reach 4/3 * 2049 = 2732. See https://stackoverflow.com/q/13378815/6834044 – Przemysław Gęsieniec Jun 24 '19 at 18:54

1 Answer


Normalize the certificate key:

CREATE TABLE CertKeys (
    cert_id INT UNSIGNED AUTO_INCREMENT,
    -- base64 is pure ASCII; a single-byte charset keeps the
    -- unique index under the 3072-byte limit
    cert_key VARCHAR(2732) CHARACTER SET ascii NOT NULL,
    -- or:  cert_key VARBINARY(2049) NOT NULL,   -- binary
    PRIMARY KEY (cert_id),
    UNIQUE (cert_key)
) ENGINE=InnoDB;

Then use cert_id in the other table and in the composite INDEX you are talking about.

It takes an extra step to insert the cert_key into the new table and get the cert_id; this is done before inserting into the main table.
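
In JPA terms, that extra step might look roughly like this (a sketch; the CertKey entity and its constructor are assumptions, mirroring the CertKeys table above):

    import java.util.List;
    import javax.persistence.EntityManager;

    public class CertKeyDao {

        // Look up an existing cert_key row, or insert a new one and let the
        // database generate cert_id. Under concurrent inserts the
        // UNIQUE(cert_key) constraint can still fire, so callers should be
        // prepared to catch the violation and re-query.
        public CertKey getOrCreate(EntityManager em, String certKey) {
            List<CertKey> found = em
                    .createQuery("SELECT c FROM CertKey c WHERE c.certKey = :k",
                                 CertKey.class)
                    .setParameter("k", certKey)
                    .getResultList();
            if (!found.isEmpty()) {
                return found.get(0);
            }
            CertKey fresh = new CertKey(certKey);
            em.persist(fresh);   // cert_id is populated on flush/commit
            return fresh;
        }
    }

Collateral then stores only the small cert_id, and the composite unique index becomes (user_id, device_id, cert_id, application_id), which fits comfortably under the 3072-byte limit.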

It's less critical, but you might also consider normalizing application_id.

(Yes, a different technique could be devised using a hash, but I think this is cleaner.)

Rick James
  • Thanks for another solution. However, isn't it more efficient to generate a hash on the server side based on these 4 fields and use it for DB operations, rather than always searching by the composite key? I don't know the performance numbers, but it feels intuitive that a search by one field should be faster than by 4 (whereas the cost of generating the hash is negligible for server performance). – Przemysław Gęsieniec Jun 25 '19 at 05:16
  • 1
    @PrzemysławGęsieniec - It depends on your access patterns and table size. Hashes (and UUIDs, etc) are very random, so caching tends not to be useful for large tables. But, for a modestly sized table, sure, use a hash. – Rick James Jun 25 '19 at 17:35