My project https://simbiat.ru/fftracker has a table, that is a junction table containing IDs of achievements earned by each character ID. It works fine, but my concerns is its size: before optimization it was 6.1 GBs and after it - 4 GBs. And it's not the size of the index
Data 3.1 GiB
Index 901.5 MiB
Overhead 2.5 MiB
Effective 4.0 GiB
Total 4.0 GiB
I do not see any performance impact of it now, but I am concerned about the size, especially since the table is updated frequently and grows larger and larger. Even though it has just 3 columns (definition generated by phpMyAdmin):
CREATE TABLE `ff__character_achievement` (
`characterid` int(10) UNSIGNED NOT NULL COMMENT 'Character ID taken from Lodestone URL (https://eu.finalfantasyxiv.com/lodestone/character/characterid/)',
`achievementid` smallint(5) UNSIGNED NOT NULL COMMENT 'Achievement ID taken from Lodestone (https://eu.finalfantasyxiv.com/lodestone/character/characterid/achievement/detail/achievementid/)',
`time` date NOT NULL COMMENT 'Date when achievement was received according to Lodestone'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED;
ALTER TABLE `ff__character_achievement`
ADD PRIMARY KEY (`characterid`,`achievementid`) USING BTREE,
ADD KEY `ach` (`achievementid`) USING BTREE;
ALTER TABLE `ff__character_achievement`
ADD CONSTRAINT `char_ach_ach` FOREIGN KEY (`achievementid`) REFERENCES `ff__achievement` (`achievementid`) ON DELETE CASCADE ON UPDATE CASCADE,
ADD CONSTRAINT `char_ach_char` FOREIGN KEY (`characterid`) REFERENCES `ff__character` (`characterid`) ON DELETE CASCADE ON UPDATE CASCADE;
COMMIT;
Considering that original tables are 408KBs and 707MBs and fill with lots of other data (varchars and text even), I would expect the junction table to be 2GBs with indexes at most. Yes, I can remove the date column, but it does not help much: 2.8 instead of 3.1GBs. Which seems weird to me, but that's the values reported by MariaDB.
The purpose of the table is to dynamically determine the rarest achievements as seen on https://simbiat.ru/fftracker/statistics/achievements/ and also generate random list of characters that have the achievements as seen on https://simbiat.ru/fftracker/achievement/1/ (as example). I think that for this purpose 4GBs for a table is too excessive.
So my question is, whether there may be some kind of way to store the same data and easily utilize it for the same purpose, but have smaller footprint?
From what I know it may be possible to store achievement IDs for each character as a JSON string, but I will definitely lose CONSTRAINT (which I prefer to retain) and I do not see an easy to use way to search through respective JSONs, unless doing a WHERE or FULLTEXT search, which does not seem to be appropriate for this.
Any suggestion will be appreciated.