I can't seem to find any examples of anyone doing this on the web, so am wondering if maybe there's a reason for that (or maybe I haven't used the right search terms). There might even already be a term for this that I'm unaware of?
To save on database storage space for regularly reoccurring strings, I'm thinking of creating a MySQL table called unique_string. It would only have two columns:
- "id" : INT : PRIMARY_KEY index
- "string" : varchar(255) : UNIQUE index
Any other tables anywhere in the database can then use INT columns instead of VARCHAR columns. For example a varchar field called browser would instead be an INT field called browser_unique_string_id.
I would not use this for anything where performance matters. In this case I'm using it to track details of every single page request (logging web stats) and an "audit trial" of user actions on intranets, but other things potentially too.
I'm also aware the SELECT queries would be complex, so I'm not worried about that. I'll most likely write some code to generate the queries to return the "real" string data.
Thoughts? I feel like I might be overlooking something obvious here.
Thanks!