
Say I have a DB with one table listing all citizens of the U.S., around 400 million rows. Each row has around 10 columns, and one of them is Age.

Now, if I want to select all citizens aged 20-30, I will need to go through the whole list to the end. Is there any other way around it?

I read about DB normalization, and as far as I understand it, I won't gain any efficiency by separating the age out into a separate table, etc.

So is there really nothing to do in cases like this except scan from top to bottom each time?

  • Possible duplicate of [MySQL indexes - what are the best practices?](https://stackoverflow.com/questions/3049283/mysql-indexes-what-are-the-best-practices) – manveti Feb 07 '19 at 20:33
  • Don't store "age", store "date_of_birth". To find people ages 20-30 (as of today), we can find rows that satisfy the condition: `t.date_of_birth >= DATE(NOW()) + INTERVAL -30 YEAR AND t.date_of_birth < DATE(NOW()) + INTERVAL -20 YEAR`. (We might need to add or subtract a year; the design goal is to generate an earliest birth date and a latest birth date, and get all rows in that range.) If this is the only predicate in the query, then we probably want to have available an index with `date_of_birth` as the leading column. – spencer7593 Feb 07 '19 at 20:38
  • Thanks for the reading material, I'm checking it. @spencer7593, but that's the whole point: I've read in normalization that if the data is attached (dependent) you should not separate it. So the Age (or date of birth) is indeed dependent on the citizen... and this is what confuses me in this particular situation – mikemiller Feb 07 '19 at 20:42
  • `date_of_birth` is a single-valued attribute of the `citizen` entity, so it should be implemented as a column on the `citizen` table. Done. (We can implement child tables for multi-valued attributes; we don't do that for single-valued attributes.) Since "age" is a derived attribute, for the use case described in the question, do NOT store the age attribute, i.e. do not create an `age` column. The primary way that indexes improve query performance is as a DBMS mechanism to eliminate vast swaths of rows from being examined. – spencer7593 Feb 07 '19 at 20:48
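
The advice in the comments above can be tried out in a small sketch. It uses Python's built-in `sqlite3` module rather than MySQL, purely so that it runs anywhere; the query planner output and date functions differ between the two, so treat this as an illustration of the idea and not as MySQL behavior. The table and column names (`citizen`, `date_of_birth`) come from the comments; the index name `idx_citizen_dob`, the helper functions, and the sample rows are made up for the example. Dates are stored as ISO `YYYY-MM-DD` strings, which sort in chronological order.

```python
import sqlite3
from datetime import date


def years_before(day, years):
    """Same calendar day `years` earlier; Feb 29 falls back to Feb 28."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def birth_date_bounds(today, min_age, max_age):
    """Birth dates of people aged min_age..max_age satisfy lower < dob <= upper."""
    lower = years_before(today, max_age + 1)  # born on this day: already max_age + 1
    upper = years_before(today, min_age)      # born on this day: turns min_age today
    return lower.isoformat(), upper.isoformat()


def query_plan(conn, sql, params):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE citizen (id INTEGER PRIMARY KEY, name TEXT, date_of_birth TEXT)"
)
conn.executemany(
    "INSERT INTO citizen (name, date_of_birth) VALUES (?, ?)",
    [("a", "1950-06-01"), ("b", "1995-03-15"), ("c", "2001-11-30"), ("d", "2015-01-01")],
)

sql = "SELECT name FROM citizen WHERE date_of_birth > ? AND date_of_birth <= ?"
bounds = birth_date_bounds(date(2019, 2, 7), 20, 30)

# Without an index, the only option is to examine every row.
plan_without_index = query_plan(conn, sql, bounds)

# With an index led by date_of_birth, the engine can jump to the range.
conn.execute("CREATE INDEX idx_citizen_dob ON citizen (date_of_birth)")
plan_with_index = query_plan(conn, sql, bounds)

matches = [row[0] for row in conn.execute(sql, bounds)]

print(plan_without_index)
print(plan_with_index)
print(matches)
```

The bounds here are one way of settling the "add or subtract a year" caveat: a person is still 30 until the day they turn 31, so the lower bound goes back `max_age + 1` years and is exclusive. On a table with hundreds of millions of rows the same principle applies: the index lets the engine skip every row outside the birth-date range instead of reading the table top to bottom.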
