Database normalization for german cities, regions, districts etc

Question

+-------------+----------------------------+-------------------+-----------+-----------+-------------------+
| postal_code | city_district              | city              | district  | region    | state             |
+-------------+----------------------------+-------------------+-----------+-----------+-------------------+
| 60313       | Innenstadt                 | Frankfurt am Main | NULL      | Darmstadt | Hessen            |
| 10719       | Charlottenburg-Wilmersdorf | Berlin            | NULL      | NULL      | Berlin            |
| 73773       | NULL                       | Aichwald          | Esslingen | Stuttgart | Baden-Württemberg |
+-------------+----------------------------+-------------------+-----------+-----------+-------------------+

Sometimes city_district, district, and region can be NULL. Small cities have no city districts. A city (Frankfurt am Main) can include/be a district. A city (Berlin) can include/be a district and a region.

The tables are empty and will be populated from validated user input. I created for each columns its own table and joined them, like this:

postalcode: city_id
city_district: city_id
city: district_id
district: region_id
region: state_id
state: name

SELECT 
    *
FROM
postalcode
    LEFT JOIN
city ON postalcode.city_id = city.id
    LEFT JOIN
district ON city.district_id = district.id
    LEFT JOIN
region ON district.region_id = region.id
    LEFT JOIN
state ON region.state_id = state.id;

Is that so good? Over normalization? How can I join the tables with NULL values? I want to get the above table with Joins. If I want to join "citydistrict" after "postalcode" I get an error message "Not unique table/alias". How can I alias the tables correctly?

Postal code: https://en.wikipedia.org/wiki/Postal_codes_in_Germany
City district: https://en.wikipedia.org/wiki/Stadtbezirk
District: https://en.wikipedia.org/wiki/Districts_of_Germany
Region: https://en.wikipedia.org/wiki/Regierungsbezirk
State: https://en.wikipedia.org/wiki/States_of_Germany

Hi. The word "normalize" is used to mean two different things: putting into 1NF, which itself has no fixed meaning, and decomposition to higher NFs per FDs & JDs. Neither involves replacing values by ids. So why are you using that word? What reference are you following, what do you think that word means & what do you think you are trying to accomplish? Also, there are no nulls in the relational model, so *you* have to tell *us* what your reference says "normalization" is with nulls. Your decomposition/recomposition is intuitively reasonable but *why* exactly are you doing it that way? — philipxy, Dec 11 '17 at 07:28
PS So what does *any google search whatsoever* tell you about using SQL table aliases? — philipxy, Dec 11 '17 at 07:31
Re your error message & comment: Please ask one question per post. Please edit clarifications into posts, not comments. Please read & act on [mcve]. Read about SQL `select distinct`, [how SQL `cross`/`inner` `join` works](https://stackoverflow.com/a/25957600/3404097) and `natural join`. — philipxy, Dec 11 '17 at 18:32

score 0 · Answer 1 · answered Dec 11 '17 at 20:15

There are about 8K postal codes in Germany? With no normalization, the table might be about 1-2MB -- a quite small amount. Depending on what indexes you want, it may get up to 3MB.

Normalizing at every level is over-normalizing.

Is every postal_code in a city? If so, then a compromise would be to have two tables:

postal_code, city_district, city_id
city_id, city, district, region, state

Doing so might cut the disk footprint in half.

Database normalization for german cities, regions, districts etc

1 Answers1