How to achieve 3rd normal form with missing information

Question

I am developing a table that will store objects that have a location. The objects can have a specified state/region (optional), and they always have a specified country. So, the simplified starting point is:

Now, there is a functional dependency between state (uniquely identifiable with ISO code) and country, the state determines the country. This would suggest that I create a new table: State, that has a Country_id that references another table Country. The problem is that, in the data that is to be put in the database, the "State" value can be missing and therefore be NULL, so I have nothing to put into an intermediate table.

Does the presence of NULL (resulting from missing information) in the dependency between state-country mean that the 3rd normal form is impossible to achieve here and I need to reference State and Country tables separately from the object table? Or what is the best design solution for a case like this?

EDIT: This really is a question about a practical implementation of the problem here Functional dependencies in case of nulls, about which design solution would be suitable where a possible transitive functional dependency between id-state-country can contain a null value for state.

What is your 1 question? What is stopping you from determining the NF of a design or normalizing to a given NF? Where are you stuck? Otherwise this is just asking us to rewrite a textbook or Q&A. — philipxy, Sep 08 '20 at 19:19
Why can't 2 states in separate countries have the same name?--Why do you think state determines country? Even if 2 states can't have the same name, if two ids can have the same country but null state, state doesn't determine country, so why do you say it does? If you don't want to use nulls & you "have nothing to put into that table" then what is stopping you from having another table? You don't seem to be using terms correctly or following a design method. — philipxy, Sep 08 '20 at 19:46
Does this answer your question? [What to do with null values when modeling and normalizing?](https://stackoverflow.com/questions/40733477/what-to-do-with-null-values-when-modeling-and-normalizing) — philipxy, Sep 08 '20 at 19:47
A state can be identified by an ISO code, which make them uniquely determine a country, so that they can have the same names is not a problem (as long no states span more than one country...). — eli6, Sep 09 '20 at 11:55
I realized that I am having trouble dealing with the presence of null values in (possible) functional dependencies between State-Country in my example (since I need to eliminate transitive functional dependencies for the 3rd normal form), which I saw was explicitely adressed here in a more theoretical way : https://stackoverflow.com/questions/46956259/functional-dependencies-in-case-of-nulls Your link answers a more general question but was also useful for my understanding. — eli6, Sep 09 '20 at 11:57
Please [use text, not images/links, for text--including tables & ERDs.](https://meta.stackoverflow.com/q/285551/3404097). Use images only for what cannot be expressed as text or to augment text. — philipxy, Sep 09 '20 at 14:31
Your question is still not clear. Please edit to address my original comments. Where exactly are you stuck following a published textbook working on what exact design trying to do exactly what? In the illustrated table with nulls (a) saying a FD holds means you already are calling null a value & what's the problem, except (c) 2 rows with null state can have different countries so that FD doesn't hold. (Re "would suggest": Normalization doesn't involve introducing ids.) Null is a value, follow a textbook. PS "as long no states span more than one country" ISO codes are unique identifiers. — philipxy, Sep 09 '20 at 18:06

Gilbert Le Blanc · Answer 1 · 2020-09-08T09:12:17.873

I'm assuming your Object table contains other information besides state and country,

You could design the Object table like this:

Object
------
Object ID
State FK
Country FK

The State FK would be nullable. No value means no state.

You would have separate State and Country tables. Another name for these kinds of tables are value tables because a SELECT WHERE ID = key returns a value.

Here are the Country and State tables. It's up to your application to ensure that the state or region is located in the country.

Country
-------
Country ID
Country Name

State
-----
State ID
Country FK
State Name

score 1 · Answer 2 · answered Sep 08 '20 at 11:26

1

This is a rather tricky question. In practice it is often solved by have a single geographic table with the different hierarchies present in each row.

However, that is not your question. Your question is about an optional intermediate dimension. This does pose a problem for 3rd normal form. I think the canonical solution is to introduce fake "state" rows, one per country that might have cities with no states.

That is, the hierarchy is fixed as city --> state --> country. But in some cases, the "state" is really synonymous with the country.

answered Sep 08 '20 at 11:26

Gordon Linoff

1,242,037
58
646
786

Thanks. Do you mean that, in practice, most database designers would ignore the intermediate dimension and simply create a table Geography with both columns for state and country in it? – eli6 Sep 09 '20 at 12:42
@eli6 . . . In practice, this is often used -- a single detailed geography record with foreign key links to appropriate reference tables. Geography can be complicated. For instance, where I am sitting now, the city is comprised of 5 counties. Normally, counties are bigger than cities in the US. But there are other solutions as well. – Gordon Linoff Sep 09 '20 at 12:45

How to achieve 3rd normal form with missing information

2 Answers2