I'm importing a CSV into a database that contains multiple entities: City
, State
, and Country
.
I'd like to have a hierarchy of references using foreign keys across the tables:
City.State_Ref
->State.ID
State.Country_Ref
->Country.ID
So the City
table will have the following schema in the end:
CREATE TABLE "City" (
"ID" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
"Name" INTEGER NOT NULL,
"State_Ref" INTEGER,
FOREIGN KEY("State_Ref") REFERENCES "State"("ID")
)
Setting up the State
and Country
tables is easy enough, but when there is the same State
name that occurs in different Country
records, I'm unable to pick the correct one:
UPDATE City SET State_Ref = (SELECT ID from State WHERE City.State = State.Name);
This will match multiple records and LIMIT
just grabs the first one (which isn't always right).
I have also tried JOIN
to create a new table with the correct references:
SELECT City.Name as Name, State.ID as State_Ref
FROM City
INNER JOIN State ON City.State = State.Name
INNER JOIN Country ON City.Country = Country.Name
ORDER BY City.Name ASC;
But this results in multiple records for all of the State
duplicates.
Sample data (the problem happens with "Wonderland" since "Colorado" exists in multiple countries):
Country State Name
United States New York Kingston
United States Texas Paris
England Kent Kingston
United States Maine Paris
France Île-de-France Paris
United States Colorado Denver
Canada Colorado Wonderland
Any pointers for setting State_Ref
properly would be much appreciated!