2

I've got a CSV with some data that looks like this:

A0A0A0,48.5674500000,-54.8432250000,Gander,NL
A0A1A0,47.0073470000,-52.9589210000,Aquaforte,NL
A0A1B0,47.3622800000,-53.2939930000,Avondale,NL

But my database is normalized such that Cities and Provinces are in separate tables, each with their own ID column.

So what's the easiest way to import this file into 3 separate tables and link the foreign keys properly?


To be more clear, the tables are

cities (id, name, province_id)
provinces (id, code, name, country_id)
postal_codes (id, code, city_id)
countries (id, code, name)
mpen
  • 272,448
  • 266
  • 850
  • 1,236
  • Are you saying that the two relevant parts of a city's whole name are in two different tables? I mean, "Gander" is in one table, and "NL" is in another? I don't think that could ever be workable in the US. There's "Franklin, Alabama", "Franklin, Ohio", "Franklin, Arkansas"--at least 30 Franklins here. (And, equally important, there's no "Franklin, Mississippi".) – Mike Sherrill 'Cat Recall' Feb 01 '11 at 19:37
  • @Catcall: No... the postal codes table has a FK to the city. The city tables contains the name *and* a FK to the province, which in turn points to the country. i.e., you're allowed to have duplicate city names in different states/provinces. – mpen Feb 01 '11 at 19:47

2 Answers2

3

Use COPY to import the csv into a temp table. Than use some INSERT INTO ... SELECT ... FROM ... to dump the data in the correct tables.

Frank Heikens
  • 117,544
  • 24
  • 142
  • 135
  • What do you mean by "temp table"? Is there such a concept in PostgreSQL or do you mean I should create a table and just delete it later? **Edit:** Nvm. Yes there is. – mpen Feb 01 '11 at 19:33
  • temp = temporary: `CREATE TEMP TABLE foo(bar INT);` – Frank Heikens Feb 01 '11 at 19:50
  • 2
    or if it is regular activity, create a permanent INBOX table that you truncate before starting each time. – Ken Downs Feb 01 '11 at 20:26
1

... my database is normalized

Doesn't appear to be. There are many issues, but the one that will trip you up in this question is, there does not seem to be correct PKs, no Unique Keys at all; so you will end up with duplicated data. Id "keys" do not prevent duplicate names, you need an unique index on name. It is not clear how you support two towns with the same name in the same province.

  1. You know you have to load three tables from the one imported table. Due to FKs, which are a Good Thing, you need to load Provinces first, then Cities, then PostalCodes. But from the look of your import file, it is cities (or towns or localities or suburbs) ... that resolution needs to be clearly identified first. There are 360 kilometres and dozens of localities between Gander and Aquaforte. What exactly constitutes a record in the file ?

  2. It may help to understand the structure on the excellent Canadian postal code system.

  3. Then you need to check what level of granularity you are storing in the Db. Apparently Cities or towns, but not suburbs, not localities. What about Counties or Parishes ? Eg _0A ___ means it is a rural area; since you are storing cities, not counties, not municipalities, you can ignore them.

Once you are clear about the granularity or resolution of the source data, and the level of resolution you want in the target tables, you can then load the import file, most probably is several waves per table. The SQL is easy.

PerformanceDBA
  • 32,198
  • 10
  • 64
  • 90