0

Please excuse my ignorance. I'm certain this is a FAQ, but I don't know the terminology well enough to know what to look for.

My company uses the following structure in terms of territory (example following):

  Customer -> Market -> Area -> District -> Region
  XYZ Co. -> Queens -> NYC -> Mid Atlantic -> Northeast

Each customer has only one market. Each market has only one district, and so forth. (I'm not sure if you'd call that one-to-many or many-to-one. I don't want to label it incorrectly).

This is how I have things set up right now:

create table region(
id           int  not null primary key,
name         varchar(24)
);

create table district(
id           int  not null primary key,
name         varchar(24),
region_id    int  references region(id) on update cascade
);

create table area(
id           int  not null primary key,
name         varchar(24),
district_id  int  references district(id) on update cascade
);

create table market(
id           int  not null primary key,
name         varchar(24),
area_id      int  references area(id) on update cascade
);
create table customer(
id         int  not null primary key,
name       varchar(32),
sixweekavg numeric, 
market_id  int  references market(id) on update cascade
);

Right now I have an opportunity to improve that setup as I'm more or less rewriting the site. I looked at this popular page: What are the options for storing hierarchical data in a relational database? And I'm sure that my best scenario lies there, but I don't know enough to figure out which one.

It's a reporting site, so there are way more reads than writes. Some of my pages show aggregated data at each level, customer through region (and top, too). So right now on a page that shows district-level data I would write something like:

select d.name, sum(sixweekavg) as avg from customer c
  inner join market m on m.id = c.market_id
  inner join area a on a.id = m.area_id
  inner join district d on d.id = a.district_id
  group by d.name order by d.name;

Pretty standard stuff, right? I'm sure a whole separate conversation could be had about materialized views, but for now I'd like to explore a better option for structuring the hierarchy (if that's even the correct term for this).

So given the following summary

  • PostgreSQL (it can be assumed this will not change)
  • Fixed hierarchy (my employer may at some point add or remove a tier, but every row in the customer table will always have the same number of "parents")
  • Significantly more reads than writes

Is there one method that may be better than the others for setting this up?

ltree

I did look at ltree, but I'm not quite sure how that would work. On pages where a user can select a district, for example, I query the district table for the names of each district. I had the idea to add an ltree column in my customers table which would hold the hierarchy, but still maintain the other tables. Is that a feasible and reasonable approach? I've searched for real-world examples of ltree but came up short - most that I found were designed for a random number of parent/child nodes, like a threaded comment section.

I appreciate your help and your patience!

0 Answers0