1

Using PostgreSQL 9.4.

I have a data column titled 'Appliances'. 'Appliances' is of type character varying and is full of values that follow this patter: 'A|B|E|H' or 'C|D|E', etc. I also have a text file that explains the mappings that define what the capital letters mean, i.e. A=Dishwasher, B=Stove, C=Microwave...

I need to convert this data to a fit into a new character varying array column, such that the values become: '{Dishwasher, Stove, Fan, Television}'

The first thing I tried was stacking a whole bunch of replace calls on top of each other and concatenating with '{' and '}':

select 
'{' || 
replace(replace(replace(
replace(replace(replace(
replace(replace(replace(
  replace('A|B|C|D|E|F|G|J|I', '|', ','),
    'G', 'Refrigerator'),
    'D', 'Garbage Disposal'),
    'A', 'Dishwasher') ,
    'B', 'Double Oven'),
    'C', 'Dryer'),
    'E', 'Microwave'),
    'F', 'Range/Oven'),
    'I', 'Trash Compactor'),
    'J', 'Washer')
|| '}'

not only does this look gross, but it breaks when you switch the Refrigerator and the Garbage Disposal. Because of the 'G' in 'Garbage Disposal'. Another problem is I could conceivably add an appliance later that would make a circular replacement that would prevent this method form working at all.

SO, is there a better way to handle this situation?

lnhubbell
  • 3,304
  • 5
  • 17
  • 22

2 Answers2

1

You could try this.

create table appliances (
  code text primary key,
  dscr text not null);

insert into appliances (code, dscr)
values ('G', 'Refrigerator'),
    ('D', 'Garbage Disposal'),
    ('A', 'Dishwasher') ,
    ('B', 'Double Oven'),
    ('C', 'Dryer'),
    ('E', 'Microwave'),
    ('F', 'Range/Oven'),
    ('I', 'Trash Compactor'),
    ('J', 'Washer');

select array_agg(appliances.dscr)
from regexp_split_to_table('A|E|G', '\|')
join appliances on code=regexp_split_to_table;

Results in:

{Dishwasher,Microwave,Refrigerator}

You could left join if you want unknown codes to show as NULL, e.g.

select array_agg(appliances.dscr)
from regexp_split_to_table('A|E|G|?', '\|')
left join appliances on code=regexp_split_to_table;

Results in:

{Dishwasher,Microwave,Refrigerator,NULL}

Honestly though, you're probably better off in the long run if you split the 'A|E|G' strings into their individual codes and have a row for each entry.

BillRobertson42
  • 12,602
  • 4
  • 40
  • 57
1

Like @Bill already hinted, the proper way to do this would be normalized schema for the many-to-many relationship between appliances and houses (or whatever is holding your collection of appliances). You would implement it with three table like:

house  
appliance
house_appliance

Details:


While stuck with your current schema, there are multiple solutions possible - depending on your version of Postgres and the precise definition of what you have and what you need.

Building on this schema:

CREATE TABLE appliance (
  appliance_id "char" PRIMARY KEY
, appliance     text NOT NULL
);

INSERT INTO appliance VALUES
  ('G', 'Refrigerator')
, ('D', 'Garbage Disposal')
, ('A', 'Dishwasher') 
, ('B', 'Double Oven')
, ('C', 'Dryer')
, ('E', 'Microwave')
, ('F', 'Range/Oven')
, ('I', 'Trash Compactor')
, ('J', 'Washer')
;

CREATE TABLE house (
  house_id   serial PRIMARY KEY
, appliances text
);

INSERT INTO house(appliances) VALUES
  ('A|B|C|D|E|F|G|J|I')
, ('G|A|F')
, ('B|Z|A')  -- special case: invalid reference
, ('B|F|')   -- special case: empty after separator
, ('')       -- special case: empty string
, (NULL)     -- special case: NULL
;

Some possible solutions

(Out of many.)

To return an actual array - so the text representation is wrapped in '{}' automatically, and any special characters are escaped.

For Postgres 9.4+:

SELECT *
FROM   house h
LEFT   JOIN LATERAL (
   SELECT ARRAY (
      SELECT a.appliance
      FROM   unnest(string_to_array(h.appliances, '|'))
                  WITH ORDINALITY ha(appliance_id, ord)
      LEFT   JOIN appliance a USING (appliance_id)
      ORDER  BY ha.ord
      ) AS appl_arr
   ) a ON TRUE;

WITH ORDINALITY was introduced with Postgres 9.4. Details:

For Postgres 9.3:

SELECT *
FROM   (SELECT house_id, string_to_array(appliances, '|') AS arr FROM house) h
LEFT   JOIN LATERAL (
   SELECT ARRAY (
      SELECT a.appliance
      FROM   generate_subscripts(h.arr, 1) i
      LEFT   JOIN appliance a ON a.appliance_id = arr[i]
      ORDER  BY i
      ) AS appl_arr
   ) a ON TRUE;

LATERAL requires Postgres 9.3.
Both of these versions include a NULL value in the result for invalid or missing keys. Replace the inner LEFT JOIN with JOIN to ignore invalid or missing keys. The result still includes all rows due to the outer LEFT JOIN.

For Postgres 9.2 or older:

SELECT *
FROM   house h
LEFT   JOIN LATERAL (
   SELECT '{' || string_agg(appliance, ', ') || '}' AS appl_string
   FROM  (
      SELECT a.appliance
      FROM   generate_series(1, (length (h.appliances) + 1)/ 2) i
      LEFT   JOIN appliance a ON a.appliance_id = split_part(h.appliances, '|', i)
      ORDER  BY i
      ) sub
   ) a ON TRUE;

Assuming that keys are a single characters exclusively.
This returns a plain string, no escaping. You can have it either way ...

SQL Fiddle.

Closely related:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228