1

How do I recast the entire column/all rows of data with Postgres type integer[] arrays inside of them to type integer?

Below, user_email_id is type integer[]

\d emails
    Column     |           Type
---------------+------------------------
 id            | integer
 user_id       | integer
 user_email_id | integer[]

SELECT id, user_id, user_email_id FROM emails;

 id | user_id | user_email_id
----+---------+---------------
 65 |       1 | {98,110}
 66 |       1 | {99}

For the row with user_email_id={99}, this works:

ALTER table emails 
ALTER COLUMN user_email_id type integer 
USING user_email_id[1]::INTEGER;

Expected output for id=66;

\d emails
    Column     |           Type
---------------+------------------------
 id            | integer
 user_id       | integer
 user_email_id | integer

SELECT id, user_id, user_email_id FROM emails where id=66;

 id | user_id | user_email_id
----+---------+---------------
 66 |       1 | 99

But what would one do where there are two values inside the array {98,110}? In this case, I guess I'd either have to cast to a string, or have to create two rows for the record where id=65;?

tandy
  • 1,931
  • 4
  • 24
  • 28
  • 1
    Would you expect the `id=65` row to end up as `user_email_id=98`, `user_email_id=110`, or two rows? And if it was two rows, would they have the same `id`, generate one new `id`, or two new `ids? What about foreign keys that reference that `id`? – mu is too short May 24 '18 at 17:33
  • @muistooshort I think there are only two solutions: 1) create N rows based on size of the user_email_id len() per row. 2) cast as string, concatenating all current values? If two rows, they can have new ids. There are no FKs on that id -- [I don't think you can create FKs on Postgres Arrays.](https://stackoverflow.com/questions/41054507/postgresql-array-of-elements-that-each-are-a-foreign-key) – tandy May 24 '18 at 18:12
  • Sorry but (2) doesn't make sense to me, that would be going backwards to a known anti-pattern (i.e. "CSV in a column"). Yes, you can't (AFAIK) FK from an array but you can certainly have FKs that reference rows (such as `id=65` in your sample data) that contain arrays. – mu is too short May 24 '18 at 18:26
  • @muistooshort -- Yah, #2 is not good. I see what you mean about FKs. I kept FK references off this table, since I was intending for the integer[] col to be appended to and I was testing the array type – tandy May 24 '18 at 18:29

1 Answers1

2

I don't think you want to try to mash the multi-element arrays into a string or anything like that, "CSV in a column" is a well know anti-pattern that only leads to pain and suffering. You'd be better off converting the multi-element arrays to multiple rows. In the comments you say that there are no FKs referencing this table so you don't have to worry about creating new rows.

You could do it in steps. First expand the arrays to separate columns using unnest and a lateral join:

insert into emails (user_id, user_email_id)
select user_id, array[eid]
from emails,
     unnest(user_email_id) as dt(eid)
where array_length(user_email_id, 1) > 1

That will give you something like this:

 id | user_id | user_email_id
----+---------+---------------
 65 |       1 | {98,110}
 66 |       1 | {99}
 X1 |       1 | {98}
 X2 |       1 | {110}

where the X1 and X2 rows came from 65.

Then get rid of the rows that you just expanded:

delete from emails
where array_length(user_email_id, 1) > 1

to get:

 id | user_id | user_email_id
----+---------+---------------
 66 |       1 | {99}
 X1 |       1 | {98}
 X2 |       1 | {110}

And finally, an ALTER COLUMN with a USING clause to replace the arrays:

alter table emails
alter column user_email_id
type int using user_email_id[1]

That would leave you with:

 id | user_id | user_email_id
----+---------+---------------
 66 |       1 | 99
 X1 |       1 | 98
 X2 |       1 | 110
mu is too short
  • 426,620
  • 70
  • 833
  • 800