Split a single sql column into five

Question

I'm trying to split one column into up to five around the " > " delimiter but the things I've tried haven'tw orked:

I tried

select
id, 
compoundColumn,
split(compoundColumn," > ")[1] as "first"
split(compoundColumn," > ")[2] as "second"
from table
where compoundColumn is not null

which didn't work, and

this which sort of did (the first part anyway, not the nth part)

select
id, 
compoundColumn,
first(split(compoundColumn," > ")) as "first"
nth(compoundColumn," > ")[n] as "second"
from table

I've found lots of examples on here but they all seem to be saying to use the brackets but the brackets throw an error:

Exception: Malformed SQL. More information: Error with SQL statement: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[1] as "first" from table where compoundColumn IS NOT NULL' at line 3.

Not too familiar with MySQL but I don't think you need quotes around your column aliases ("first" and "second"). — ekolis, Jul 24 '19 at 23:11
its cloudsql I think. (hard to tell, it is what is baked into this specific part of google app maker). Or alternately legacy sql if I want to include it as part of my Bigquery pull. — J. G., Jul 25 '19 at 15:38

score 2 · Answer 1 · answered Jul 26 '19 at 11:42

there's missing comma after "first" in your SQL
I guess CloudSQL is based on some old version of MySQL which can split only using substring_index (see query below - yes it's verbose and clumsy, case clause must sanitize short strings)
perhaps try brackets with [offset(0)] or [ordinal(1)], that's what works for us, though we use Postgres dialect, also as #standardSql, not #legacySql

SQL from second point: (fiddle)

select id,
  case when substring_index(cc,' > ',0) = cc then null else substring_index(substring_index(cc,' > ',1),' > ',-1) end as a1,
  case when substring_index(cc,' > ',1) = cc then null else substring_index(substring_index(cc,' > ',2),' > ',-1) end as a2,
  case when substring_index(cc,' > ',2) = cc then null else substring_index(substring_index(cc,' > ',3),' > ',-1) end as a3,
  case when substring_index(cc,' > ',3) = cc then null else substring_index(substring_index(cc,' > ',4),' > ',-1) end as a4,
  case when substring_index(cc,' > ',4) = cc then null else substring_index(substring_index(cc,' > ',5),' > ',-1) end as a5
from d

Alas legacy sql is not allowing me to use substring :/. – J. G. Jul 29 '19 at 15:28 — J. G., Jul 29 '19 at 15:28

score 0 · Accepted Answer · answered Jul 29 '19 at 15:55

I finally got where I needed to go using regexp extract, in the bigquery pull rather than within appmaker:

SELECT 
  CompoundColumn,

  REGEXP_EXTRACT(CompoundColumn+">",  r'^(.*?)>') first_number,
  REGEXP_EXTRACT(CompoundColumn+">",  r'^(?:(?:.*?)>){1}(.*?)>') second_number,
  REGEXP_EXTRACT(CompoundColumn+">", r'^(?:(?:.*?)>){2}(.*?)>') third_number,
  REGEXP_EXTRACT(CompoundColumn+">",  r'^(?:(?:.*?)>){3}(.*?)>') fourth_number
FROM
  myTable
WHERE
  CompoundColumn IS NOT NULL

The +">" part of the code is ugly but I couldn't get it to match strings that didn't end with the bracket (">?" breaks the whole thing) so I just made them all end with the bracket.

score 0 · Answer 3 · answered Jul 31 '19 at 16:04

The required Legacy SQL would be:

SELECT id, 
       compoundColumn,
       FIRST(SPLIT(compoundColumn, " > ")) AS "first",
       NTH(2, SPLIT(compoundColumn, " > ")) AS "second"
FROM table

See this BigQuery documentation page for more info on the SPLIT, FIRST and NTH functions.

Split a single sql column into five

3 Answers3