3

I'm trying to split one column into up to five around the " > " delimiter but the things I've tried haven'tw orked:

I tried

select
id, 
compoundColumn,
split(compoundColumn," > ")[1] as "first"
split(compoundColumn," > ")[2] as "second"
from table
where compoundColumn is not null

which didn't work, and

this which sort of did (the first part anyway, not the nth part)

select
id, 
compoundColumn,
first(split(compoundColumn," > ")) as "first"
nth(compoundColumn," > ")[n] as "second"
from table

I've found lots of examples on here but they all seem to be saying to use the brackets but the brackets throw an error:

Exception: Malformed SQL. More information: Error with SQL statement: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[1] as "first" from table where compoundColumn IS NOT NULL' at line 3.

J. G.
  • 1,922
  • 1
  • 11
  • 21
  • Not too familiar with MySQL but I don't think you need quotes around your column aliases ("first" and "second"). – ekolis Jul 24 '19 at 23:11
  • What version of MySQL are you using? – Rob Jul 25 '19 at 00:14
  • its cloudsql I think. (hard to tell, it is what is baked into this specific part of google app maker). Or alternately legacy sql if I want to include it as part of my Bigquery pull. – J. G. Jul 25 '19 at 15:38
  • Send some sample data and also the data you expect, please. – Saeid Amini Jul 25 '19 at 22:12

3 Answers3

2
  • there's missing comma after "first" in your SQL
  • I guess CloudSQL is based on some old version of MySQL which can split only using substring_index (see query below - yes it's verbose and clumsy, case clause must sanitize short strings)
  • perhaps try brackets with [offset(0)] or [ordinal(1)], that's what works for us, though we use Postgres dialect, also as #standardSql, not #legacySql

SQL from second point: (fiddle)

select id,
  case when substring_index(cc,' > ',0) = cc then null else substring_index(substring_index(cc,' > ',1),' > ',-1) end as a1,
  case when substring_index(cc,' > ',1) = cc then null else substring_index(substring_index(cc,' > ',2),' > ',-1) end as a2,
  case when substring_index(cc,' > ',2) = cc then null else substring_index(substring_index(cc,' > ',3),' > ',-1) end as a3,
  case when substring_index(cc,' > ',3) = cc then null else substring_index(substring_index(cc,' > ',4),' > ',-1) end as a4,
  case when substring_index(cc,' > ',4) = cc then null else substring_index(substring_index(cc,' > ',5),' > ',-1) end as a5
from d
Tomáš Záluský
  • 10,735
  • 2
  • 36
  • 64
0

I finally got where I needed to go using regexp extract, in the bigquery pull rather than within appmaker:

SELECT 
  CompoundColumn,

  REGEXP_EXTRACT(CompoundColumn+">",  r'^(.*?)>') first_number,
  REGEXP_EXTRACT(CompoundColumn+">",  r'^(?:(?:.*?)>){1}(.*?)>') second_number,
  REGEXP_EXTRACT(CompoundColumn+">", r'^(?:(?:.*?)>){2}(.*?)>') third_number,
  REGEXP_EXTRACT(CompoundColumn+">",  r'^(?:(?:.*?)>){3}(.*?)>') fourth_number
FROM
  myTable
WHERE
  CompoundColumn IS NOT NULL

The +">" part of the code is ugly but I couldn't get it to match strings that didn't end with the bracket (">?" breaks the whole thing) so I just made them all end with the bracket.

J. G.
  • 1,922
  • 1
  • 11
  • 21
0

The required Legacy SQL would be:

SELECT id, 
       compoundColumn,
       FIRST(SPLIT(compoundColumn, " > ")) AS "first",
       NTH(2, SPLIT(compoundColumn, " > ")) AS "second"
FROM table

See this BigQuery documentation page for more info on the SPLIT, FIRST and NTH functions.

Steve Chambers
  • 37,270
  • 24
  • 156
  • 208