1

I have a table Table_1 on Google BigQuery which includes a string column str_column. I would like to write a SQL query (compatible with Google BigQuery) to extract all numerical values in str_column and append them as new numerical columns to Table_1. For example, if str_column includes first measurement is 22 and the other is 2.5; I need to extract 22 and 2.5 and save them under new columns numerical_val_1 and numerical_val_2. The number of new numerical columns should ideally be equal to the maximum number of numerical values in str_column, but if that'd be too complex, extracting the first 2 numerical values in str_column (and therefore 2 new columns) would be fine too. Any ideas?

khemedi
  • 774
  • 3
  • 9
  • 19

1 Answers1

3

Consider below approach

select * from (
  select str_column, offset + 1 as offset, num
  from your_table, unnest(regexp_extract_all(str_column, r'\b([\d.]+)\b')) num with offset
)
pivot (min(num) as numerical_val for offset in (1,2,3))    

if applied to sample data like in your question - output is

enter image description here

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
  • Thank you very much, this is great! – khemedi Feb 01 '22 at 20:35
  • So I've been using this script and I'm not sure if this solution is producing consistent results. It returns a new table with 4 columns: ```str_column```, ```numerical_val_1```, ```numerical_val_2``` and ```numerical_val_3```. However, the number of rows in this new table are not equal to the number of rows in original table, ```your_table```. Also, I'd want the new extracted numerical columns to be concatenated to the original table. I tried using ```select A.*, offset + 1 ...``` with ```A``` being ```your_table``` and it returns a new table with less rows than in ```your_table```. – khemedi Feb 02 '22 at 19:36
  • I think, you posted simplified data/schema sample in your question. so you've got the answer for it. Now, you obviously need to adopt above solution for your real data/use-case. If you have problem with it - please post new question with all relevant details, samples and I will be happy to help further :o) – Mikhail Berlyant Feb 02 '22 at 19:43
  • I created another post and included more details for a specific problem I'm trying to solve. Here is the link to my new question. I appreciate your help :) https://stackoverflow.com/questions/70976824/how-to-extract-all-numerical-values-from-a-string-column-in-a-big-query-table-an – khemedi Feb 03 '22 at 18:59
  • i will check it later in a day :o) – Mikhail Berlyant Feb 03 '22 at 19:05