-1

Given two comma separated (un-ordered) lists of numbers, I want to extract only the differences between them (using regexp probably). e.g.:

select '1010484,1025781,1051394,1069679' as list_1, '1005923,1010484,1025781,1034010,1044261,1048311,1051394' as list_2

What I wish for is a result such as:

l1_additional_data: 1069679

l2_additional_data: 1005923,1034010,1044261,1048311

How can this be done?

I'm using Vertica, BTW - That means that no hierarchic ("connect by") queries could be used here.

Thanks in advance!

goidelg
  • 316
  • 2
  • 16
  • Have you tried to do something or you are just asking someone to solve your problem? – NikNik Mar 15 '17 at 10:46
  • Why such a harsh attitude? Yes, I did try - but the lack of Connect By in Vertica, made this a much more difficult task – goidelg Mar 15 '17 at 11:39
  • Maybe I exaggerated but post what you tried to do with your output. Try to search for the answer on google/stack and after this post your question. – NikNik Mar 15 '17 at 11:49
  • I tried to use Vertica's SPLIT_PART, but once again - I need the position parameter to be a variable, not a constant – goidelg Mar 15 '17 at 12:32

2 Answers2

0

There's a relevant post that will be helpful - Splitting string into multiple rows in Oracle

I don't know vertica but based on oracle You could go with:

with list1 as
(
select 
regexp_substr(list_1 ,'[^,]+', 1, level) as list_1_rows
from  (
select
'1010484,1025781,1051394,1069679' as list_1
from dual)
connect by 
  regexp_substr(list_1 ,'[^,]+', 1, level) is not null),

list2 as (select 
regexp_substr(list_2 ,'[^,]+', 1, level) as list_2_rows
from  (
select
'1005923,1010484,1025781,1034010,1044261,1048311,1051394' as list_2
from dual)
connect by regexp_substr(list_2 ,'[^,]+', 1, level) is not null) 

select * from list1
full outer join list2 
on list1.list_1_rows = list2.list_2_rows
where list_1_rows is null or list_2_rows is null
Community
  • 1
  • 1
SmartDumb
  • 348
  • 3
  • 15
0

OK, Here's my solution - But it's not very efficient, and it probably won't scale (in terms of performance):

WITH lists AS (SELECT'1010484,1025781,1051394,1069679' AS list_1, '1005923,1010484,1025781,1034010,1044261,1048311,1051394' AS list_2 ) , numbers AS (SELECT row_number() over() i FROM system_columns limit 100) SELECT group_concat(parsed_code_1) list_1_additions, group_concat(parsed_code_2) list_2_additions FROM(SELECT parsed_code_1 FROM(SELECT split_part(list_1, ',', i) parsed_code_1 FROM lists CROSS JOIN numbers WHERE i <= regexp_count(list_1, ',')+1) l WHERE parsed_code_1 IS NOT NULL) a FULL OUTER JOIN (SELECT parsed_code_2 FROM(SELECT split_part(list_2, ',', i) parsed_code_2 FROM lists CROSS JOIN numbers WHERE i <= regexp_count(list_2, ',')+1) l WHERE parsed_code_2 IS NOT NULL) b ON(parsed_code_1 = parsed_code_2) WHERE parsed_code_1 IS NULL OR parsed_code_2 IS NULL

goidelg
  • 316
  • 2
  • 16