0

I am struggling to come up with a regular expression which extracts everything with the pattern between "abc_" and "&"

Example 1: 

Input string: abc_s=facebook&abc_m=social&abc_c=abcd-video&clicks=jfhjkfdjfdjkfh

Output string: abc_s=facebook&abc_m=social&abc_c=abcd-video

Example 2: 

input string: abc_s=facebook&abc_m=social&gmpc_c=abcd-ideo&clicks=jfhjkfdjfdjkfh

output string: abc_s=facebook&abc_m=social&

any ideas on how to achieve this...i looked at the documentation but its not very clear for me as to how to use regexp_substr function to achieve this.

user1751356
  • 565
  • 4
  • 14
  • 33

2 Answers2

0

You can achieve it using combination of instr and substr as following:

SELECT SUBSTR(INPUT_STR,
              1, INSTR(SUBSTR(INPUT_STR, (INSTR(INPUT_STR, 'abc_', -1) + 4)), '&')) AS OUTPUT_STR
FROM YOUR_TABLE

Try it and let me know if it works for you?

Cheers!!

Popeye
  • 35,427
  • 4
  • 10
  • 31
  • basically i want to extract the entire string with the pattern abc_ until the last & (exclude the last &) if the abc_& is repeated 2 times then i want to extract that part which abc_w=123&abc_x=256 and if the abc_& is repeated 4 times i want to extract all the abc_w=123&abc_x=123&abc_y=123&abc_z=123 and the string may contain any other additional values which simply needs to be discarded. You solution isn't working :( – user1751356 Oct 25 '19 at 19:21
0

It sounds like you're trying to do a substr from "the first 'abc_'" to "the first '&' that isn't followed by 'abc_'". For the beginning part, a regular INSTR will work; for the end condition, you'll need a regexp.

set scan off
-- example data
with x as (select 'abc_s=facebook&abc_m=social&abc_c=abcd-video&clicks=jfhjkfdjfdjkfh' as s from dual
           union select 'abc_s=facebook&abc_m=social&gmpc_c=abcd-ideo&clicks=jfhjkfdjfdjkfh' from dual
           union select 'www.google.com/search?abc_source=blah&abc_medium=blahblah&abc_campaign=blahblah' from dual)
-- query
SELECT SUBSTR(s,
              INSTR(s, 'abc_'), -- start with the first 'abc_'
              regexp_instr(s, '(&[^a][^b][^c][^_])|$')-1 -- end with the first & that isn't followed by "abc_"; or the end
              ) AS OUTPUT_STR
FROM x;

Output:

abc_s=facebook&abc_m=social&abc_c=abcd-video                      
abc_s=facebook&abc_m=social
abc_source=blah&abc_medium=blahblah&abc_campaign=blahblah

As a side note, if you're trying to split the '&'-separated values into a list, try looking at one of the questions about splitting comma-separated values, but use & instead of a comma.

kfinity
  • 8,581
  • 1
  • 13
  • 20
  • i want to exact the entire string with the pattern abc_ and the delimiter is &. I tried your solution and its not giving the results i want. – user1751356 Oct 25 '19 at 17:49
  • Maybe you could edit your question to show the results you want? – kfinity Oct 25 '19 at 19:09
  • basically i want to extract the entire string with the pattern abc_ until the last & (exclude the last &) if the abc_& is repeated 2 times then i want to extract that part which abc_w=123&abc_x=256 and if the abc_& is repeated 4 times i want to extract all the abc_w=123&abc_x=123&abc_y=123&abc_z=123 and the string may contain any other additional values which simply needs to be discarded. – user1751356 Oct 25 '19 at 19:15
  • Can you give me an example string where my query doesn't do what you want? – kfinity Oct 25 '19 at 19:35
  • could u try for this url – user1751356 Oct 25 '19 at 20:31
  • input string: www.google.com/search?abc_source=blah&abc_medium=blahblah&abc_campaign=blahblah expected output string: abc_source=blah&abc_medium=blahblah&abc_campaign=blahblah – user1751356 Oct 25 '19 at 20:33
  • actually the last word may or may not contain & since that is the end of the string. My bad i dint specify this clearly. – user1751356 Oct 25 '19 at 20:33
  • Oh, of course! I updated the regexp to look for either "&nonabcstring" or the end of the string. – kfinity Oct 28 '19 at 13:03