
I'm trying to create a Hive table with a multi-character delimiter (I can't use the multi-delimiter SerDe in Hive, because that would require updating the jar file). For example, I want to split the following string on the two-character delimiter #$:

1234#$CSI#$MAN # NO#$MANN#$1212#$N

The desired result is:

1234 | CSI | MAN # NO | MANN | 1212 | N

When I tried #$, the result was:

1234 | CSI | MAN | NO | MANN | 1212 | N

How could I split the string with a two-character delimiter #$?

Hassan
  • There's no need to use a regular expression. Just split it using the literal string `#$` as the delimiter. – Barmar Oct 04 '20 at 10:01
  • Have you read the documentation of what `[]` means in regular expressions? Every day I see a new regexp user who thinks it's for grouping. I don't understand why this happens so much. `[#$]` matches either `#` or `$`, it doesn't match them as a sequence. That's what `(#$)` is for. – Barmar Oct 04 '20 at 10:02
  • What language are you using? – Bohemian Oct 04 '20 at 10:13
  • Just escape `$`, `#\$`, see https://regex101.com/r/rz3tVG/1 – Wiktor Stribiżew Oct 04 '20 at 14:05
  • Thank you for your answers. I also tried `(#$)`, but that didn't work. What worked for me was escaping the dollar sign: `#\$`. – Hassan Oct 10 '20 at 19:16
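
To make the difference between the patterns discussed in these comments concrete, here is a minimal sketch in Python. Python is only an assumption for illustration (the question is about Hive, which uses Java regular expressions), but these particular patterns behave the same way in both flavours. The variable names are made up for the example.

```python
import re

line = "1234#$CSI#$MAN # NO#$MANN#$1212#$N"

# Character class: matches ONE character, either '#' or '$'.
# The lone '#' inside "MAN # NO" is therefore treated as a delimiter too.
by_class = re.split(r"[#$]", line)

# Unescaped '$' is the end-of-input anchor, so this only matches a '#'
# sitting at the very end of the string. Nothing is split here.
by_unescaped = re.split(r"#$", line)

# Escaped '$' is a literal dollar sign: '#' immediately followed by '$'.
by_escaped = re.split(r"#\$", line)

# Plain (non-regex) string split on the two-character literal.
by_literal = line.split("#$")

print(by_escaped)
```

Only the last two variants keep `MAN # NO` as a single field. The plain string split is available when the splitting function takes a literal delimiter; when it takes a regular expression, the `$` has to be escaped.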

2 Answers


Try this: (.*?)#\$(.*?)#\$(.*?)#\$(.*?)#\$(.*?)#\$(.*)

See: https://regex101.com/r/0KYFYi/1/

But as @Barmar said, splitting via #$ would be way easier.
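
As a rough illustration of how the capture-group pattern above behaves, here is a small Python sketch (Python is an assumption made for the example, not something the question uses). Each of the six groups picks up one field, so the pattern is tied to exactly six columns.

```python
import re

line = "1234#$CSI#$MAN # NO#$MANN#$1212#$N"

# One capture group per field; '\$' is a literal dollar sign.
pattern = re.compile(r"(.*?)#\$(.*?)#\$(.*?)#\$(.*?)#\$(.*?)#\$(.*)")

# fullmatch requires the whole line to fit the six-field layout.
match = pattern.fullmatch(line)
fields = list(match.groups()) if match else []

# A line with fewer than six fields does not match at all.
short_match = pattern.fullmatch("1234#$CSI")

print(fields)
```

The trade-off is that a row with a different number of fields produces no match, whereas a split on `#\$` would still return whatever fields are present.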

Christian Baumann

I think this will do what you want: (.+?)(?:#\$|$)

Regex Tester
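
A quick way to try this pattern outside a regex tester is the following Python sketch (again, Python is just an assumption for the demonstration). It also shows one caveat: because `.+?` needs at least one character, an empty field between two delimiters is not returned as an empty string.

```python
import re

pattern = re.compile(r"(.+?)(?:#\$|$)")

line = "1234#$CSI#$MAN # NO#$MANN#$1212#$N"

# findall returns the contents of the single capture group for each match.
fields = pattern.findall(line)

# Hypothetical input with an empty middle field, to show the caveat.
with_empty = "a#$#$b"
found = pattern.findall(with_empty)      # the empty field is not preserved
split_result = re.split(r"#\$", with_empty)  # the empty field is kept as ''

print(fields)
print(found, split_result)
```

So this works for the sample row, but if empty columns can occur, splitting on `#\$` is the safer choice.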

Patrick