In regex101: https://regex101.com/r/FM88LA/1

In my browser console:

x='"AbCd123|999"';
"\"AbCd123|999\""
x.match(/[^\""|]+/)
Array [ "AbCd123" ]

Using sed in the shell:

(base) balter@winmac:~/winhome/CancerGraph/TCGA$ echo '"AbCd123|99999"' | sed -En 's/([^\"|]+)/\1/p'
"AbCd123|99999"
(base) balter@winmac:~/winhome/CancerGraph/TCGA$ echo '"AbCd123|99999"' | sed -En 's/\"([^|]+)/\1/p'
AbCd123|99999"
Jared Smith
abalter
  • What does this have to do with bash? You're using sed. – Jared Smith Dec 30 '21 at 20:28
  • sed uses POSIX extended regular expressions; it won't work with your standard PCRE. – skara9 Dec 30 '21 at 20:44
  • You don't need to escape `"` in regular expressions. – Barmar Dec 30 '21 at 20:46
  • What's the expected output? In the first command the capture group contains the entire match, so you're just replacing it with itself and the output is the same as the input. – Barmar Dec 30 '21 at 20:50
  • In the second command, the capture group doesn't include the `"`, so it removes the `"`. – Barmar Dec 30 '21 at 20:51
  • To answer the questions in order: 1) I just wasn't sure if bash had its own version of sed, so I wanted to be clear. 2) In JavaScript it seemed like I needed to, so I figured it wouldn't hurt. 3) The expected output is what I got in JavaScript and what is shown in the regex101 example. – abalter Dec 30 '21 at 22:09

1 Answer

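That is the expected behavior: the s command replaces only the matched part of the line and leaves the rest untouched. With the -n option and the p flag, sed then prints the whole line after a successful substitution, i.e. the text that was not matched plus the result of the replacement.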
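To make this visible, here is a minimal sketch that wraps the match in angle brackets, so you can see which part of the line sed actually matched and which part it passed through unchanged. The variable names and the <...> marker are only illustrative, and it assumes a sed that supports -E (GNU and BSD sed both do).

```shell
input='"AbCd123|99999"'

# The pattern matches only AbCd123. The leading quote and the |99999" tail
# are not part of the match, so sed leaves them in the printed line.
first=$(printf '%s\n' "$input" | sed -En 's/([^"|]+)/\1/p')
echo "$first"     # "AbCd123|99999"

# Wrapping the captured text in <...> shows exactly what was matched.
marked=$(printf '%s\n' "$input" | sed -En 's/([^"|]+)/<\1>/p')
echo "$marked"    # "<AbCd123>|99999"
```

So the capture group does contain AbCd123, but replacing a match with itself cannot remove the text around it.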

That means, you can get your "match" with

echo '"AbCd123|99999"' | sed -En 's/["|]*([^"|]+).*/\1/p'

See the online demo.

Here, ["|]* first consumes any leading " and | characters, so the pattern gets to the first char that is neither " nor |. Then the ([^"|]+) part captures one or more chars other than " and |, and .* matches the rest of the line.

Everything that was matched but not captured is removed, because the replacement consists only of \1, the Group 1 value (captured with ([^"|]+)).
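As a quick check of how this behaves on other inputs, here is a small sketch that wraps the command in a shell function. The name first_field and the extra sample inputs are hypothetical, added only for illustration; it again assumes a sed that supports -E.

```shell
# first_field is a hypothetical helper name, not part of sed.
# It prints the first run of characters that are neither " nor |,
# and prints nothing when the line contains no such character.
first_field() {
  sed -En 's/["|]*([^"|]+).*/\1/p'
}

r1=$(printf '%s\n' '"AbCd123|99999"' | first_field)
r2=$(printf '%s\n' '|"xyz|1"' | first_field)
r3=$(printf '%s\n' '"|"' | first_field)   # no match, so -n suppresses the line

echo "$r1"        # AbCd123
echo "$r2"        # xyz
echo "[$r3]"      # []
```

Because the pattern now consumes the whole line whenever it matches, nothing outside the capture group is left over to be printed.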

Wiktor Stribiżew
  • Thank you! I didn't think I needed to match all the stuff outside the capture group. But that definitely did the trick. – abalter Dec 30 '21 at 22:12