This works too:
(((\|start\|[^\;]*\; (?=\|transition\|[^\;]*\; \|end\|.*)))|((\|start\|[^\;]*\; \|end\|.*)))
Discussion
I think the generic form of your question is this:
- Given a string "${start}${transition}${end}"
- where "start", "transition", and "end" are variable strings with the format "tag content semicolon space",
- how does one conditionally grab parts of the string?
- The conditions being:
a) if the transition tag exists, return "${start}"
b) else return "${start}${end}"
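Spelled out as an explicit if/else, conditions a) and b) might look like the sketch below. It uses bash's built-in `[[ =~ ]]` operator, which takes POSIX extended regex and has no lookahead, so a capture group stands in for it. The function name `pick_parts` and the two pattern variable names are made up for illustration.

```bash
#!/usr/bin/env bash
# pick_parts is a hypothetical helper, not part of the original answer.
pick_parts() {
  local s=$1
  local start='\|start\|[^;]*; '            # start tag, content, "; "
  local transition='\|transition\|[^;]*; '  # transition tag, content, "; "
  local end='\|end\|.*'                     # end tag and whatever follows
  local with_transition="^(${start})${transition}${end}\$"
  local without_transition="^(${start}${end})\$"

  if [[ $s =~ $with_transition ]]; then
    # a) transition tag exists: return only the start part
    printf '%s\n' "${BASH_REMATCH[1]}"
  elif [[ $s =~ $without_transition ]]; then
    # b) no transition tag: return start followed by end
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    # neither acceptable scenario applies
    return 1
  fi
}

pick_parts "|start| example1; |transition| example2; |end| example3"
pick_parts "|start| example1; |end| example3"
```

The first call prints the start part (including its trailing "; "), the second prints the whole string.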
Logic in regex can be accomplished by explicitly stating every acceptable scenario as an alternative. Here's some bash to play around with the regex:
tst1="|start| example1; |transition| example2; |end| example3"
tst2="|start| example1; |end| example3"
tst3="|start| sky is blue today; |transition| it is raining; |end|"
tst4="|start| sky is blue today; it is raining; |end|"
tst5="|start| sky is blue today; |end|"
start='|start|[^\;]*\; ' # start marker, 0+ of any character but a semicolon, then a semicolon, then a space
start="${start//\|/\\|}" # escape |'s
transition='|transition|[^\;]*\; ' # transition marker, 0+ of any character but a semicolon, then a semicolon, then a space
transition="${transition//\|/\\|}" # escape |'s
end='|end|.*' # end marker, 0+ of any character
end="${end//\|/\\|}" # escape |'s
start_when_transition="(${start}(?=${transition}${end}))" # match start only if transition and end follow
end_when_transition="(${start}${transition}\K${end})" # match end only if start and transition precede it (shown for \K; not used below)
start_and_end="(${start}${end})" # match start and end when there is no transition in the middle
ifTransition="(${start_when_transition})"
else="(${start_and_end})"
echo "tst1: $tst1"
echo "$tst1" | grep -oP "(${ifTransition}|${else})" | xargs echo -e "\t"
echo -----------------------------------------------------------------
echo "tst2: $tst2"
echo "$tst2" | grep -oP "(${ifTransition}|${else})" | xargs echo -e "\t"
echo -----------------------------------------------------------------
echo "tst3: $tst3"
echo "$tst3" | grep -oP "(${ifTransition}|${else})" | xargs echo -e "\t"
echo -----------------------------------------------------------------
echo "tst4: $tst4"
echo "$tst4" | grep -oP "(${ifTransition}|${else})" | xargs echo -e "\t"
echo -----------------------------------------------------------------
echo "tst5: $tst5"
echo "$tst5" | grep -oP "(${ifTransition}|${else})" | xargs echo -e "\t"
output:
tst1: |start| example1; |transition| example2; |end| example3
|start| example1;
-----------------------------------------------------------------
tst2: |start| example1; |end| example3
|start| example1; |end| example3
-----------------------------------------------------------------
tst3: |start| sky is blue today; |transition| it is raining; |end|
|start| sky is blue today;
-----------------------------------------------------------------
tst4: |start| sky is blue today; it is raining; |end|
-----------------------------------------------------------------
tst5: |start| sky is blue today; |end|
|start| sky is blue today; |end|
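The five near-identical test stanzas could also be collapsed into a loop. This is only a sketch: it assumes GNU grep built with `-P` support, and the names `pattern`, `extract`, and `tests` are invented here rather than taken from the script above.

```bash
#!/usr/bin/env bash
start='\|start\|[^;]*; '
transition='\|transition\|[^;]*; '
end='\|end\|.*'

# Same two alternatives as before: start-if-transition-follows, or start+end.
pattern="(${start}(?=${transition}${end}))|(${start}${end})"

# extract is a hypothetical helper: prints the matched part, or nothing.
extract() { grep -oP "$pattern" <<<"$1"; }

tests=(
  "|start| example1; |transition| example2; |end| example3"
  "|start| example1; |end| example3"
  "|start| sky is blue today; |transition| it is raining; |end|"
  "|start| sky is blue today; it is raining; |end|"
  "|start| sky is blue today; |end|"
)

for t in "${tests[@]}"; do
  printf 'input:  %s\n' "$t"
  printf 'output: %s\n' "$(extract "$t")"
  echo -----------------------------------------------------------------
done
```

Using `printf` on a captured result avoids the `xargs echo -e` step and keeps the unmatched case (tst4) visible as an empty `output:` line.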
Bash reviewed
- echo prints its arguments
- echo -e enables backslash escapes such as "\t" for tab
- grep searches its input for text matching a pattern
- grep -oP -> -o is for --only-matching and -P is for --perl-regexp, which enables Perl-compatible regular expressions (an extended regex language)
- | aka "pipe", takes the output of the previous command and feeds it into the next
- xargs is a program that takes its input and appends it as arguments to the following command
- $variablename expands a variable we set
- "${variablename}" expands a variable we set within a string
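A few throwaway lines to try each of these pieces on its own. The variable `greeting` and the sample strings are invented for the demo, and `echo -e` is assumed to be bash's builtin.

```bash
#!/usr/bin/env bash
greeting="hello"   # hypothetical variable, for illustration only

# echo -e turns the two characters \t into a real tab
echo -e "a\tb"

# ${...} expands inside double quotes
echo "say ${greeting}!"

# the pipe feeds echo's output to grep; -o prints only the matched part
echo "one two" | grep -o "two"

# xargs appends its input (x and y) as arguments to printf
echo "x y" | xargs printf '<%s>\n'
```

The last command prints `<x>` and `<y>` on separate lines, because printf reuses its format for each extra argument.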
Regex reviewed
- \K if you made it this far, great, but forget everything you just matched
- (?=...) look ahead to see if something's there, but don't include it in the match
- () group conditions
- | or
- [] match any one of the characters listed (a character class)
- [^] match any one character except the ones listed
- \ escape a special character
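To see `\K`, a negated character class, and escaping in action, here is a small sketch run against one of the test strings. It assumes GNU grep with `-P` support; the variable name `s` is arbitrary.

```bash
#!/usr/bin/env bash
s="|start| example1; |transition| example2; |end| example3"

# \K: require the start and transition parts, but report only what follows them.
echo "$s" | grep -oP '\|start\|[^;]*; \|transition\|[^;]*; \K\|end\|.*'
# prints: |end| example3

# \| is a literal pipe; [^;]* takes everything up to the first semicolon.
echo "$s" | grep -oP '\|start\|[^;]*'
# prints: |start| example1
```

The first command is essentially the `end_when_transition` idea written out in full.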
Regex combinations reviewed
- [abc]* match a, b, or c 0+ times
- foo(?=bar) match foo if bar comes right after
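Both combinations can be tried directly from the shell. A minimal sketch, again assuming GNU grep with `-P` support; the input strings are made up.

```bash
#!/usr/bin/env bash
# [abc]*: a run of a, b, or c. grep -o only prints non-empty matches,
# so the zero-length matches that * also allows do not show up.
echo "xxabcabzz" | grep -oP '[abc]*'
# prints: abcab

# foo(?=bar): "foo" matches only where "bar" follows, and "bar" is not
# part of the match, so only the first "foo" is printed.
echo "foobar foobaz" | grep -oP 'foo(?=bar)'
# prints: foo

# No "bar" after "foo": grep finds nothing and exits non-zero.
echo "foobaz" | grep -oP 'foo(?=bar)' || echo "no match"
# prints: no match
```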
References