Suppose I have the following markdown
list items:
- [x] Example of a completed task.
- [x] ! Example of a completed task.
- [x] ? Example of a completed task.
I am interested to parse that item using regex
and extract the following group captures:
$1
: the left[
and the right]
brackets when the symbolx
is in-between$2
: the symbolx
in between the brackets[
and]
$3
: the modifier!
that follows after[x]
$4
: the modifier?
that follows after[x]
$5
: the text that follows[x]
without a modifier, e.g.,[x] This is targeted.
$6
: the text that follows[x] !
$7
: the text that follows[x] ?
After a lot of trial-and-error using online parsers, I came up with the following:
((?<=x)\]|\[(?=x]))|((?<=\[)x(?=\]))|((?<=\[x\]\s)!(?=\s))|((?<=\[x\]\s)\?(?=\s))|((?<=\[x\]\s)[^!?].*)|((?<=\[x\]\s!\s).*)|((?<=\[x\]\s\?\s).*)
To make the regex
above more readable, these are the capture groups listed one by one:
$1
:((?<=x)\]|\[(?=x]))
$2
:((?<=\[)x(?=\]))
$3
:((?<=\[x\]\s)!(?=\s))
$4
:((?<=\[x\]\s)\?(?=\s))
$5
:((?<=\[x\]\s)[^!?].*)
$6
:((?<=\[x\]\s!\s).*)
$7
:((?<=\[x\]\s\?\s).*)
This is most likely not the best way to do it, but at least it seems to capture what I want:
I would like to extend that regex
to capture lines in a markdown
table that
looks like this:
| | Task name | Plan | Actual | File |
| :---- | :-------------------------------------- | :---------: | :---------: | :------------: |
| [x] | Task one with a reasonably long name. | 08:00-08:45 | 08:00-09:00 | [[task-one]] |
| [x] ! | Task two with a reasonably long name. | 09:00-09:30 | | [[task-two]] |
| [x] ? | Task three with a reasonably long name. | 11:00-13:00 | | [[task-three]] |
More specifically, I am interested in having the same group captures as above, but I would like to exclude the table grid (i.e., the |
). So, groups $1
to $4
should stay the same, but groups $5
to $7
should capture the text, excluding the |
, e.g., like in the selection below:
Do you have any ideas on how I can adjust, for example, the regex for group $5
to exclude the |
. I have endlessly tried all sorts of negations (e.g., [^\|]
). I am using Oniguruma regular expressions.