Prepend a String to a Regex Match Using Bash

Question

I have a file in which some lines contain a string of the form [Name](./path_to_name_directory) or [Name](./path_to_name_directory/notes.md), each followed by some unimportant list items. For the lines whose path inside the parentheses does not end in notes.md, I want Name prepended to that line.

To try and solve this, I originally had the following command:

sed 's/\[(.*)\]\(\.\/.*\/(?!notes\.md)/\1&/g' ./file.md

but I eventually found out that sed does not support lookaheads or lookbehinds, so I moved to using perl to try and accomplish the same. I thought it would be as simple as doing

perl -pe 's/\[(.*)\]\(\.\/.*\/(?!notes\.md)/\1&/g'

but it did not work, and I'm not entirely sure where to go from here.

EDIT 1:

Sample Input File:

- [Name 1](./path_to_name_1)
  - Unimportant list item.
- [Name 2](./path_to_name_2/notes.md)
  - Unimportant list item.

Sample Output File:

- Name 1 [Name 1](./path_to_name_1)
  - Unimportant list item.
- [Name 2](./path_to_name_2/notes.md)
  - Unimportant list item.

https://stackoverflow.com/questions/9053100/sed-regex-and-substring-negation check out this answer as a workaround: you can use a sed rule to match lines that end with "notes.md" and do nothing, plus a rule that matches what remains and does something. — Tordek, Apr 04 '21 at 06:00
@Shawn, apologies. I added sample input and sample output in the edit. – Kalcifer, Apr 04 '21 at 06:02
@Kalcifer, I would suggest posting the code you came up with as an answer; askers are allowed to answer their own questions. That keeps the question clear (you could edit it back to how it was originally asked to avoid confusion, since the answer section exists for that). Thank you. – RavinderSingh13, Apr 04 '21 at 06:52

score 3 · Answer 1 · answered Apr 04 '21 at 06:22

With your shown samples, please try the following.

awk '
!/notes\.md\)$/ && match($0,/\[Name [0-9]+/){
  $1=$1 OFS substr($0,RSTART+1,RLENGTH-1)
}
1
' Input_file

Explanation: a commented version of the above, for explanation purposes only.

awk '
## Start of the awk program.
!/notes\.md\)$/ && match($0,/\[Name [0-9]+/){
## If the current line does not end with notes.md), look for [Name followed by digits in it.
  $1=$1 OFS substr($0,RSTART+1,RLENGTH-1)
## Rebuild the 1st field as the current $1, OFS, and the matched text without its leading [.
}
1
## Print the current line, edited or not.
' Input_file ## Name of the input file.
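
To make the role of match, RSTART and RLENGTH concrete, here is a minimal sketch that runs only the matching step on one sample line from the question. The variable names line and match_info are hypothetical and used purely for illustration.

```shell
# One sample line from the question (variable name is hypothetical).
line='- [Name 1](./path_to_name_1)'

# Print where the match starts, how long it is, and the text that
# would be prepended (the match without its leading "[").
match_info=$(printf '%s\n' "$line" | awk '
match($0, /\[Name [0-9]+/) {
  print RSTART, RLENGTH, substr($0, RSTART + 1, RLENGTH - 1)
}')

printf '%s\n' "$match_info"
# prints: 3 7 Name 1
```

The match covers "[Name 1", which starts at character 3 and is 7 characters long, so substr($0, RSTART+1, RLENGTH-1) skips the bracket and returns "Name 1".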

RavinderSingh13

Came with something similar but you were faster :-) Didn't know about the `1` standing alone to print the line. – Ludovic Kuty Apr 04 '21 at 06:29
@LudovicKuty, thank you :) yeah 1 prints the line in awk, it's easiest method to print current line, cheers – RavinderSingh13 Apr 04 '21 at 06:31
It all became clear when I read https://stackoverflow.com/a/24643330/452614. Should have thought about it. – Ludovic Kuty Apr 04 '21 at 07:00

Kalcifer · Accepted Answer · 2021-04-04T07:14:02.560

One option I came up with, building on @RavinderSingh13's answer and this related answer, is the following:

sed -E '/.*notes\.md.*/!s/\[(.*)\]/\1 &/g' ./file.md

/.*notes\.md.*/! is an address followed by !, which negates it: sed applies the command that follows only to lines that do not match the regex (see code block #5 of section 4.1 of the GNU Sed Manual).

s/\[(.*)\]/\1 &/g captures the text inside the square brackets and prepends it to the whole match. The placement is done by \1 &, where \1 is the capture group and & references the entire matched portion of the string; the space between them reproduces the spacing in the sample output.
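
As a quick check of the negated address, the sketch below feeds the sample input from the question through the command. The variable names input and result are hypothetical, and the replacement is written as `\1 &` (with a space), which is an assumption made so that the result matches the sample output.

```shell
# Sample input from the question, held in a shell variable (name is hypothetical).
input='- [Name 1](./path_to_name_1)
  - Unimportant list item.
- [Name 2](./path_to_name_2/notes.md)
  - Unimportant list item.'

# The address followed by ! limits the substitution to lines that do NOT
# contain notes.md. The space in "\1 &" is an assumption made to reproduce
# the spacing of the sample output.
result=$(printf '%s\n' "$input" | sed -E '/.*notes\.md.*/!s/\[(.*)\]/\1 &/g')

printf '%s\n' "$result"
```

Only the first line is changed: the list-item lines contain no square brackets, and the Name 2 line is skipped by the address.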

Carlos Pascual · Answer 3 · 2021-04-04T10:39:37.413

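With awk, you can set FS to [][]|/|) so that each line is split on [, ], / and ). The link text then lands in $2 and the path component after the directory in $5 (notes.md when present, empty otherwise), and both can be tested in the condition.

awk -v FS='[][]|/|)' '$2 ~ /^Name [[:digit:]]/ && $5 !~ /notes\.md/ {sub(/^. /, "&"$2" " , $0)} 1' file

Output: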
- Name 1 [Name 1](./path_to_name_1)
  - Unimportant list item.
- [Name 2](./path_to_name_2/notes.md)
  - Unimportant list item.
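
To see how that field separator splits the two kinds of lines, here is a small sketch that prints $2 and $5 for each. The separator is written here as the single bracket expression [][/)], which matches the same four characters; that rewrite is an assumption made for portability, since some awk implementations may reject a bare unmatched ). The variable name fields is hypothetical.

```shell
# Print the 2nd and 5th fields of the two link lines from the sample input.
# FS is a single bracket expression matching ], [, / and ) (a portability
# assumption, equivalent in effect to the alternation used in the answer).
fields=$(printf '%s\n' \
  '- [Name 1](./path_to_name_1)' \
  '- [Name 2](./path_to_name_2/notes.md)' |
  awk -v FS='[][/)]' '{ printf "second=<%s> fifth=<%s>\n", $2, $5 }')

printf '%s\n' "$fields"
# prints:
# second=<Name 1> fifth=<>
# second=<Name 2> fifth=<notes.md>
```

The empty fifth field on the first line is what lets the $5 !~ /notes.md/ test select it.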

edited Apr 04 '21 at 10:39

answered Apr 04 '21 at 10:24

