I have a file like this:

foo and more
stuff
various stuff
variable number of lines
with a bar
Stuff I want to keep
More stuff I want to Keep
These line breaks are important

I want to replace everything between foo and bar so that I get:

foo testtext bar
Stuff I want to keep
More stuff I want to Keep
These line breaks are important

As recommended in another thread, I tried:

sed -e '/^foo/,/^bar/{/^foo/b;/^bar/{i testtext' -e 'b};d}' file.txt

Is there a more general-purpose solution to find and replace everything between foo and bar, absolutely no matter what it is?

Stonecraft
  • Do you understand meaning of `^bar` in your command? – SMA Aug 23 '15 at 09:31
  • Now that I've looked the `^` up, I see why that gave me the output it did, but I still have no idea how to do what I was trying to do. I suspect the `/,/` and the `/b;/` have something to do with it. Is there some trick to googling stuff with lots of punctuation, like sed commands? Searching `"/b;/"` for example finds all sorts of things that are not that precise string. – Stonecraft Aug 23 '15 at 09:43
  • Give this a try: remove all four `^`. – Cyrus Aug 23 '15 at 10:20
  • Nope, removing the `^` is not sufficient (it still leaves line 1 as `you foo and more`). – Stonecraft Aug 23 '15 at 21:43
  • Thanks to everyone who answered. Is there a simple explanation as to why it takes so much complicated syntax to do this? I really thought there was probably an option or something that I didn't know about, I would never have suspected that it would take all this just to make the equivalent of an all-inclusive wildcard (which isn't even a thing that exists?) Apologies if these are naive questions, I am still pretty new to scripting. – Stonecraft Aug 23 '15 at 21:46
  • OK, so again, when I try to use it in my actual real situation (or insert `foo` and `bar` into my test file), I still do not get the expected output. Now please don't kill me folks, but is the problem that I am trying to work on an HTML file and that regex and HTML are bad bad bad? I know this has been gone over 1000x, but it was my understanding that the bad thing was trying to use regex to search for nested search terms, tags etc. However, all I want to do is search and replace between two normal text strings in a file that happens to have tags in it. Is Cthulhu still going to come for me? – Stonecraft Aug 23 '15 at 23:49

1 Answer

You can use the following sed script:

replace.sed:

# Check for "foo"
/\bfoo\b/    {   
    # Define a label "a"
    :a  
    # If the line does not contain "bar"
    /\bbar\b/!{
        # Get the next line of input and append
        # it to the pattern buffer
        N
        # Branch back to label "a"
        ba
    }   
    # Replace everything between foo and bar
    s/\(\bfoo\)\b.*\b\(bar\b\)/\1TEST DATA\2/
}

Call it like this:

sed -f replace.sed input.file

Output:

fooTEST DATAbar
Stuff I want to keep
More stuff I want to Keep
These line breaks are important
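
To check the script against the sample from the question, the following sketch recreates both files in a temporary directory and runs sed on them. One change from replace.sed is my own assumption: the replacement is " testtext " (padded with spaces) so that the first line comes out as the `foo testtext bar` the question asks for. It relies on GNU sed, because `\b` is a GNU extension.

```shell
#!/bin/sh
# Sketch only: assumes GNU sed (for \b) and a writable temp directory.
workdir=$(mktemp -d)

# Sample input copied from the question.
cat > "$workdir/input.file" <<'EOF'
foo and more
stuff
various stuff
variable number of lines
with a bar
Stuff I want to keep
More stuff I want to Keep
These line breaks are important
EOF

# Same logic as replace.sed; only the replacement text differs.
cat > "$workdir/replace.sed" <<'EOF'
/\bfoo\b/ {
    :a
    /\bbar\b/!{
        N
        ba
    }
    s/\(\bfoo\)\b.*\b\(bar\b\)/\1 testtext \2/
}
EOF

result=$(sed -f "$workdir/replace.sed" "$workdir/input.file")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Note that "variable" on the fourth line does not end the loop early: it does not contain the string "bar" at all, and the `\b` anchors would also reject "bar" embedded inside a longer word.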

If you want to pass the beginning and ending delimiters from a shell script, you can do it like this (comments removed for brevity):

#!/bin/bash

begin="foo"
end="bar"

replacement=" Hello world "

sed -r '/\b'"$begin"'\b/{
    :a;/\b'"$end"'\b/!{
        N;ba
    }
    s/(\b'"$begin"')\b.*\b('"$end"'\b)/\1'"$replacement"'\2/
}' input.file

The above works as long as $begin and $end don't contain regex special characters. To escape them properly, use the following code:

#!/bin/bash

begin="foo"
end="bar"
replace=" Hello\1world "

# Escape variables to be used in regex
beginEsc=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<<"$begin")
endEsc=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<<"$end")
replaceEsc=$(sed 's/[&/\]/\\&/g' <<<"$replace")

sed -r '/\b'"$beginEsc"'\b/{
    :a;/\b'"$endEsc"'\b/!{
        N;ba
    }
    s/(\b'"$beginEsc"')\b.*\b('"$endEsc"'\b)/\1'"$replaceEsc"'\2/
}' input.file
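
As a rough illustration of why the escaping matters, the sketch below uses delimiters that contain a dot, which would otherwise match any character. The helper names `escape_regex` and `escape_replacement`, the delimiters `v1.0` and `v2.0`, and the sample input are all made up for this example. The sed expressions are the ones from the script above, and GNU sed is assumed.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the escape expressions shown above.
# Each character becomes its own bracket expression, e.g. "." -> "[.]".
escape_regex() { printf '%s\n' "$1" | sed 's/[^^]/[&]/g; s/\^/\\^/g'; }
# In the replacement, only &, / and \ need a backslash.
escape_replacement() { printf '%s\n' "$1" | sed 's/[&/\]/\\&/g'; }

begin="v1.0"
end="v2.0"
replace=" a&b "

beginEsc=$(escape_regex "$begin")
endEsc=$(escape_regex "$end")
replaceEsc=$(escape_replacement "$replace")

# Made-up input: the first line would match an unescaped "v1.0".
input='v1x0 decoy
v1.0 start
middle
stop v2.0
tail'

result=$(printf '%s\n' "$input" | sed -r '/\b'"$beginEsc"'\b/{
    :a;/\b'"$endEsc"'\b/!{
        N;ba
    }
    s/(\b'"$beginEsc"')\b.*\b('"$endEsc"'\b)/\1'"$replaceEsc"'\2/
}')
printf '%s\n' "$result"
```

One limitation to keep in mind: because the patterns are wrapped in `\b`, this approach presumably only works when each delimiter starts and ends with a word character. A delimiter such as `foo(` would not have a word boundary after the parenthesis.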
hek2mgl
  • I made a slightly different case where the script works with some strings but not others (none of them using any special characters). I started a new thread about it here: http://stackoverflow.com/questions/32174477/what-is-the-key-difference-between-searching-for-and-deleting-between-these-two – Stonecraft Aug 24 '15 at 06:10
  • The new thread implies that you are about to parse HTML/XML with regexes. This will not work. Use a DOM parser for that. – hek2mgl Aug 24 '15 at 07:25
  • So what I don't understand about regexs and HTML/XML is why this causes problems even when the actual text I am searching for is just regular plain old text. I am not trying to edit XML tags in any way, except to delete them when they happen to occur between two strings of word characters. I mean, does the mere fact that H/XML-looking tag structures exist somewhere in the text file necessitate the use of a DOM parser? – Stonecraft Aug 24 '15 at 07:38
  • No, as long as you don't attempt to parse it as a language you are fine. The problem in that thread is that you need `\bdesc\` instead of `\bdesc\b`. – hek2mgl Aug 24 '15 at 07:41