Use Sed to replace next line but keep the white space

Question

I'm using this answer here: https://stackoverflow.com/a/18622953/1797263 to replace a version in a pom.xml file. The problem I'm running into is that it is stripping the preceding whitespace and I want to keep the preceding whitespace. The whitespace could be 2 or 3 tabs or spaces, depending on how the developer formatted the file.

Here is an example:

        <dependency>
            <groupId>GROUP</groupId>
            <artifactId>ARTIFACT</artifactId>
            <version>OLD_VERSION</version>
        </dependency>

My command: sed -i '/<artifactId>ARTIFACT<\/artifactId>/!b;n;c<version>NEW_VERSION</version>' pom.xml

And my output:

        <dependency>
            <groupId>GROUP</groupId>
            <artifactId>ARTIFACT</artifactId>
<version>NEW_VERSION</version>
        </dependency>

Here is what I would like the replacement to look like:

        <dependency>
            <groupId>GROUP</groupId>
            <artifactId>ARTIFACT</artifactId>
            <version>NEW_VERSION</version>
        </dependency>

I read through the GNU Sed manual and could not find anything that would help.

[Don't Parse XML/HTML With Regex.](https://stackoverflow.com/a/1732454/3776858) I suggest to use an XML/HTML parser (xmlstarlet, xmllint ...). — Cyrus, Dec 24 '19 at 15:41

Gilles Quénot · Answer 1 · 2019-12-25T13:43:07.600

Using a proper XML parser:

xmlstarlet edit -L -u '/dependency/version' -v NEW_VERSION file.xml

Output

<?xml version="1.0"?>
<dependency>
  <groupId>GROUP</groupId>
  <artifactId>ARTIFACT</artifactId>
  <version>NEW_VERSION</version>
</dependency>

Don't parse XML/HTML with regex, use a proper XML/HTML parser and a powerful xpath query.
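
The XPath above matches the bare snippet from the question, where `<dependency>` is the root element. A real pom.xml normally has a `<project>` root in the Maven namespace, so here is a rough sketch of how the same update might look there. The sample file, the second dependency (`OTHER`) and the prefix `p` are made up for illustration. `-P` asks xmlstarlet to preserve whitespace nodes, which should keep the developer's indentation, and `-L` is left out so the result goes to stdout instead of overwriting the file.

```shell
# Hypothetical pom.xml with the standard Maven namespace and two dependencies.
pom=$(mktemp)
cat > "$pom" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0">
    <dependencies>
        <dependency>
            <groupId>GROUP</groupId>
            <artifactId>ARTIFACT</artifactId>
            <version>OLD_VERSION</version>
        </dependency>
        <dependency>
            <groupId>GROUP</groupId>
            <artifactId>OTHER</artifactId>
            <version>OTHER_VERSION</version>
        </dependency>
    </dependencies>
</project>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
    # -N binds the (arbitrary) prefix "p" to the POM namespace.
    # -P preserves whitespace nodes instead of re-indenting the document.
    # The predicate selects only the dependency whose artifactId is ARTIFACT.
    updated=$(xmlstarlet ed -P \
        -N p=http://maven.apache.org/POM/4.0.0 \
        -u '//p:dependency[p:artifactId="ARTIFACT"]/p:version' \
        -v NEW_VERSION "$pom")
    printf '%s\n' "$updated"
else
    echo "xmlstarlet is not installed" >&2
fi
rm -f "$pom"
```

Once the output looks right, adding `-L` should apply the same edit in place, as in the command at the top of this answer.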

Theory:

According to compiler theory, XML/HTML can't be parsed with regular expressions, because those are based on finite state machines. Due to the hierarchical structure of XML/HTML, you need a pushdown automaton, for example an LALR grammar processed by a tool like YACC.

realLife©®™ everyday tools in a shell:

You can use one of the following:

xmllint often installed by default with libxml2, xpath1 (check my wrapper to have newline-delimited output)

xmlstarlet can edit, select, transform... Not installed by default, xpath1

xpath installed via perl's module XML::XPath, xpath1

xidel xpath3

saxon-lint my own project, wrapper over @Michael Kay's Saxon-HE Java library, xpath3

Or you can use high-level languages and proper libraries, such as:

python's lxml (from lxml import etree)

perl's XML::LibXML, XML::XPath, XML::Twig::XPath, HTML::TreeBuilder::XPath

ruby nokogiri, check this example

php DOMXpath, check this example

Check: Using regular expressions with HTML tags

score 1 · Answer 2 · answered Dec 24 '19 at 23:09

This might work for you (GNU sed):

sed -i '/<artifactId>ARTIFACT<\/artifactId>/{n;s/\S.*/<version>NEW_VERSION<\/version>/}' file

On the line after the artifactId match, this overwrites the old version with the new one, starting at the first non-whitespace character. Everything before that character, i.e. the existing indentation, is left untouched.
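
A small self-contained check, in case it helps. The sample file is made up and deliberately mixes tabs and spaces in front of `<version>`. The first command is the one from this answer (without `-i`, so nothing is modified). The second is a variant of mine that swaps the GNU-only `\S` for a POSIX bracket expression, which I would expect to also work with non-GNU sed, though I have only reasoned about it for the GNU case.

```shell
# Hypothetical sample: the <version> line is indented with two tabs plus two spaces.
sample=$(mktemp)
printf '\t<dependency>\n\t\t<groupId>GROUP</groupId>\n\t\t<artifactId>ARTIFACT</artifactId>\n\t\t  <version>OLD_VERSION</version>\n\t</dependency>\n' > "$sample"

# GNU sed: on the artifactId line, print it and load the next line (n), then
# replace from the first non-whitespace character (\S) to the end of that line.
gnu_result=$(sed '/<artifactId>ARTIFACT<\/artifactId>/{n;s/\S.*/<version>NEW_VERSION<\/version>/}' "$sample")

# Same idea with [^[:space:]] instead of \S, and a ";" before the closing brace.
posix_result=$(sed '/<artifactId>ARTIFACT<\/artifactId>/{n;s/[^[:space:]].*/<version>NEW_VERSION<\/version>/;}' "$sample")

printf '%s\n' "$gnu_result"
rm -f "$sample"
```

Because the substitution only starts at the first non-whitespace character, it does not matter whether the developer used 2 tabs, 3 tabs, or spaces.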

potong

I voted this down (-1) because it teaches the OP bad practices. I edited my answer to explain why you should not parse XML with regex. – Gilles Quénot Dec 25 '19 at 13:43
@GillesQuenot no problem. I voted you up (+1) for using the best tool. – potong Dec 25 '19 at 23:11
I haven't asked a ton of questions on SO, so I'm not exactly sure what the right etiquette is here. This answer does exactly what I was asking for, using the tool I was trying to use. But there is another answer which is supposedly a 'best practice' on Linux. Which answer should I mark as correct? – Chris Savory Jan 23 '20 at 16:40
I think this answer is correct. It really answers the question, which was about the use of sed. There may be many reasons why someone doesn't (or can't) use any other tool. – Squake Jul 27 '23 at 13:01
