18
<root>
<tag>1</tag>
<tag1>2</tag1>
</root>

I need to change the values 1 and 2 from bash.

Roman
  • Change them to what? If that is your input, what is your desired output? – Anders Lindahl Jul 29 '11 at 12:32
  • Change them to values taken from global variables – Roman Jul 29 '11 at 12:52
  • solution by sed: sed 's#\([^<][^<]*\)#SOMETHING#' test.xml -i – Roman Jul 29 '11 at 13:37
  • @StefanoBorini, which has what to do with anything? Plenty of non-regex XML-manipulation tools accessible from bash. – Charles Duffy Feb 05 '15 at 23:32
  • @Roman, that's only a "solution" if you don't care about correctness. For instance, `<tag>` inside a `CDATA` section isn't a tag at all, but is text; `<tag>` inside `<!--` and `-->` is a comment. `<tag>` under a subtree with `xmlns=http://example.com/foo` is `{http://example.com/foo}tag`, not `tag`. No sed expression is going to know the intricacies of XML syntax. – Charles Duffy Feb 22 '15 at 16:05
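
To make the CDATA and comment cases above concrete, here is a minimal Python sketch comparing a regex substitution with a real parser (the standard library's `xml.etree.ElementTree`). The sample document and the replacement text `NEW` are made up for illustration; this is not anything the commenters posted.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: one real <tag> element, plus look-alikes
# inside a comment and inside a CDATA section.
doc = (
    "<root>"
    "<!-- <tag>commented out</tag> -->"
    "<note><![CDATA[<tag>just text</tag>]]></note>"
    "<tag>1</tag>"
    "</root>"
)

# Regex approach: rewrites every textual match, including the
# comment and the CDATA text, which are not elements at all.
regex_result = re.sub(r"<tag>[^<]*</tag>", "<tag>NEW</tag>", doc)

# Parser approach: only the real <tag> element is found and changed.
root = ET.fromstring(doc)
real_tags = list(root.iter("tag"))
for element in real_tags:
    element.text = "NEW"

print(regex_result.count("NEW"))   # number of regex replacements
print(len(real_tags))              # number of real <tag> elements
print(root.find("note").text)      # CDATA content survives as plain text
```

Note that ElementTree drops comments by default when parsing, so a tool that must preserve them byte for byte would need a different parser configuration.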

4 Answers

27

To change tag's value to 2 and tag1's value to 3, using XMLStarlet:

xmlstarlet ed \
  -u '/root/tag' -v 2 \
  -u '/root/tag1' -v 3 \
  <old.xml >new.xml

Using your sample input:

xmlstarlet ed \
  -u '/root/tag' -v 2 \
  -u '/root/tag1' -v 3 \
  <<<'<root><tag>1</tag><tag1>2</tag1></root>'

...emits as output:

<?xml version="1.0"?>
<root>
  <tag>2</tag>
  <tag1>3</tag1>
</root>
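
Where XMLStarlet is not available, roughly the same update can be sketched with Python's standard `xml.etree.ElementTree`. This is only an illustration, not part of the XMLStarlet approach: the helper name `set_text` and the environment variables `NEW_TAG` and `NEW_TAG1` are hypothetical, standing in for the "global variables" mentioned in the question's comments.

```python
import os
import xml.etree.ElementTree as ET


def set_text(xml_text, updates):
    """Return xml_text with the text of selected elements replaced.

    `updates` maps a path relative to the root element (ElementTree's
    limited XPath subset) to the new text for every matching element.
    """
    root = ET.fromstring(xml_text)
    for path, value in updates.items():
        for element in root.findall(path):
            element.text = value
    return ET.tostring(root, encoding="unicode")


sample = "<root><tag>1</tag><tag1>2</tag1></root>"

# NEW_TAG / NEW_TAG1 are assumed to be exported by the calling shell;
# the defaults mirror the values used in the XMLStarlet example.
result = set_text(sample, {
    "tag": os.environ.get("NEW_TAG", "2"),
    "tag1": os.environ.get("NEW_TAG1", "3"),
})
print(result)
```

Unlike the XMLStarlet output, this sketch emits no XML declaration and does not re-indent the document.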
Charles Duffy
  • I just checked and this tool can easily be installed via `apt-get install xmlstarlet` on Debian (even since Debian Jessie) and Ubuntu. – FibreFoX May 05 '21 at 09:44
12

You can use the xsltproc command (from package xsltproc on Debian-based distros) with the following XSLT sheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:param name="tagReplacement"/>
  <xsl:param name="tag1Replacement"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="tag">
    <xsl:copy>
      <xsl:value-of select="$tagReplacement"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="tag1">
    <xsl:copy>
      <xsl:value-of select="$tag1Replacement"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Then use the command:

xsltproc --stringparam tagReplacement polop \
         --stringparam tag1Replacement palap \
         transform.xsl input.xml
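
The stylesheet combines an identity template (copy everything) with more specific templates that override `tag` and `tag1`. As a rough illustration of that pattern only, here is the same logic sketched in Python with `xml.etree.ElementTree`; it is not a substitute for running the XSLT. The function name `copy_with_overrides` and the extra `other` element in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET


def copy_with_overrides(element, replacements):
    """Recursively copy `element`, replacing the content of named elements."""
    if element.tag in replacements:
        # Like the tag/tag1 templates: keep the element, swap its content.
        clone = ET.Element(element.tag)
        clone.text = replacements[element.tag]
    else:
        # Like the identity template: copy attributes, text and children.
        clone = ET.Element(element.tag, dict(element.attrib))
        clone.text = element.text
        for child in element:
            clone.append(copy_with_overrides(child, replacements))
    # Text that follows the element in its parent is preserved either way.
    clone.tail = element.tail
    return clone


source = ET.fromstring(
    '<root id="r"><tag>1</tag><tag1>2</tag1><other>keep</other></root>'
)
result = copy_with_overrides(source, {"tag": "polop", "tag1": "palap"})
output = ET.tostring(result, encoding="unicode")
print(output)
```

As in the stylesheet, an overridden element loses its attributes and children, because only the replacement text is written into the copy.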

Or you could also use regexes, but modifying XML through regexes is pure evil :)

Mithfindel
10

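My $0.02 in Python, because it's on every server you will ever log in to (written here for Python 3):

import sys
import xml.etree.ElementTree as ET

# Read the whole document from stdin and parse it.
root = ET.fromstring(sys.stdin.read())

# Locate the first <tag> and <tag1> elements anywhere below the root.
node_a = root.find('.//tag')
node_b = root.find('.//tag1')

# Swap their text content.
node_a.text, node_b.text = node_b.text, node_a.text

# encoding='unicode' makes tostring() return str instead of bytes.
print(ET.tostring(root, encoding='unicode'))

This reads from stdin, so you can use it like this:

$ echo '<node><tag1>hi!</tag1><tag>this</tag></node>' | python3 xml_process.py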
<node><tag1>this</tag1><tag>hi!</tag></node>

EDIT - challenge accepted

Here's a working xmllib implementation (should work back to Python 1.6). As I thought, it would be more fun to stab my eyes with a fork. The only thing I will say about this is that it works for the given use case.

import sys, xmllib

class Bag:
    pass

class NodeSwapper(xmllib.XMLParser):
    def __init__(self):
        print 'making a NodeSwapper'
        xmllib.XMLParser.__init__(self)
        self.result = ''
        self.data_tags = {}
        self.current_tag = ''
        self.finished = False

    def handle_data(self, data):
        print 'data: ' + data

        # Remember the text seen under the most recent start tag.
        self.data_tags[self.current_tag] = data
        if self.finished:
            return

        if 'tag1' in self.data_tags.keys() and 'tag' in self.data_tags.keys():
            b = Bag()
            b.tag1 = self.data_tags['tag1']
            b.tag = self.data_tags['tag']
            # Locate each value in the raw input by plain string search.
            b.t1_start_idx = self.rawdata.find(b.tag1)
            b.t1_end_idx = len(b.tag1) + b.t1_start_idx
            b.t_start_idx = self.rawdata.find(b.tag)
            b.t_end_idx = len(b.tag) + b.t_start_idx
            # Swap: rewrite the later span first so the earlier indices stay valid.
            if b.t1_start_idx < b.t_start_idx:
                self.result = self.rawdata[:b.t_start_idx] + b.tag1 + self.rawdata[b.t_end_idx:]
                self.result = self.result[:b.t1_start_idx] + b.tag + self.result[b.t1_end_idx:]
            else:
                self.result = self.rawdata[:b.t1_start_idx] + b.tag + self.rawdata[b.t1_end_idx:]
                self.result = self.result[:b.t_start_idx] + b.tag1 + self.result[b.t_end_idx:]
            self.finished = True

    def unknown_starttag(self, tag, attrs):
        print 'starttag is: ' + tag
        self.current_tag = tag

data = ""
for line in sys.stdin:
    data += line

print 'data is: ' + data

parser = NodeSwapper()
parser.feed(data)
print parser.result
parser.close()
stringy05
  • Python is everywhere, yes. Python new enough to have ElementTree in the standard library... that's a little iffier. – Charles Duffy Feb 06 '15 at 02:17
  • too true. but I would rather stab myself in the eye with a fork than try and do this with xmllib reliably. Actually that's a pretty good reason to use ruby (unless you happen to use solaris or HP-UX machines, in which case we have ended up at perl) – stringy05 Feb 06 '15 at 05:09
  • Well. I can't _not_ give an upvote, after that level of effort. :) – Charles Duffy Feb 06 '15 at 14:59
7

Since you give a sed example in one of the comments, I imagine you want a pure bash solution?

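# Read stdin line by line; IFS= and -r keep whitespace and backslashes intact.
while IFS= read -r input; do
  for field in tag tag1; do
    case $input in
      *"<$field>"*"</$field>"* )
        pre=${input#*"<$field>"}   # everything after the start tag
        suf=${input%"</$field>"*}  # everything before the end tag
        # Where are we supposed to be getting the replacement text from?
        # Quote $pre and $suf so they match literally, not as glob patterns.
        input="${input%"$pre"}SOMETHING${input#"$suf"}"
        ;;
    esac
  done
  printf '%s\n' "$input"
done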

This is completely unintelligent, and obviously only works on well-formed input where the start tag and the end tag are on the same line. You also can't have multiple instances of the same tag on one line, the list of tags to substitute is hard-coded, and so on.

I cannot imagine a situation where this would be actually useful, and preferable to either a script or a proper XML approach.

tripleee