1

I have a file, Afile:

<start>
<memory>
<hdd>10</hdd>
<hdc>40</hdc>
</memory>
<storage>
<disk>
<disk1>firstname</disk1>
</disk>
<disk>
<disk1>secondname</disk1>
</disk>
<map>
<code>1</code>
</map> 
<map>
<code>2</code>
</map> 
</storage>
</start>

I have a second file, Bfile:

<disk>
<disk1>thirdname</disk1>
</disk>

How can I use sed to insert the content of Bfile into Afile? In the end I need the following file:

<start>
<memory>
<hdd>10</hdd>
<hdc>40</hdc>
</memory>
<storage>
<disk>
<disk1>firstname</disk1>
</disk>
<disk>
<disk1>secondname</disk1>
</disk>
<disk>
<disk1>thirdname</disk1>
</disk>
<map>
<code>1</code>
</map> 
<map>
<code>2</code>
</map> 
</storage>
</start>

So the content should be inserted after the last occurrence of the pattern. When I use the following command, I get this result:

sed -e '/disk>/rBfile' Afile

<start>
<memory>
<hdd>10</hdd>
<hdc>40</hdc>
</memory>
<storage>
<disk>
<disk1>firstname</disk1>
</disk>
<disk>
<disk1>thirdname</disk1>
</disk>
<disk>
<disk1>secondname</disk1>
</disk>
<disk>
<disk1>thirdname</disk1>
</disk>
<map>
<code>1</code>
</map> 
<map>
<code>2</code>
</map> 
</storage>
</start>

So it puts the content of Bfile after each occurrence of "disk>". I need it only after the last occurrence. How should I change the command?

Dumitru Gutu
  • 3
    I would use [Using sed to insert file content](http://stackoverflow.com/a/11246712/1983854) using `/<\/storage>/` as pattern. – fedorqui May 28 '15 at 11:50
  • How can I add it after the second occurrence of the pattern? In my case the pattern is `/<\/disk>/` – Dumitru Gutu May 28 '15 at 12:08
  • If you mean the file may already contain `fourthname` then you should update your question. While it may be possible in `sed`, it will be much easier in `awk`. If you can accept an `awk` solution, also add an `awk` tag. Good luck. – shellter May 28 '15 at 12:15
  • awk is fine too. So I have Afile and Bfile and need the result ` 10 40 firstname secondname thirdname ` – Dumitru Gutu May 28 '15 at 12:19
  • You are working with xml, not with text. Use XSLT for that. – hek2mgl May 28 '15 at 12:19
  • 3
    Are you sure you really need to insert "after the second"? It seems better to think about inserting BEFORE the `</storage>` tag, as @fedorqui said. – Alejandro Teixeira Muñoz May 28 '15 at 12:25

6 Answers

3

I didn't manage to do that in a single line, so I made a sed script. The problem is that the r command will not work if there are characters after the file name, so it needs to be on its own line.

#!/bin/sed -f

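/<\/disk>/{
  # After a closing </disk>, keep printing and reading lines
  # for as long as they still contain "disk".
  :a
  n
  s/disk/disk/
  t a
  # First line without "disk": queue Bfile; the N that follows
  # makes sed write it out before this line is printed.
  r Bfile
  N
}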

You can then call it like this:

sed -f sedscript Afile
Alfwed
3

XML (like structured data in general) shouldn't be handled with plain-text tools like awk and sed except in very special cases, because nobody expects XML processing to break when newlines move or when spaces are inserted or removed in benign places.

Instead, I'd use Python, which has an XML parser in its standard library:

#!/usr/bin/python

import sys
import xml.etree.ElementTree as ET

# File names are taken from the command-line arguments.
target = ET.parse(sys.argv[1])
insert = ET.parse(sys.argv[2])

# Interesting part here: append() adds the new element as the
# last child of <storage>.
target.getroot().find("./storage").append(insert.getroot())

# To write to a file, use target.write('output.xml').
ET.dump(target)

Call that as

python foobar.py Afile Bfile
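
If the new `<disk>` has to land right after the last existing `<disk>` (and before any `<map>` elements), rather than at the end of `<storage>`, a minimal sketch along the same lines could look like this. The helper name `insert_after_last` and the shortened inline XML are assumptions made for illustration; they stand in for Afile and Bfile.

```python
import xml.etree.ElementTree as ET


def insert_after_last(parent, new_child, tag):
    """Insert new_child right after the last direct child of parent named tag.

    If parent has no such child, new_child becomes the first child.
    """
    last = -1
    for index, child in enumerate(parent):
        if child.tag == tag:
            last = index
    parent.insert(last + 1, new_child)


# Shortened inline stand-ins for Afile and Bfile (hypothetical data).
a_xml = (
    "<start><storage>"
    "<disk><disk1>firstname</disk1></disk>"
    "<disk><disk1>secondname</disk1></disk>"
    "<map><code>1</code></map>"
    "</storage></start>"
)
b_xml = "<disk><disk1>thirdname</disk1></disk>"

root = ET.fromstring(a_xml)
storage = root.find("./storage")
insert_after_last(storage, ET.fromstring(b_xml), "disk")

# Child order of <storage> after the insert.
order = [child.tag for child in storage]
print(order)
```

With real files, `ET.parse(...)` and `getroot()` would replace the two `ET.fromstring(...)` calls.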
Wintermute
2

If the insertion point is the closing </storage> tag (as in the first sample given):

sed '\#</storage># {r Bfile
   N;} ' Afile

If it has to go after the last disk in storage (as in the edited version of the question):

sed '1;\#<storage>#{1h;1!H
    \#<storage># {g
       s#^\(.*\n</disk>\).*#\1#p
       r Bfile
       G;N
       s/^\(.*\)\1\(.*\)/\2/
       }
   }' Afile

Normally the r command only queues the file: its content is written out at the end of the cycle, after the current line has been printed. With an N right after it, the queued file is written out as soon as N reads the next line, while the current line stays in the pattern space (joined, in this case, with the next one).

So this only works if there is a line after </storage> (a test for the last line could be added with an if/then/else style branch to cover that case).
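
To make the ordering argument concrete, here is a toy model in Python of a sed cycle for a script of the form `/marker/{r file; N}`. It is only a sketch of the queue-then-flush behaviour described above, not sed's actual implementation; the function name `run` and the sample lines are made up for illustration.

```python
def run(lines, file_lines, marker, with_n):
    """Toy model of sed cycles for the script  /marker/{r file; N}."""
    out = []
    i = 0
    while i < len(lines):
        pattern_space = [lines[i]]          # read one input line
        i += 1
        queue = []
        if marker in pattern_space[0]:
            queue = list(file_lines)        # r: only queues the file
            if with_n and i < len(lines):
                out.extend(queue)           # reading the next line flushes the queue
                queue = []
                pattern_space.append(lines[i])  # N: join the next line
                i += 1
        out.extend(pattern_space)           # autoprint at the end of the cycle
        out.extend(queue)                   # a still-pending queue is written after it
    return out


afile = ["<storage>", "<disk>", "</disk>", "</storage>", "</start>"]
bfile = ["<new>"]

# Without N the file lands after </storage>; with N it lands before it.
print(run(afile, bfile, "</storage>", with_n=False))
print(run(afile, bfile, "</storage>", with_n=True))
```

In the model, when the marker is on the very last line there is nothing for N to read, so the file still ends up after that line, which mirrors the limitation mentioned above.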

NeronLeVelu
2

Just to add some examples using AWK.

Assuming that we have:

afile:

<start>
<memory>
<hdd>10</hdd>
<hdc>40</hdc>
</memory>
<storage>
<disk>
<disk1>firstname</disk1>
</disk>
<disk>
<disk1>secondname</disk1>
</disk>
</storage>
</start>

and bfile:

<disk>
<disk1>thirdname</disk1>
</disk>

AWK using </storage> tag as reference:

awk '/^<\/storage>/{while((getline line<"bfile")>0){print line};print;next}1' afile

That will result in:

<start>
<memory>
<hdd>10</hdd>
<hdc>40</hdc>
</memory>
<storage>
<disk>
<disk1>firstname</disk1>
</disk>
<disk>
<disk1>secondname</disk1>
</disk>
<disk>
<disk1>thirdname</disk1>
</disk>
</storage>
</start>

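But in case you REALLY need to look for </disk>, I would do something like the following. The range pattern matches two lines per disk block (the </disk1> line and the </disk> line), so n=4 means "after the second block":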

awk -v n=4 '{print;}/<\/disk1>$/,/^<\/disk>/{m++}(m==n){n=0;while((getline l<"bfile")>0){print l}}' afile

In addition, you can also use xmllint to format the output for you:

awk -v n=4 '{print;}/<\/disk1>$/,/^<\/disk>/{m++}(m==n){n=0;while((getline l<"bfile")>0){print l}}' afile | xmllint --format --recover -

That will result in:

<start>
  <memory>
    <hdd>10</hdd>
    <hdc>40</hdc>
  </memory>
  <storage>
    <disk>
      <disk1>firstname</disk1>
    </disk>
    <disk>
      <disk1>secondname</disk1>
    </disk>
    <disk>
      <disk1>thirdname</disk1>
    </disk>
  </storage>
</start>
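
For comparison, the "insert after the n-th match" idea can be sketched in Python as well. This is a line-based illustration only; the function name `insert_after_nth` and the in-memory lists standing in for afile and bfile are assumptions, and here each `</disk>` line counts once (so n=2 means the second disk block).

```python
import re


def insert_after_nth(lines, new_lines, pattern, n):
    """Return a copy of lines with new_lines added after the n-th matching line.

    If fewer than n lines match pattern, the result equals the input.
    """
    regex = re.compile(pattern)
    result = []
    seen = 0
    for line in lines:
        result.append(line)
        if regex.search(line):
            seen += 1
            if seen == n:
                result.extend(new_lines)
    return result


# In-memory stand-ins for afile and bfile (hypothetical data).
afile = [
    "<storage>",
    "<disk>", "<disk1>firstname</disk1>", "</disk>",
    "<disk>", "<disk1>secondname</disk1>", "</disk>",
    "</storage>",
]
bfile = ["<disk>", "<disk1>thirdname</disk1>", "</disk>"]

merged = insert_after_nth(afile, bfile, r"^</disk>", 2)
print("\n".join(merged))
```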
0

If ed is an option (if the input file is not too big), it would be easier:

echo '/map/-1 r Bfile
wq' | ed Afile
Serge Ballesta
0

This might work for you (GNU sed):

sed -e '/<disk>/,${/<disk>/,/<\/disk>/b;ecat fileb' -e ':a;n;ba}' filea

This restricts the sed commands to the lines from the first <disk> to the end of the file. Within this range, all complete <disk>...</disk> blocks are printed as usual. The line that follows them is where the file is to be inserted, and with the sed evaluate command (e) the file is inserted immediately (rather than with the r command, which inserts the file after the current pattern space). The rest of the file is then printed using a simple loop.
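
As a cross-check of the intended result, the question's "after the last occurrence" requirement can also be written as a short two-step sketch in Python: find the last matching line, then splice the new lines in after it. This is a different, simpler technique than the sed command above, and the function name `insert_after_last_match` and the sample lists are assumptions for illustration.

```python
def insert_after_last_match(lines, new_lines, marker):
    """Return a copy of lines with new_lines placed after the last line containing marker.

    If no line contains marker, the result equals the input.
    """
    lines = list(lines)
    last = None
    for index, line in enumerate(lines):
        if marker in line:
            last = index
    if last is None:
        return lines
    return lines[: last + 1] + list(new_lines) + lines[last + 1 :]


# In-memory stand-ins for filea and fileb (hypothetical, shortened data).
filea = [
    "<storage>",
    "<disk>", "<disk1>firstname</disk1>", "</disk>",
    "<disk>", "<disk1>secondname</disk1>", "</disk>",
    "<map>", "<code>1</code>", "</map>",
    "</storage>",
]
fileb = ["<disk>", "<disk1>thirdname</disk1>", "</disk>"]

merged = insert_after_last_match(filea, fileb, "</disk>")
print("\n".join(merged))
```

Unlike the sed version, this reads the whole input into memory first, which is usually fine for small configuration files.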

potong