3

Similar to my old question: How do I merge FileA.txt and FileB.txt giving FileB.txt overwrite power using a bash script?

I want to merge two configuration value files, fileA and fileB. The result should contain every line from both files; if the same configuration key appears in both, the value from fileB should overwrite the value from fileA.

Each line always starts with 'config', followed by a key and then a value. The hard part is that the value can be a quoted string containing spaces that separate multiple values (see the 'website' value).
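
To make the format concrete, here is a minimal Python sketch of how one such line breaks down. The helper name parse_config_line is made up for illustration; the only point is that splitting at most twice leaves a quoted, space-separated value in one piece.

```python
def parse_config_line(line):
    """Split a 'config <key> <value>' line into (key, value).

    Hypothetical helper for illustration only. Splitting on whitespace
    at most twice keeps a quoted value containing spaces intact.
    """
    parts = line.strip().split(None, 2)
    if len(parts) != 3 or parts[0] != "config":
        raise ValueError("not a config line: %r" % line)
    return parts[1], parts[2]


# The quoted value survives as a single string, quotes included.
print(parse_config_line('config website "http://google.com http://yahoo.com"'))
```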

I've got some experience using awk and a fair bit of bash experience but I can't for the life of me figure out a way to do this. All help is appreciated. Thanks

fileA:

config lanIP 10.1.1.1
config wanIP 1.1.1.1
config wanIPMask 255.255.255.255
config website "http://google.com http://yahoo.com"

fileB:

config lanIP 192.168.1.1
config wanIP 1.1.1.1
config website "http://google.com http://yahoo.com"
config moreWebsite "http://google.com http://msn.com"

Expected output:

config lanIP 192.168.1.1
config wanIP 1.1.1.1
config wanIPMask 255.255.255.255
config website "http://google.com http://yahoo.com"
config moreWebsite "http://google.com http://msn.com"
bramford

3 Answers

5

This one-liner may help:

awk '{a[$2]=$0}END{for(x in a)print a[x]}' fileA fileB

Note: the one-liner above is short, but it does not preserve the order of the lines, because awk's for (x in a) loop visits keys in an unspecified order. (You didn't mention any ordering requirement.)

Test:

kent$  head a b
==> a <==
config lanIP 10.1.1.1
config wanIP 1.1.1.1
config wanIPMask 255.255.255.255
config website "http://google.com http://yahoo.com"

==> b <==
config lanIP 192.168.1.1
config wanIP 1.1.1.1
config website "http://google.com http://yahoo.com"
config moreWebsite "http://google.com http://msn.com"

kent$  awk '{a[$2]=$0}END{for(x in a)print a[x]}' a b
config wanIP 1.1.1.1
config lanIP 192.168.1.1
config moreWebsite "http://google.com http://msn.com"
config wanIPMask 255.255.255.255
config website "http://google.com http://yahoo.com"

If you want to keep the order shown in your question, try this one-liner instead:

awk '!($2 in a){i[NR]=$2}{a[$2]=$0}END{for(x=1;x<=NR;x++)if(x in i)print a[i[x]]}' a b

Test:

kent$  awk '!($2 in a){i[NR]=$2}{a[$2]=$0}END{for(x=1;x<=NR;x++)if(x in i)print a[i[x]]}' a b
config lanIP 192.168.1.1
config wanIP 1.1.1.1
config wanIPMask 255.255.255.255
config website "http://google.com http://yahoo.com"
config moreWebsite "http://google.com http://msn.com"
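
For readers who find the one-liner dense, here is a rough Python sketch of the same idea, not a replacement for the awk: one structure remembers the order in which keys first appear (the role of i[] above), and another keeps the most recent full line per key (the role of a[]). The function name merge_lines and its list-of-lines interface are made up for this sketch, and unlike the awk it skips lines with fewer than two fields.

```python
def merge_lines(*sources):
    """Merge config lines; lines from later sources override earlier ones.

    Hypothetical interface: each source is an iterable of text lines.
    """
    order = []    # keys in first-seen order (plays the role of awk's i[])
    latest = {}   # key -> most recent full line (plays the role of awk's a[])
    for lines in sources:
        for line in lines:
            fields = line.split()
            if len(fields) < 2:
                continue  # assumption: ignore blank or malformed lines
            key = fields[1]
            if key not in latest:
                order.append(key)
            latest[key] = line.rstrip("\n")
    return [latest[key] for key in order]


file_a = [
    "config lanIP 10.1.1.1",
    "config wanIP 1.1.1.1",
    "config wanIPMask 255.255.255.255",
    'config website "http://google.com http://yahoo.com"',
]
file_b = [
    "config lanIP 192.168.1.1",
    "config wanIP 1.1.1.1",
    'config website "http://google.com http://yahoo.com"',
    'config moreWebsite "http://google.com http://msn.com"',
]

for merged in merge_lines(file_a, file_b):
    print(merged)
```

Storing the whole line, rather than a parsed value, is what makes the quoted values a non-issue: only the key in the second field is ever inspected.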
Kent
  • Using the whole line as the value was a spark of genius... Love it :) I couldn't find an awk equivalent of the maxsplit parameter of Python's string.split and was thinking about regular expressions, but all this parsing wasn't, in fact, necessary. Great. – piokuc Nov 14 '12 at 10:20
  • Great solution. Exactly what I was after. – bramford Nov 14 '12 at 19:55
0

If you don't mind using Python, here is a little script that does what you want. It should be quite straightforward to translate to awk. The general idea is to process the files in order and fill a dictionary, so that values from files processed later overwrite values from files processed earlier:

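import sys

# Later files overwrite earlier ones: each key keeps the last value seen.
options = {}
for fileName in sys.argv[1:]:
    with open(fileName) as f:
        for line in f:
            # Split at most twice so a quoted value with spaces stays whole.
            parts = line.strip().split(' ', 2)
            if len(parts) == 3:
                options[parts[1]] = parts[2]

# print() is a function in Python 3. On Python 3.7+ dicts keep insertion
# order, so keys come out in the order they were first seen.
for k in options:
    print('config', k, options[k])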

You invoke the script like this:

python merge.py fileA fileB
piokuc
  • Also a great solution. I am trying to use only bash/awk for this project so I won't use this. However, I am currently learning Python so it is great to see an example in Python. – bramford Nov 14 '12 at 19:56
  • I'm glad it's useful. As an exercise in Python you can rework it so it works like Kent's awk script. BTW. Don't forget to close the question. – piokuc Nov 15 '12 at 09:40
0

Perl solution:

#!/usr/bin/perl
use warnings;
use strict;

sub get_key_value {
    my $line = shift;
    die "Invalid line $line" unless $line =~ /^config /;
    chomp $line;
    return (split / /, $line, 3)[1, 2];
}

my %result;

open my $MINOR, '<', 'fileA' or die "Cannot open fileA: $!";
while (<$MINOR>) {
    my ($key, $value) = get_key_value($_);
    $result{$key} = $value;
}

open my $MAJOR, '<', 'fileB' or die "Cannot open fileB: $!";
while (<$MAJOR>) {
    my ($key, $value) = get_key_value($_);
    delete $result{$key};
    print "config $key $value\n";
}

for my $rest (keys %result) {
    print "config $rest $result{$rest}\n";
}
choroba