Problem: I have two folders: a Delta folder, where the files get updated, and an Original folder, where the original files live. Every time a file is updated in the Delta folder, I need to merge it into the file of the same name in the Original folder.

Note: The file names in the Delta folder and the Original folder match, but the contents of the files may differ. For example:

$ cat Delta_Folder/1.properties
account.org.com.email=New-Email
account.value.range=True

$ cat Original_Folder/1.properties
account.org.com.email=Old-Email
account.value.range=False
range.list.type=String
currency.country=Sweden

Now I need to merge Delta_Folder/1.properties into Original_Folder/1.properties, so that my updated Original_Folder/1.properties will be:

account.org.com.email=New-Email
account.value.range=True
range.list.type=String
currency.country=Sweden

The solution I opted for is:

Find all *.properties files in the Delta folder and save the list to a temp file (delta-files.txt).

Find all *.properties files in the Original folder and save the list to a temp file (original-files.txt).

Then get the list of files that are common to both folders and loop over them.

For each file (e.g. 1.properties), read it line by line.

Read each line (delta-line="account.org.com.email=New-Email") from the property file in the Delta folder and split it on the delimiter "=" into two string variables:

(delta-line-string1=account.org.com.email; delta-line-string2=New-Email;)

Read each line (orig-line="account.org.com.email=Old-Email") from the property file in the Original folder and split it on the delimiter "=" into two string variables:

(orig-line-string1=account.org.com.email; orig-line-string2=Old-Email;)

If delta-line-string1 == orig-line-string1, then update $orig-line with $delta-line, i.e.:

if account.org.com.email == account.org.com.email, then replace

account.org.com.email=Old-Email in Original_Folder/1.properties with

account.org.com.email=New-Email

Once the loop has processed all the lines in a file, it goes on to the next file. The loop continues until it has processed all the common files in the folder.

For looping I used for loops, for splitting lines I used awk, and for replacing content I used sed.

Overall it works fine, but it takes a long time (4 minutes) to finish each file, because it goes through three loops for every line: splitting the line, finding the variable in the other file, and replacing the line.

I am wondering if there is any way to reduce the number of loops so that the script executes faster.

William Pursell
kumar

3 Answers

1

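With paste and awk (this pairs the files line by line, so it only works when the delta keys are the first lines of the original file, in the same order, and the values contain no whitespace):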

File 2:

$ cat /tmp/l2
account.org.com.email=Old-Email
account.value.range=False
currency.country=Sweden
range.list.type=String

File 1:

$ cat /tmp/l1
account.org.com.email=New-Email
account.value.range=True

The command and its output:

paste /tmp/l2 /tmp/l1 | awk '{print $NF}'
account.org.com.email=New-Email
account.value.range=True
currency.country=Sweden
range.list.type=String

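Or with a single awk command, if the output order is not important (the value is taken as everything after the first "=", so values that themselves contain "=" are kept intact):

awk -F'=' '/=/{k=$1; sub(/^[^=]*=/,""); arr[k]=$0} END{for (x in arr) print x"="arr[x]}' /tmp/l2 /tmp/l1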
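
If the line order of the original file should be kept, a single awk pass per file can still do the merge. The following is only a minimal sketch under some assumptions: the function names merge_properties and merge_folders are made up for illustration, both folders are assumed to be flat (no subdirectories), and keys that exist only in the delta file are appended at the end, which the question does not specify.

```shell
#!/bin/sh
# merge_properties DELTA ORIGINAL   (hypothetical helper name)
# Prints ORIGINAL with every key that also occurs in DELTA replaced by the
# DELTA line. The order of ORIGINAL is preserved.
merge_properties() {
    awk -F= '
        # Delta file: remember the full line for every key, in first-seen order.
        # Comment lines and lines without "=" are ignored.
        FILENAME == ARGV[1] {
            if ($0 !~ /^[ \t]*#/ && index($0, "=") > 0) {
                if (!($1 in delta)) order[++n] = $1
                delta[$1] = $0
            }
            next
        }
        # Original file: a key present in the delta file is replaced.
        ($1 in delta) { print delta[$1]; seen[$1] = 1; next }
        # Every other line (unchanged keys, comments, blanks) is copied as is.
        { print }
        # Assumption: keys found only in the delta file are appended at the end.
        END {
            for (i = 1; i <= n; i++)
                if (!(order[i] in seen)) print delta[order[i]]
        }
    ' "$1" "$2"
}

# merge_folders DELTA_DIR ORIGINAL_DIR   (hypothetical helper name)
# Merges every *.properties file that exists in both folders.
merge_folders() {
    for delta_file in "$1"/*.properties; do
        [ -f "$delta_file" ] || continue
        orig_file=$2/${delta_file##*/}
        [ -f "$orig_file" ] || continue   # skip files missing from the original folder
        # Write to a temporary file first so a failure never truncates the original.
        merge_properties "$delta_file" "$orig_file" > "$orig_file.tmp" &&
            mv "$orig_file.tmp" "$orig_file"
    done
}
```

Calling merge_folders Delta_Folder Original_Folder would then read each pair of files exactly once, instead of rescanning the original file for every delta line.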
Prince John Wesley
Gilles Quénot
0

I think your two main options are:

  1. Completely reimplement this in a more featureful language, like Perl.
  2. While reading the delta file, build up a sed script. For each line of the delta file, you want a sed instruction similar to:

    s/account.org.com.email=.*$/account.org.com.email=value_from_delta_file/g

That way you don't loop through the original files a bunch of extra times. Don't forget to escape &, / and \ in the replacement text, as mentioned in this answer.
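
As a rough illustration of the second option, the sketch below builds such a sed script from a delta file. The function name build_sed_script is hypothetical. Beyond what is described above, it makes two choices of its own: it anchors each pattern with ^, and it escapes the regex metacharacters in the key (such as the dots) as well as the &, / and \ in the replacement.

```shell
#!/bin/sh
# build_sed_script DELTA   (hypothetical helper name)
# Prints one "s/^key=.*$/key=value/" instruction per key=value line of DELTA.
build_sed_script() {
    while IFS= read -r line; do
        case $line in
            *=*) ;;            # only key=value lines produce an instruction
            *) continue ;;
        esac
        key=${line%%=*}        # text before the first "="
        value=${line#*=}       # text after the first "="
        # Escape characters that are special in a basic regular expression,
        # plus the "/" used as the s-command delimiter.
        key_re=$(printf '%s\n' "$key" | sed 's|[\\.*^$/[]|\\&|g')
        # Escape characters that are special in the replacement text.
        key_rep=$(printf '%s\n' "$key" | sed 's|[\\&/]|\\&|g')
        value_rep=$(printf '%s\n' "$value" | sed 's|[\\&/]|\\&|g')
        printf 's/^%s=.*$/%s=%s/\n' "$key_re" "$key_rep" "$value_rep"
    done < "$1"
}
```

The generated script can be saved and applied in a single pass, for example build_sed_script Delta_Folder/1.properties > merge.sed followed by sed -f merge.sed Original_Folder/1.properties. Note that this approach only rewrites keys that already exist in the original file; keys that appear only in the delta file are not added.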

Community
John Watts
  • Thanks for the suggestion; the second option looks like a good one to start with. – kumar Jun 29 '12 at 15:33
  • In fact you can use sed to generate a sed script from delta file. `sed 's/^\([^=]*\)=\(.*\)/s#\1=.*#\1=\2#/' new_file | sed -f - old_file` (looks horrible but works for me) – aragaer Apr 30 '13 at 14:15
0

Is using a database at all an option here?

Then you would only have to write code for extracting data from the Delta files (assuming that can't be replaced by a database connection).

It just seems like this is going to keep getting more complicated and slower as time goes on.