
I am trying to write a script that replaces each word in a file with a corresponding word taken from a list file.

phylofile (the file to be modified) is:

(((swallowtail,noctuid):90,pyraloid):74,crambine)

namefile (the list of mappings from old to new words) is:

crambine orocrambus
swallowtail papilio
noctuid catocala

The output should be:

(((papilio,catocala):90,pyraloid):74,orocrambus)

I hope it is clearer like that.

I wrote the following script:

echo -n "Enter the path to the file where names should be changed: "
read phylofile
echo -n "Enter the path to the file containing the string searched and the replacing string: "
read namefile

while read var
do
    searchstring=`echo "$var"|awk -F= '{print $1}'`
    replacestring=`echo "$var"|awk -F= '{print $2}'`
    sed "s/$searchstring/$replacestring/g" $phylofile > outputfile
done < $namefile

I get an error message (in French) saying that there is no regular expression in the sed command.

I would be really thankful if you could help.

Ed Morton
jibbah
  • Indeed, Cyrus is right. You should learn how to format source code in the question box. However, since you are new, I did it for you this time. Back to the problem, you might have a look here: http://stackoverflow.com/questions/29613304/is-it-possible-to-escape-regex-metacharacters-reliably-with-sed – hek2mgl Sep 23 '15 at 22:16
  • you can do everything in sed, why use awk? – midori Sep 23 '15 at 22:27
  • Wrong question. Other than some trivial operations, you can and should do all UNIX text manipulation using awk so why use shell and sed? – Ed Morton Sep 24 '15 at 03:22
  • Thank you for these answers. Yes, I am pretty sure I can do it with sed, but I couldn't find out how. – jibbah Sep 24 '15 at 06:49
  • sed is for simple substitutions on individual lines, which this is not, so you can't do it in sed. You CAN do it in a mix of shell and sed but that's VERY hard to code robustly (e.g. your `while read var` will in general corrupt what you're reading - you need to add additional constructs before `var` is guaranteed to contain the value that existed in the file) because it's the completely wrong approach as shell is for manipulating files and processes and sequencing calls to tools. The guys who invented shell and sed also invented awk to do general purpose text manipulation like this so use it. – Ed Morton Sep 24 '15 at 15:09
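The remark above about `while read var` corrupting its input can be illustrated with a small sketch. It assumes a POSIX shell; the sample line and the variable names `line`, `plain` and `safe` are made up for the demonstration. A bare `read` strips leading and trailing whitespace and consumes backslashes, while `IFS= read -r` keeps the line as it was in the file.

```shell
# Hypothetical input line with leading/trailing blanks and a literal backslash.
line='  a\tb  '

# Plain read: whitespace is trimmed and the backslash is swallowed.
plain=$(printf '%s\n' "$line" | { read var; printf '%s' "[$var]"; })

# IFS= read -r: the line is stored unchanged.
safe=$(printf '%s\n' "$line" | { IFS= read -r var; printf '%s' "[$var]"; })

printf '%s\n' "plain: $plain" "safe:  $safe"
```

This only covers the reading step; it does not make the shell-plus-sed approach robust against regex metacharacters in the names.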

1 Answer

Something like this is all you need:

echo -n "Enter the path to the file where names should be changed: "
read phylofile
echo -n "Enter the path to the file containing the string searched and the replacing string: "
read namefile

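awk '
# First file (namefile): map each old name, wrapped in word boundaries, to its new name.
NR==FNR{ map["\\<"$1"\\>"]=$2; next }
# Second file (phylofile): apply every mapping to the current line, then print it.
{
    for (old in map) {
        new = map[old]
        gsub(old,new)
    }
    print
}
' "$namefile" "$phylofile"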

but it's hard to say exactly what you need without some sample input and expected output.

The above requires GNU awk (gawk): the `\<` and `\>` word-boundary operators are GNU extensions that other awks may not support.
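If gawk is not available, a possible alternative is to look up whole tokens instead of relying on word-boundary operators. The following is only a sketch, not the script from this answer: it assumes that names consist solely of letters, digits and underscores, and the temporary directory and the variable names `out`, `rest`, `word` and `repl` are invented for the example. Because the names are used as array keys rather than as regular expressions, metacharacters in them are not interpreted, and a replaced name is never replaced a second time.

```shell
# Work in a throwaway directory with the sample data from the question.
tmp=$(mktemp -d)
printf '%s\n' 'crambine orocrambus' 'swallowtail papilio' 'noctuid catocala' > "$tmp/namefile"
printf '%s\n' '(((swallowtail,noctuid):90,pyraloid):74,crambine)' > "$tmp/phylofile"

awk '
# First file: remember old name -> new name.
NR==FNR { map[$1] = $2; next }
# Second file: walk the line one word-like token at a time.
{
    out = ""
    rest = $0
    while (match(rest, /[A-Za-z0-9_]+/)) {
        word = substr(rest, RSTART, RLENGTH)
        repl = (word in map) ? map[word] : word
        # Keep the text before the token, then the (possibly replaced) token.
        out = out substr(rest, 1, RSTART - 1) repl
        rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
}
' "$tmp/namefile" "$tmp/phylofile" > "$tmp/outputfile"

cat "$tmp/outputfile"
```

On the sample files this prints `(((papilio,catocala):90,pyraloid):74,orocrambus)`.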

Given your newly posted sample input files:

$ cat namefile
crambine orocrambus
swallowtail papilio
noctuid catocala

$ cat phylofile
(((swallowtail,noctuid):90,pyraloid):74,crambine)

Here is the awk script running on them:

$ awk '
NR==FNR{ map["\\<"$1"\\>"]=$2; next }
{
    for (old in map) {
        new = map[old]
        gsub(old,new)
    }
    print
}
' namefile phylofile
(((papilio,catocala):90,pyraloid):74,orocrambus)
Ed Morton
  • Thank you, but this script doesn't give what I need. In the file: (((swallowtail,noctuid):90,pyraloid):74,crambine) I want to replace the names as follows: crambine orocrambus swallowtail papilio noctuid catocala The resulting file would be (((papilio,catocala):90,pyraloid):74,orocrambus) I hope it is more clear like that – jibbah Sep 24 '15 at 08:59
  • OK, and in what way does that script not `give what you need`? Wrong output, no output, core dump, error message, something else? Don't try to put formatted text in comments; edit your question to include representative input (contents of namefile and phylofile) and the expected output. – Ed Morton Sep 24 '15 at 12:08
  • the script seems to work (no error message), but it does not return any output. – jibbah Sep 24 '15 at 14:21
  • Oh, yes it was missing a `print` statement which I've added now. As you see, you should never just say a proposed solution "doesnt work" - always be clear and specific about in what way it doesn't work so we can help you figure out how to fix it. – Ed Morton Sep 24 '15 at 15:04
  • Thank you Ed. I now get an output, but only the first name of the list (crambine) has been changed. Maybe it is because of the hard return (I edited my question), but without the hard return it still returns a file where only the first name has been replaced – jibbah Sep 24 '15 at 15:43
  • I've edited my answer to SHOW the tool producing the expected output from your posted sample input. Are you sure you're running gawk? Execute `awk --version` to verify. I've no idea what `the hard return` means. – Ed Morton Sep 24 '15 at 15:48
  • Well... I'm ashamed to admit that gawk wasn't installed (I had used awk in other scripts and it was working, so I didn't know I had to install gawk). It is now working! Thank you for your time! – jibbah Sep 24 '15 at 16:08
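The `awk --version` check mentioned in the comments can be wrapped in a small test before running a gawk-specific script. This is a sketch under assumptions: it relies on gawk printing a banner containing `GNU Awk`, treats anything else (including awks that reject `--version`) as "other", and the variable name `awk_flavor` is made up.

```shell
# Classify the awk found on PATH; "other" covers mawk, BusyBox awk, BWK awk, and so on.
# stdin is redirected so an awk that ignores --version cannot wait for input.
if awk --version </dev/null 2>/dev/null | grep -q 'GNU Awk'; then
    awk_flavor=gawk
else
    awk_flavor=other
fi

echo "awk flavor: $awk_flavor"
```

If the result is `other`, either install gawk and call it explicitly as `gawk`, or avoid the `\<` and `\>` operators.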