
Is there a simple way to do a find and replace of values in a file, using another file as input for the values that need to be re-labelled? E.g. I have a tab-delimited file1 with two columns:

a|1|b|C|:1-10(-)      A1
a|2|b|E|:2-11(+)      A2
a_b|3|b|C|:300-302(-)      A3
a|5|b|C|:4-60(+)      A1
a|7|b|D|:71-72(-)      A11

where column 1 (everything before the tab) contains the original name, and column 2 contains the new name. I would like to apply this mapping to file2, replacing every occurrence of an original name with its new name. I know that using

sed -i -e 's/a|1|b|C|:1-10(-)/A1/g' file2.txt

will do this one line at a time, but is there a way to just feed in file1 such that all the values in file2 will get re-labelled at once?

gizmo
  • This is a very common FAQ, though I was unable to quickly find a suitable duplicate. `sed 's%\([^\t]*\)\t\([^\t]*\)%s_\1_\2_%' file1 | sed -i -f - file2.txt` will create a simple `sed` script out of the first file, and apply it to the second. (Using `sed` to write a `sed` script might feel mystical at first, but you can figure it out.) You may need to tweak the syntax slightly if your `sed` does not accept `\t` for tab, and/or doesn't want the grouping parentheses to be backslashed. – tripleee Jan 11 '16 at 07:53
  • The tab and backslashes seem to be fine, but I'm getting the following error: `sed: 1: "-": invalid command code -`. Is that a MacOS error? – gizmo Jan 11 '16 at 08:18
  • `-i` requires a filename on OSX. – 123 Jan 11 '16 at 08:30
  • Hmm...changing `-i` to `-i.bu` changes the error to: `sed: -: No such file or directory` – gizmo Jan 11 '16 at 08:33
  • Running just the first half `sed 's%\([^\t]*\)\t\([^\t]*\)%s_\1_\2_%' file1.txt` outputs the contents of file1.txt, as if I'd run `cat file1.txt`. Is that what it should be doing? – gizmo Jan 11 '16 at 08:38
  • The first column e.g. `a|1|b|C|:1-10(-)` is to be interpreted as plain text, or regex? – anishsane Jan 11 '16 at 09:07
  • No it should transform the (from)(tab)(to) into a `sed` substitution command; `s_(from)_(to)_` – tripleee Jan 11 '16 at 09:25
  • Apparently your `sed` cannot read a script from standard input. Store the output in a temporary file, and run `sed` with that. `sed s/from/to/ file1.txt >/tmp/ick; sed -i.bu -f/tmp/ick file2.txt` – tripleee Jan 11 '16 at 09:26
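
The approach from these comments (generate a `sed` script from file1, store it in a temporary file, then run it on file2) could be sketched roughly as below. This is only an illustration under some assumptions: `make_sed_script`, `relabel.sed`, and the temporary directory are hypothetical names, not from the thread. The columns are split in the shell rather than with `\t` inside `sed`, since some `sed` implementations do not treat `\t` as a tab, and the result goes to standard output instead of relying on `-i`.

```shell
#!/bin/sh
# Sketch: turn a tab-delimited mapping file (old<TAB>new) into a sed script.
# make_sed_script is a hypothetical helper name, not something from the thread.
make_sed_script() {
    tab=$(printf '\t')
    while IFS="$tab" read -r from to; do
        [ -n "$from" ] || continue   # skip blank lines
        # Escape characters that are special in a basic regular expression,
        # plus the / used as the delimiter of the generated s command.
        from_esc=$(printf '%s\n' "$from" | sed 's|[]\/$*.^[]|\\&|g')
        # Escape characters that are special in the replacement text.
        to_esc=$(printf '%s\n' "$to" | sed 's|[\&/]|\\&|g')
        printf 's/%s/%s/g\n' "$from_esc" "$to_esc"
    done < "$1"
}

# Demo data modelled on the question (two of the five mappings).
workdir=$(mktemp -d)
printf '%s\t%s\n' 'a|1|b|C|:1-10(-)' 'A1' 'a|7|b|D|:71-72(-)' 'A11' > "$workdir/file1"
printf '%s\n' '(a|1|b|C|:1-10(-)),(a|7|b|D|:71-72(-))' > "$workdir/file2"

# Write the generated script to a file, then apply it with -f.
make_sed_script "$workdir/file1" > "$workdir/relabel.sed"
result=$(sed -f "$workdir/relabel.sed" "$workdir/file2")
printf '%s\n' "$result"   # prints (A1),(A11)
```

One caveat with this kind of script: the substitutions run in file order, so if one original name is a prefix of another, the longer name probably needs to come first in file1.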

1 Answer

It's a bit hacky, but I do this:

C0243321@IKHCPKISBN0084S ~/examples/perl
$ cat file1
xx1     zz1
xx2     zz2

C0243321@IKHCPKISBN0084S ~/examples/perl
$ cat file2
xx1, xx2

C0243321@IKHCPKISBN0084S ~/examples/perl
$ perl -e 'undef $/;$file1=<ARGV>;$file2=<ARGV>;@lines=split(/\n/,$file1);for(@lines){@fields=split(/\t/);$file2=~s/$fields[0]/$fields[1]/g;}print$file2' file1 file2
zz1, zz2
Essex Boy
  • Thanks, using your example it works, but when I try to use my file1 and file2 I get the error: `Quantifier follows nothing in regex; marked by <-- HERE in m/a|2|b|E|:2-11(+ <-- HERE )/ at -e line 1, <> chunk 2.` :( The file2 I'm using is `(a|1|b|C|:1-10(-)),(a|7|b|D|:71-72(-))` – gizmo Jan 11 '16 at 08:59
  • The plus sign (and the parentheses) are metacharacters in Perl's regex dialect, and will need to be backslashed in order for this to work. But as already suggested, finding a good duplicate with multiple answers is probably more fruitful than writing yet another quick and dirty answer and then having to debug it. – tripleee Jan 11 '16 at 09:28
  • You can put a quotemeta function around the regex after the split that will automatically escape anything. – Essex Boy Jan 14 '16 at 07:11