I am working with a long file called test that looks as follows:

AHAP   USA|NIS00333|+NULL|NISGOOGLE|NIS00005|*binary|NISCAR
KJJLIL123124%|NIS00160|+NULL|NISFACEBOOK|NIS00006|*binary|NISBUR
ASFASS9992|NIS00164|+NULL|NISTABLE|NIS00008|*binary|NISFANCY

I need to replace the string "NIS" with "NIX", but only in the second column. The data is delimited by the pipe character "|" and has seven columns in total; the replacement should apply to the second one only.

I tried:

$ sed s/NIS/NIX/g test
AHAP   USA|NIX00333|+NULL|NIXGOOGLE|NIX00005|*binary|NIXCAR
KJJLIL123124%|NIX00160|+NULL|NIXFACEBOOK|NIX00006|*binary|NIXBUR
ASFASS9992|NIX00164|+NULL|NIXTABLE|NIX00008|*binary|NIXFANCY

But this changes NIS to NIX in every column that contains it. I only want to affect the second column; my desired output would be:

AHAP   USA|NIX00333|+NULL|NISGOOGLE|NIS00005|*binary|NISCAR
KJJLIL123124%|NIX00160|+NULL|NISFACEBOOK|NIS00006|*binary|NISBUR
ASFASS9992|NIX00164|+NULL|NISTABLE|NIS00008|*binary|NISFANCY

I would really appreciate help with this issue. Thanks anyhow.

Benjamin W.
neo33

2 Answers


For column-oriented problems, use awk, which gives you better, native control over fields:

$ awk 'BEGIN {FS=OFS="|"}{gsub("NIS","NIX",$2)}1' file
AHAP   USA|NIX00333|+NULL|NISGOOGLE|NIS00005|*binary|NISCAR
KJJLIL123124%|NIX00160|+NULL|NISFACEBOOK|NIS00006|*binary|NISBUR
ASFASS9992|NIX00164|+NULL|NISTABLE|NIS00008|*binary|NISFANCY

This performs a gsub() replacement on the second |-delimited field. Once the replacement is done, 1 triggers awk's default action, which is to print $0, the full (updated) record. When $2 is modified, awk rebuilds $0 using OFS, which is why OFS is set to | as well.
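
To illustrate the difference between replacing every match and only the first match inside one field, here is a minimal sketch. The sample record is made up for illustration (it is not from the question's file), and sub() is shown only for contrast with the gsub() used above:

```shell
# Hypothetical sample record: NIS appears twice in field 2 and once in field 3.
line='A|NISNIS01|NISB'

# gsub() replaces every match within the second field.
all=$(printf '%s\n' "$line" | awk 'BEGIN {FS=OFS="|"} {gsub("NIS","NIX",$2)} 1')

# sub() replaces only the first match within the second field.
first=$(printf '%s\n' "$line" | awk 'BEGIN {FS=OFS="|"} {sub("NIS","NIX",$2)} 1')

echo "$all"     # A|NIXNIX01|NISB
echo "$first"   # A|NIXNIS01|NISB
```

In both cases the third field keeps its NIS, because the function is given $2 as its target.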

fedorqui
  • Thanks, I really appreciate the suggestion; I will consider using awk instead of sed to handle columns. One more question: with sed you can use sed -i to make the change in the same file. Do you know what the awk equivalent of in-place editing would be? – neo33 Aug 16 '16 at 15:13
  • @neo33 you can use `-i inplace` from GNU awk 4.1.0. Otherwise, the trick is always `awk '...' file > tmp_file && mv tmp_file file`. All of this is described in [awk save modifications inplace](http://stackoverflow.com/a/16529730/1983854). – fedorqui Aug 16 '16 at 15:16
  • Yes, I see. Thanks a lot for the support, this was really helpful. – neo33 Aug 16 '16 at 15:20
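
The temporary-file approach mentioned in the comments could look roughly like the sketch below. It assumes mktemp is available, and the scratch file and variable names are hypothetical, chosen so the example does not touch real data:

```shell
# Hypothetical scratch file holding one line of the sample data.
file=$(mktemp)
printf '%s\n' 'ASFASS9992|NIS00164|+NULL|NISTABLE|NIS00008|*binary|NISFANCY' > "$file"

# Write the result to a temporary file, then replace the original
# only if awk exited successfully.
tmp=$(mktemp)
awk 'BEGIN {FS=OFS="|"} {gsub("NIS","NIX",$2)} 1' "$file" > "$tmp" && mv "$tmp" "$file"

result=$(cat "$file")
echo "$result"   # ASFASS9992|NIX00164|+NULL|NISTABLE|NIS00008|*binary|NISFANCY
rm -f "$file"
```

Because mv swaps in a new file, the original file's permissions and ownership are not carried over; that may matter for files other than scratch data.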

A sed solution:

$ sed 's/^\([^|]*|[^|]*\)NIS/\1NIX/' infile 
AHAP   USA|NIX00333|+NULL|NISGOOGLE|NIS00005|*binary|NISCAR
KJJLIL123124%|NIX00160|+NULL|NISFACEBOOK|NIS00006|*binary|NISBUR
ASFASS9992|NIX00164|+NULL|NISTABLE|NIS00008|*binary|NISFANCY

The regex, split up:

^          # Start of line anchor
\(         # Start of capture group
    [^|]*  # Characters other than pipe - first column
    |      # Column separator between first and second column
    [^|]*  # Characters other than pipe - first part of second column
\)         # End of capture group
NIS        # What we actually want to replace

This has a limitation in that it replaces only one occurrence of NIS in the second column (the last one, because [^|]* is greedy). The example input doesn't have any more, but if it did, we could use conditional branching to repeat the substitution as long as it changes the pattern space:

sed '
:a
s/^\([^|]*|[^|]*\)NIS/\1NIX/
ta' infile

:a is the label to jump to, and ta is the conditional branching command ("jump to :a if the substitution did something").

As a one-liner:

sed ':a;s/^\([^|]*|[^|]*\)NIS/\1NIX/;ta' infile

BSD sed (as found on macOS) would complain about the label not being followed by a newline, so we could rewrite it as:

sed -e ':a' -e 's/^\([^|]*|[^|]*\)NIS/\1NIX/;ta' infile
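
To see what the loop adds, here is a small sketch that runs the substitution once and then with the :a / ta loop. The input line is made up for illustration, with NIS twice in the second column and once in the third:

```shell
# Hypothetical sample record, not taken from the question's file.
line='A|NISNIS01|NISB'

# Single pass: only one NIS in column 2 is changed.
once=$(printf '%s\n' "$line" | sed 's/^\([^|]*|[^|]*\)NIS/\1NIX/')

# Loop: repeat the substitution until it no longer matches.
looped=$(printf '%s\n' "$line" | sed -e ':a' -e 's/^\([^|]*|[^|]*\)NIS/\1NIX/;ta')

echo "$once"     # A|NISNIX01|NISB
echo "$looped"   # A|NIXNIX01|NISB
```

The loop terminates because each pass removes one NIS from the second column, and [^|]* can never match across the next pipe into the third column.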
Benjamin W.
  • Wow, this is really helpful. My first approach was also to think of sed, and this is a nice way to use regular expressions, although for this particular task it is a little easier to use awk to handle columns. – neo33 Aug 16 '16 at 17:20
  • @neo33 Let me just say it myself before Ed comes along: in 99.9% of all cases, awk is faster, more powerful and more concise than sed. Some say sed should be ignored as it has been superseded by awk; some (like me) like to dabble with it for mostly nostalgic reasons ;) – Benjamin W. Aug 16 '16 at 17:31
  • Yes, I understand. Maybe it is because sed is a little more popular than awk, but it is definitely a good idea to learn more about awk. Thanks for the suggestion, I appreciate the support. – neo33 Aug 16 '16 at 19:06