4

I'm trying to format the output of a cut/paste pipeline, but the sed step is not working.

file.txt

Apple
Banana
Apple
Banana
Orange
Apple
Orange

code.sh

cut -f2 file.txt | sort | uniq | sed 's/^\|$/#/g'| paste -sd,\& -

Expected output (this is what I get on Ubuntu)

#Apple#,#Banana#&#Orange#

Actual output on macOS

Apple,Banana&Orange

Note: The code works on Ubuntu, but not on macOS.

4 Answers

2

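This can be done in a single GNU awk (gawk) command:

awk '!seen[$1]++{} END {
    # gawk-only: visit array indices in ascending string order
    PROCINFO["sorted_in"]="@ind_str_asc"
    # the first separator is ",", every later one is "&"
    for (i in seen)
      s = s (s == "" ? "" : (++j==1?",":"&")) "#" i "#"
    print s
}' file.txt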

#Apple#,#Banana#&#Orange#

On macOS I have GNU awk installed via Homebrew.
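
Where gawk is not installed, a similar result can probably be had from any POSIX awk by letting sort -u handle the sorting and deduplication first. This is only a sketch, not the answer's method: the `fruits` helper and the `joined` variable are made up for illustration, and the separators alternate the way `paste -sd ',&'` does, which differs from the gawk script above once there are more than three unique items.

```shell
# Sample data from the question, generated inline so the snippet is self-contained.
# (fruits is a hypothetical helper, not part of the original post.)
fruits() { printf '%s\n' Apple Banana Apple Banana Orange Apple Orange; }

# sort -u sorts and removes duplicates; awk then joins the lines,
# putting "," before even-numbered lines and "&" before odd-numbered ones.
joined=$(fruits | sort -u | awk '
  { s = s (NR == 1 ? "" : (NR % 2 == 0 ? "," : "&")) "#" $0 "#" }
  END { print s }')

echo "$joined"   # prints #Apple#,#Banana#&#Orange#
```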

anubhava
1

As far as I know, BSD/macOS sed doesn't support \| in basic regular expressions. See "sed not giving me correct substitute operation for newline with Mac - differences between GNU sed and BSD / OSX sed" for details.

As an alternative, you can use ERE instead of BRE. This works on Linux, but apparently it still doesn't work on Mac (see also: "MacOS sed: match either beginning or end").

$ echo 'Apple' | sed -E 's/^|$/#/g'
#Apple#

# workaround for Mac
$ echo 'Apple' | sed -e 's/^/#/' -e 's/$/#/'
#Apple#
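
Plugged into the pipeline from the question, the two-expression workaround might look like the sketch below. The inline `printf` stands in for file.txt and the `result` variable is only there for illustration; both are assumptions, not part of the original post.

```shell
# Stand-in for file.txt: the fruit list from the question.
result=$(printf '%s\n' Apple Banana Apple Banana Orange Apple Orange |
  sort | uniq |
  sed -e 's/^/#/' -e 's/$/#/' |   # one expression per anchor, no \| needed
  paste -sd ',&' -)               # join lines, cycling through "," and "&"

echo "$result"   # prints #Apple#,#Banana#&#Orange#
```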

Instead of sort+uniq+sed, you can also use awk. Note that the awk solution shown here removes duplicates while preserving the original order; it doesn't sort the input:

$ awk '!seen[$0]++{print "#" $0 "#"}' ip.txt
#Apple#
#Banana#
#Orange#

Change $0 to $2 if you want only the second field, as your use of cut suggests.
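
The difference between first-seen order and sorted order only shows up when the input is not already sorted. A small sketch, using a made-up input order (the `input` helper and both variable names are hypothetical):

```shell
# Hypothetical input in which Orange appears before Apple.
input() { printf '%s\n' Orange Apple Orange Banana; }

# awk alone: duplicates dropped, first-seen order kept.
first_seen=$(input | awk '!seen[$0]++ { print "#" $0 "#" }' | paste -sd ',&' -)

# sort -u first: duplicates dropped and lines sorted.
sorted=$(input | sort -u | awk '{ print "#" $0 "#" }' | paste -sd ',&' -)

echo "$first_seen"   # prints #Orange#,#Apple#&#Banana#
echo "$sorted"       # prints #Apple#,#Banana#&#Orange#
```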

Sundeep
1

As mentioned elsewhere, BSD sed doesn't support \|. Instead of replacing ^ and $ separately, you can match the whole line and wrap it in #, using & in the replacement to refer to the matched text.

sort -u file.txt | sed 's/.*/#&#/' | paste -sd,'&' -
Barmar
  • Thanks for `sort -u` but does it work with ubuntu? –  Feb 20 '21 at 06:47
  • It should work everywhere, it's a standard `sort` option that long predates all the different flavors of Unix. – Barmar Feb 20 '21 at 06:50
  • I realised that your solution works properly, but tell me one thing: why did you replace `$` with `&` in `s/.*/#/`? With `$` it doesn't work. –  Feb 20 '21 at 07:08
  • `&` is replaced with whatever the regexp matched, which is the whole line. – Barmar Feb 20 '21 at 07:14
  • `$` only has meaning in the regexp, not the replacement. – Barmar Feb 20 '21 at 07:16
  • Right, `-u` is a required option for POSIX `sort`, see https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html, and has been present in every sort I remember using for the past 40+ years. – Ed Morton Feb 20 '21 at 18:07
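
The point made in the comments about `&` and `$` in the replacement can be checked directly. A minimal sketch (the variable names are only for illustration):

```shell
# & in the replacement stands for the text the regexp matched.
wrapped=$(echo 'Apple' | sed 's/.*/#&#/')

# $ has no special meaning in the replacement, so it is copied literally.
dollar=$(echo 'Apple' | sed 's/.*/#$#/')

# A backslash makes & literal when an actual ampersand is wanted.
escaped=$(echo 'Apple' | sed 's/.*/#\&#/')

echo "$wrapped"   # prints #Apple#
echo "$dollar"    # prints #$#
echo "$escaped"   # prints #&#
```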
0

A simple way to do it using the sed command:

sed -E 's/[[:alnum:]]+/#&#/'
  • the -E option enables POSIX ERE (extended regular expressions)
  • [[:alnum:]]+ matches one or more alphanumeric characters; in ASCII, [[:alnum:]] is equivalent to [A-Za-z0-9], and the plus (+) means "one or more"
  • the & symbol in the replacement refers to the text the pattern matched, which is then surrounded with #
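
One thing to keep in mind: without the g flag, the pattern wraps only the first alphanumeric run on each line, so it behaves differently from a whole-line match when a line contains spaces or punctuation. A small sketch with a made-up two-word line ('Red Apple' is not in the original file):

```shell
# Only the first run of alphanumeric characters is wrapped.
partial=$(echo 'Red Apple' | sed -E 's/[[:alnum:]]+/#&#/')

# Matching the whole line wraps everything, spaces included.
whole=$(echo 'Red Apple' | sed 's/.*/#&#/')

echo "$partial"   # prints #Red# Apple
echo "$whole"     # prints #Red Apple#
```

For the single-word lines in the question, both forms give the same result.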
Ayoub_Prog