16

I have a file from which I want to retrieve the first column and add a comma between the values.

Example:

AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad

to obtain

AAAA,BBBB,CCCC

I decided to use awk, so I did:

awk '{ $1=$1","; print $1 }'

Problem is: this adds a comma to the last value as well, which is not what I want, and I also get a space between values.

How do I remove the comma on the last element, and how do I remove the space? Spent 20 minutes looking at the manual without luck.
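
For reference, here is one plausible reconstruction of both symptoms (hypothetical; it assumes the values were joined onto one line by setting ORS to a space):

$ awk -v ORS=' ' '{ $1=$1","; print $1 }' file
AAAA, BBBB, CCCC,

Both a trailing comma and a trailing space are left after the last value.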

  • you might find this useful: http://stackoverflow.com/questions/8714355/bash-turning-multi-line-string-into-single-comma-separated – jbub Jun 10 '14 at 10:32
  • No. Adding a comma with awk and then piping it to sed to remove the comma is a ridiculous approach. Just don't add the comma. – Ed Morton Jun 10 '14 at 13:30
  • Possible duplicate of [Bash turning multi-line string into single comma-separated](https://stackoverflow.com/questions/8714355/bash-turning-multi-line-string-into-single-comma-separated) – kvantour Oct 17 '18 at 12:59
  • Possibly related, for anyone who finds this helpful: if you set `-F,` to make your input field separator a comma (or whatever else you like, mutatis mutandis), a `BEGIN{OFS=FS}` block will set the output field separator to the same. – Marcel Besixdouze Sep 02 '22 at 14:31
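
A minimal sketch of the FS/OFS pairing mentioned in the last comment (illustrative input, not from the question):

$ echo '1,2,3' | awk -F, 'BEGIN{OFS=FS} {print $1, $3}'
1,3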

11 Answers

22
$ awk '{printf "%s%s",sep,$1; sep=","} END{print ""}' file
AAAA,BBBB,CCCC

or if you prefer:

$ awk '{printf "%s%s",(NR>1?",":""),$1} END{print ""}' file
AAAA,BBBB,CCCC

or if you like golf and don't mind it being inefficient for large files:

$ awk '{r=r s $1;s=","} END{print r}' file
AAAA,BBBB,CCCC
Ed Morton
  • Wow, this is very bright: defining `sep` after the first `printf` makes it not appear the first time. – fedorqui Jun 10 '14 at 14:05
  • Some golfing `awk '{printf (NR>1?",":"")"%s",$1} END{print ""}'` – Jotne Jun 10 '14 at 14:09
  • If you want to play golf then `awk '{a=a s$1;s=","} END{print a}'`, but IMHO it's getting less clear and it's less efficient for large files. – Ed Morton Jun 10 '14 at 14:18
17
awk '{print $1","$2","$3}' file_name

This is the shortest I know of.

swapnil shashank
3

Why make it complicated? :) (as long as the file is not too large)

awk '{a=NR==1?$1:a","$1} END {print a}' file
AAAA,BBBB,CCCC

For better portability:

awk '{a=(NR>1?a",":"")$1} END {print a}' file
Jotne
  • To answer your question: it's probably fine, but for large files that'd be significantly slower than printing each line as you go, due to the string concatenation operation being slow and the size of the string you'd be building up. You also have a bit of redundancy there by specifying `$1` twice, and the non-parenthesized ternary operator might fail on some awks. – Ed Morton Jun 10 '14 at 13:45
  • @EdMorton Agree, but OP does not say anything about file size. – Jotne Jun 10 '14 at 13:48
  • Right, hence "it's probably fine". FWIW `'{a=a (NR>1?",":"") $1} END {print a}'` would resolve the redundancy and portability issues. – Ed Morton Jun 10 '14 at 13:50
  • @EdMorton I did see that I could take `$1` out from the test, but then I'd need parentheses (which should be used anyway), and adding `""` makes it longer. – Jotne Jun 10 '14 at 13:53
2

You can do this:

awk 'a++{printf ","}{printf "%s", $1}' file

a++ is interpreted as a condition. In the first row its value is 0, so the comma is not added.

EDIT: If you want a newline, you have to add END{printf "\n"}. If you have problems reading in the file, you can also try:

cat file | awk 'a++{printf ","}{printf "%s", $1}'
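
A variant of the same idea (sketch only) that leans on the built-in record counter NR instead of a separate variable, with the trailing newline added in an END block:

awk 'NR>1{printf ","} {printf "%s", $1} END{print ""}' file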
Nils-o-mat
  • Consider the difference between `a` and `NR`. Adding a newline is `print ""`. Why would you have problems reading a file? – Ed Morton Jun 10 '14 at 13:36
  • I don't get what you mean by your first sentence. But maybe it would be more elegant using NR as the condition. And thanks for the newline hint. – Nils-o-mat Jun 11 '14 at 12:19
  • Right, it's just that you don't need a separate variable to count record numbers since `NR` is already provided. – Ed Morton Jun 11 '14 at 13:46
2
awk 'NR==1{printf "%s",$1;next;}{printf "%s%s",",",$1;}' input.txt

It says: if it is the first line, only print the first field; for the other lines, first print a comma, then print the first field.

Output:

AAAA,BBBB,CCCC
a5hk
2

In this case, a simple cut and paste solution:

cut -d" " -f1 file | paste -s -d,
kvantour
1

In case somebody like me wants to use awk for cleaning up Docker images:

docker image ls | grep tag_name | awk '{print $1":"$2}'
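
As a purely hypothetical illustration, an image listed with repository nginx and tag 1.25 would come out as:

nginx:1.25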
Oleg Neumyvakin
1

Surprised that no one is using OFS (the output field separator). Here is probably the simplest solution that sticks with awk and works on Linux and Mac: use "-v OFS=," to output with a comma as the delimiter:

$ echo '1:2:3:4' | awk -F: -v OFS=, '{print $1, $2, $4, $3}'
1,2,4,3

It works for multi-character separators too:

$ echo '1:2:3:4' | awk -F: -v OFS=., '{print $1, $2, $4, $3}'
1.,2.,4.,3

HAltos
0

Using Perl

$ cat group_col.txt
AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad

$ perl -lane ' push(@x,$F[0]); END { print join(",",@x) } ' group_col.txt
AAAA,BBBB,CCCC

$
stack0114106
0

This can be very simple:

awk -F',' '{print $1","$1","$2","$3}' inputFile

where the input file contains lines like:

1,2,3
2,3,4

and so on.
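
For the first sample record, 1,2,3, this prints 1,1,2,3; note that the command above writes the first field twice.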

0

I used the following because it also lists the api-resource names, which is useful if you want to access them directly. I also use the label "application" to find specific apps in a namespace:

kubectl -n ops-tools get $(kubectl api-resources --no-headers=true --sort-by=name | awk '{printf "%s%s",sep,$1; sep=","}') -l app.kubernetes.io/instance=application
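
Here the inner awk builds a single comma-joined list of the resource names (the same sep trick as in the top answer above), which the outer kubectl get then queries in one call.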