16

I have a file from which I want to retrieve the first column and add a comma between the values.

Example:

AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad

to obtain

AAAA,BBBB,CCCC

I decided to use awk, so I did:

awk '{ $1=$1","; print $1 }'

Problem is: this adds a comma to the last value as well, which is not what I want, and I also get a space between values.

How do I remove the comma on the last element, and how do I remove the space? Spent 20 minutes looking at the manual without luck.
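
For reference, here is one plausible reconstruction of both symptoms (hypothetical; it assumes the values were joined onto one line by setting ORS to a space):

$ awk -v ORS=' ' '{ $1=$1","; print $1 }' file
AAAA, BBBB, CCCC,

Both a trailing comma and a trailing space are left after the last value.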

  • you might find this useful: http://stackoverflow.com/questions/8714355/bash-turning-multi-line-string-into-single-comma-separated – jbub Jun 10 '14 at 10:32
  • No. Adding a comma with awk and then piping it to sed to remove the comma is a ridiculous approach. Just don't add the comma. – Ed Morton Jun 10 '14 at 13:30
  • Possible duplicate of [Bash turning multi-line string into single comma-separated](https://stackoverflow.com/questions/8714355/bash-turning-multi-line-string-into-single-comma-separated) – kvantour Oct 17 '18 at 12:59
  • Possibly related, for anyone who finds this helpful: if you set `-F,` to make your input field separator a comma (or whatever else you like, mutatis mutandis), a `BEGIN{OFS=FS}` block will set the output field separator to the same. – Marcel Besixdouze Sep 02 '22 at 14:31
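
A minimal sketch of the FS/OFS pairing mentioned in the last comment (illustrative input, not from the question):

$ echo '1,2,3' | awk -F, 'BEGIN{OFS=FS} {print $1, $3}'
1,3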

11 Answers

22
$ awk '{printf "%s%s",sep,$1; sep=","} END{print ""}' file
AAAA,BBBB,CCCC

or if you prefer:

$ awk '{printf "%s%s",(NR>1?",":""),$1} END{print ""}' file
AAAA,BBBB,CCCC

or if you like golf and don't mind it being inefficient for large files:

$ awk '{r=r s $1;s=","} END{print r}' file
AAAA,BBBB,CCCC
Ed Morton
  • Wow, this is very bright: defining `sep` after the first `printf` makes it not appear the first time. – fedorqui Jun 10 '14 at 14:05
  • Some golfing `awk '{printf (NR>1?",":"")"%s",$1} END{print ""}'` – Jotne Jun 10 '14 at 14:09
  • If you want to play golf then `awk '{a=a s$1;s=","} END{print a}'`, but IMHO it's getting less clear and it's less efficient for large files. – Ed Morton Jun 10 '14 at 14:18
17
awk '{print $1","$2","$3}' file_name

This is the shortest I know of.

swapnil shashank
3

Why make it complicated? :) (as long as the file is not too large)

awk '{a=NR==1?$1:a","$1} END {print a}' file
AAAA,BBBB,CCCC

For better portability:

awk '{a=(NR>1?a",":"")$1} END {print a}' file
Jotne
  • To answer your question: it's probably fine, but for large files that'd be significantly slower than printing each line as you go, due to the string concatenation operation being slow and the size of the string you'd be building up. You also have a bit of redundancy there by specifying `$1` twice, and the non-parenthesized ternary operator might fail on some awks. – Ed Morton Jun 10 '14 at 13:45
  • @EdMorton Agree, but OP does not say anything about file size. – Jotne Jun 10 '14 at 13:48
  • Right, hence "it's probably fine". FWIW `'{a=a (NR>1?",":"") $1} END {print a}'` would resolve the redundancy and portability issues. – Ed Morton Jun 10 '14 at 13:50
  • @EdMorton I did see that I could take `$1` out from the test, but then I'd need parentheses (which should be used anyway), and adding `""` makes it longer. – Jotne Jun 10 '14 at 13:53
2

You can do this:

awk 'a++{printf ","}{printf "%s", $1}' file

a++ is interpreted as a condition. In the first row its value is 0, so the comma is not added.

EDIT: If you want a newline, you have to add END{printf "\n"}. If you have problems reading in the file, you can also try:

cat file | awk 'a++{printf ","}{printf "%s", $1}'
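
A variant of the same idea (sketch only) that leans on the built-in record counter NR instead of a separate variable, with the trailing newline added in an END block:

awk 'NR>1{printf ","} {printf "%s", $1} END{print ""}' file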
Nils-o-mat
  • Consider the difference between `a` and `NR`. Adding a newline is `print ""`. Why would you have problems reading a file? – Ed Morton Jun 10 '14 at 13:36
  • I don't get what you mean by your first sentence. But maybe it would be more elegant using NR as the condition. And thanks for the newline hint. – Nils-o-mat Jun 11 '14 at 12:19
  • Right, it's just that you don't need a separate variable to count record numbers since `NR` is already provided. – Ed Morton Jun 11 '14 at 13:46
2
awk 'NR==1{printf "%s",$1;next;}{printf "%s%s",",",$1;}' input.txt

It says: if it is the first line, only print the first field; for the other lines, first print a comma, then print the first field.

Output:

AAAA,BBBB,CCCC
a5hk
2

In this case, a simple cut and paste solution:

cut -d" " -f1 file | paste -s -d,
kvantour
1

In case somebody like me wants to use awk for cleaning up Docker images:

docker image ls | grep tag_name | awk '{print $1":"$2}'
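
As a purely hypothetical illustration, an image listed with repository nginx and tag 1.25 would come out as:

nginx:1.25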
Oleg Neumyvakin
1

Surprised that no one is using OFS (the output field separator). Here is probably the simplest solution that sticks with awk and works on Linux and Mac: use "-v OFS=," to output with a comma as the delimiter:

$ echo '1:2:3:4' | awk -F: -v OFS=, '{print $1, $2, $4, $3}'
1,2,4,3

It works for multi-character separators too:

$ echo '1:2:3:4' | awk -F: -v OFS=., '{print $1, $2, $4, $3}'
1.,2.,4.,3

HAltos
0

Using Perl

$ cat group_col.txt
AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad

$ perl -lane ' push(@x,$F[0]); END { print join(",",@x) } ' group_col.txt
AAAA,BBBB,CCCC

$
stack0114106
0

This can be very simple:

awk -F',' '{print $1","$1","$2","$3}' inputFile

where the input file contains lines like:

1,2,3
2,3,4

and so on.
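
For the first sample record, 1,2,3, this prints 1,1,2,3; note that the command above writes the first field twice.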

0

I used the following because it also lists the api-resource names, which is useful if you want to access them directly. I also use the label "application" to find specific apps in a namespace:

kubectl -n ops-tools get $(kubectl api-resources --no-headers=true --sort-by=name | awk '{printf "%s%s",sep,$1; sep=","}') -l app.kubernetes.io/instance=application
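
Here the inner awk builds a single comma-joined list of the resource names (the same sep trick as in the top answer above), which the outer kubectl get then queries in one call.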