87

Given a file with data like this (i.e. a stores.dat file):

sid|storeNo|latitude|longitude
2|1|-28.03720000|153.42921670
9|2|-33.85090000|151.03274200

What would be a command to output the number of column names?

I.e. in the example above it would be 4 (the number of pipe characters in the first line, plus 1).

I was thinking something like:

awk '{ FS = "|" } ; { print NF}' stores.dat

but it returns a count for every line instead of just the first, and for the first line it returns 1 instead of 4.

analyticalpicasso
toop

11 Answers

138
awk -F'|' '{print NF; exit}' stores.dat 

Just quit right after the first line.
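
For the sample stores.dat above this should print 4, since the header line has four pipe-separated fields:

$ awk -F'|' '{print NF; exit}' stores.dat
4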

Mat
  • or `awk -F'|' 'NR==1{print NF}' stores.dat` – jaypal singh Dec 25 '11 at 11:36
  • @JaypalSingh: that will read the whole file - no need for that, better stop early. – Mat Dec 25 '11 at 11:39
  • They both seem to return the same correct output, is there any performance benefit of 1 over the other (or some other benefit)? – toop Dec 25 '11 at 11:39
  • @toop: yes, see my previous comment. My version will only read one block from the file, Jaypal's will read the whole file. – Mat Dec 25 '11 at 11:40
  • @Mat You are spot on! @toop Mat is absolutely correct. `exit` is the right way to go about it. No need to read the whole file if you only wish to know the number of columns. +1 :) – jaypal singh Dec 25 '11 at 11:49
  • It might not matter if your file is 10 lines long. But mine is about 1 million. Definitely going with Mat's answer. – Buttle Butkus Jul 15 '18 at 02:46
42

This is a workaround (for me; I don't use awk very often):

Display the first row of the file, replace all pipes with newlines, and then count the lines:

$ head -1 stores.dat | tr '|' '\n' | wc -l
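
To see why this works, here is the intermediate output on the sample file; wc -l then counts these 4 lines:

$ head -1 stores.dat | tr '|' '\n'
sid
storeNo
latitude
longitude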
miku
  • For files with maaany columns (think SNP data) this is the way to go. Mat's solution returned "awk: program limit exceeded: maximum number of fields size=32767." – The Unfun Cat Oct 21 '13 at 12:09
14

Unless you're using spaces in there, you should be able to use | wc -w on the first line.

wc is "Word Count", and with -w it simply counts the words in its input. If you send it only one line, it'll tell you the number of columns.
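
For the pipe-delimited sample above you would translate the pipes to spaces first (a small sketch; it assumes the column names themselves contain no spaces):

$ head -1 stores.dat | tr '|' ' ' | wc -w
4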

Tom van der Woerdt
7

You could try

cat FILE | awk '{print NF}'
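
As written this uses whitespace as the separator and prints a count for every line, so it prints 1 for each line of the pipe-delimited sample; one possible adjustment, limited to the header line and using the pipe separator, would be:

$ head -1 stores.dat | awk -F'|' '{print NF}'
4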

Cat Kerr
2

Perl solution similar to Mat's awk solution:

perl -F'\|' -lane 'print $#F+1; exit' stores.dat

I've tested this on a file with 1000000 columns.
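
For instance, on the sample stores.dat above it should print the number of fields in the first line:

$ perl -F'\|' -lane 'print $#F+1; exit' stores.dat
4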


If the field separator is whitespace (one or more spaces or tabs) instead of a pipe:

perl -lane 'print $#F+1; exit' stores.dat
Chris Koknat
1

This is usually what I use for counting the number of fields:

head -n 1 file.name | awk -F'|' '{print NF; exit}'
Chris
1

Select any row in the file (in the example below, it's the 2nd row) and count the number of columns, where the delimiter is a space:

sed -n 2p text_file.dat | tr ' ' '\n' | wc -l
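
For the pipe-delimited stores.dat above, the same idea would look like this (a sketch with the delimiter changed to a pipe):

$ sed -n 2p stores.dat | tr '|' '\n' | wc -l
4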
sAguinaga
1

A proper, pure bash way

Simply counting columns in file

Under bash, you could simply:

IFS=\| read -ra headline <stores.dat
echo ${#headline[@]}
4

This is a lot quicker since it doesn't fork any subprocesses, and it's reusable since $headline holds the full header line. You could, for example:

printf " - %s\n" "${headline[@]}"
 - sid
 - storeNo
 - latitude
 - longitude

Note: this syntax correctly handles spaces and other special characters in column names.

Alternative: robustly checking the maximum number of columns over all rows

What if some rows contain extra columns?

This command finds the longest line while keeping only the separators:

tr -dc $'\n|' <stores.dat |wc -L
3

If there are at most 3 separators, then there are 4 fields. Or, if you prefer to count the fields directly:

a marker is inserted before and after each separator (|), and adjacent markers are merged so that exactly one letter remains per field, letting wc -L report the field count itself:

tr -dc $'\n|' <stores.dat|sed 's/./b&a/g;s/ab/a/g;s/[^ab]//g'|wc -L
4

Counting columns in a CSV file

Under bash, you may use the csv loadable builtin:

enable -f /usr/lib/bash/csv csv
IFS= read -r line <file.csv
csv -a fields <<<"$line"
echo ${#fields[@]}
4

For more info, see How to parse a CSV file in Bash?

F. Hauri - Give Up GitHub
1

If you have Python installed, you could try:

python -c 'import sys;f=open(sys.argv[1]);print len(f.readline().split("|"))' \
    stores.dat
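
That snippet is Python 2; a Python 3 equivalent (a sketch, assuming a python3 binary is on the PATH) would be:

python3 -c 'import sys; print(len(open(sys.argv[1]).readline().split("|")))' \
    stores.dat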
miku
Don Question
  • in this particular case, it's shorter to read from standard input `cat x.txt | python -c "print raw_input().count('|') + 1"` – Lie Ryan Dec 25 '11 at 11:22
  • Shorter, yes, but not faster if there are many long files! I assumed he wanted a faster solution with an eye toward pure (meaning surely big) data files. – Don Question Dec 25 '11 at 11:30
0

Based on Cat Kerr's response. This command works on Solaris:

awk '{print NF; exit}' stores.dat
  • And then you're at the accepted answer minus the proper field separator. This would return "1" for the example input. – Benjamin W. Feb 23 '16 at 16:07
  • This is essentially the same as the accepted answer without the field separator; as Benjamin says, it returns 1 here, but it should work for space-delimited files. – discipulus Sep 29 '16 at 05:45
0

You may try:

head -1 stores.dat | grep -o \|  | wc -l
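
Note that this counts the separator characters rather than the columns, so on the sample file it prints 3; one way to add 1 in the shell (a sketch) is:

$ head -1 stores.dat | grep -o '|' | wc -l
3
$ echo $(( $(head -1 stores.dat | grep -o '|' | wc -l) + 1 ))
4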
frhd