I've read numerous awk and sed one-liners that are supposed to do what I need, but none seem to work. I am simply trying to count the number of CR/LF line breaks in a file on Linux. The file also contains plain LF newlines, so I need to count only the CR/LF ones to know how many true records I'm importing.

One awk line I've tried is `awk '/^M$/{n++}; END { print n+0 }' my_file`, or something like it. This did not work. Any help would be great. I'm not an awk guru, so please go easy.

Martin Tournoij
jiveturkey

3 Answers

Using GNU awk, which supports a multi-character record separator:

awk -v RS='\r\n' 'END{print NR}' file

This sets the record separator to \r\n and prints the total number of records.

For example:

$ echo $'record 1\r\nrecord\n2\r\nrecord 3' > file
$ awk -v RS='\r\n' 'END{print NR}' file
3
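
If GNU awk is not available, or the number wanted is the count of CR/LF terminators rather than the count of records, a minimal sketch is to count the lines whose last character is a carriage return. The function name `count_crlf` and the file name `sample.txt` are made up for illustration, and the sketch assumes an awk that understands `\r` inside a regular expression, as POSIX awk does.

```shell
# Hypothetical helper: print how many lines of "$1" end in CR LF.
# awk splits on LF by default, so a CR/LF line is seen as a line ending in \r.
count_crlf() {
    awk '/\r$/ { n++ } END { print n + 0 }' "$1"
}

# Same bytes as the echo example above writes (echo appends the final LF).
printf 'record 1\r\nrecord\n2\r\nrecord 3\n' > sample.txt
count_crlf sample.txt
```

On this sample it prints 2, because only two of the line breaks are CR/LF, while the record-separator approach reports 3 records.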

To those who think this answer is incorrect, let me propose another example. Consider the file:

bash-4.2$ cat -vet file
line 1$
line 2$
line 3bash-4.2$

(shell prompts intentionally left in to show the end of the file)

This file has normal UNIX line endings and no newline at the end. How many records are there in this file? Personally, I would say that there are 3. However, there are only two newline characters.
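
The difference between counting terminators and counting records can be checked directly. This is only a sketch; the file name `no_final_newline.txt` is an arbitrary choice for the illustration.

```shell
# Three lines of text, but no newline after the last one.
printf 'line 1\nline 2\nline 3' > no_final_newline.txt

# wc -l counts newline characters, so the unterminated last line is not counted.
newlines=$(wc -l < no_final_newline.txt)

# awk counts records, including a final record with no terminator.
records=$(awk 'END { print NR }' no_final_newline.txt)

# $((newlines)) drops the padding that some wc implementations add.
echo "newlines: $((newlines)) records: $records"
```

This prints `newlines: 2 records: 3`.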

Tom Fenech
  • Your sample output gives 3 when there are only 2 CRLF in your file (the last line is terminated by LF, inserted by echo) – ComputerDruid Dec 15 '14 at 19:14
  • @ComputerDruid I guess that it depends whether the OP is interested in counting characters, or counting the number of records. – Tom Fenech Dec 15 '14 at 19:15
  • A newline is expected at the end of files by convention. Even `wc -l` won't count the last line if there's no newline at the end of it. See http://stackoverflow.com/a/7741505/276093 – matt burns Jun 04 '15 at 12:15

You can use this grep to count all the lines ending with CR/LF:

grep -c $'\r$' file

The pattern $'\r$' matches only lines that end with \r\n, and -c prints the count of those lines.
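
The `$'...'` quoting is a feature of bash, ksh and zsh and is not available in every `sh`. As a sketch for shells without it, the carriage return can be produced with `printf` instead. The helper name `count_crlf_grep` and the file name `sample.txt` are hypothetical.

```shell
# Hypothetical helper: count lines ending in CR LF without $'...' quoting.
count_crlf_grep() {
    cr=$(printf '\r')         # a literal carriage return character
    grep -c "${cr}\$" "$1"    # lines whose last character before the LF is CR
}

# Two CR/LF line breaks and two plain LF line breaks.
printf 'record 1\r\nrecord\n2\r\nrecord 3\n' > sample.txt
count_crlf_grep sample.txt
```

This prints 2 for the sample. Note that `grep -c` exits with status 1 when the count is 0, which matters in scripts that run under `set -e`.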

anubhava

A modern dos2unix utility can count the number of CR/LF line breaks:

Example output:

$ dos2unix -i *.txt
 6       0       0  no_bom    text    dos.txt
 0       6       0  no_bom    text    unix.txt
 0       0       6  no_bom    text    mac.txt
 6       6       6  no_bom    text    mixed.txt
50       0       0  UTF-16LE  text    utf16le.txt
 0      50       0  no_bom    text    utf8unix.txt
50       0       0  UTF-8     text    utf8dos.txt
 2     418     219  no_bom    binary  dos2unix.exe

The columns are: number of DOS line breaks, number of Unix line breaks, number of Mac line breaks, byte order mark, text or binary, and file name.
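
To get only the DOS count for use in a script, the information option can be narrowed with a flag. The following is a sketch that assumes a dos2unix build whose `-i` option accepts the `d` flag (print only the number of DOS line breaks); the helper name `count_dos_breaks` and the file name `mixed_sample.txt` are made up for the illustration.

```shell
# Hypothetical helper: print only the DOS (CR/LF) line-break count of "$1".
# Assumes dos2unix supports "-id"; the first output column is the count.
count_dos_breaks() {
    dos2unix -id "$1" | awk '{ print $1 }'
}

# Run the demonstration only where dos2unix is installed.
if command -v dos2unix >/dev/null 2>&1; then
    printf 'a\r\nb\nc\r\n' > mixed_sample.txt   # two CR/LF breaks, one LF break
    count_dos_breaks mixed_sample.txt
fi
```

If the installed dos2unix is too old to know the `-i` option, the first column of the plain `dos2unix -i` output shown above carries the same number on newer builds.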

gavenkoa