39

I have a script that appends new fields to an existing CSV. However, ^M characters appear at the end of the old lines, so the new fields end up on a new row instead of the same one. How do I remove ^M characters from a CSV file using Perl?

Peter Mortensen
Alex Wong
  • Use `binmode(STDIN, ":crlf")` or `PERLIO=:unix:crlf` (see [http://stackoverflow.com/a/21320709/424632]). – musiphil Jan 23 '14 at 22:35
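The `:crlf` layer mentioned in the comment above can also be applied when opening a file rather than to STDIN. The following is a minimal sketch, assuming the CSV is read from a named file; `read_lines_crlf` is a hypothetical helper name, not something from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: read all lines of a file through the :crlf layer,
# which translates each "\r\n" pair to "\n" as the data is read.
sub read_lines_crlf {
    my ($path) = @_;
    open(my $fh, '<:crlf', $path) or die "Cannot open $path: $!";
    my @lines = <$fh>;
    close($fh);
    return @lines;
}
```

Lines returned this way end in a plain `\n`, so a later `chomp` leaves no stray carriage return behind.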

11 Answers

52

^M is carriage return. You can do this:

$str =~ s/\r//g;
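Applied to the situation in the question, the substitution might be used as in the sketch below. `append_field` is a hypothetical helper, and the naive comma join assumes the new field needs no CSV quoting.

```perl
use strict;
use warnings;

# Hypothetical helper: strip carriage returns and the trailing newline,
# then append one new field to the CSV row.
sub append_field {
    my ($line, $new_field) = @_;
    $line =~ s/\r//g;   # remove every ^M (carriage return)
    chomp $line;        # remove the trailing "\n", if any
    return "$line,$new_field\n";
}

# A DOS-style row: without the cleanup, ",c" would follow the "\r\n".
print append_field("a,b\r\n", "c");   # prints "a,b,c"
```

Because the carriage return is removed before the new field is appended, the field stays on the same row.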
Can Berk Güder
28

Or a 1-liner:

perl -p -i -e 's/\r\n$/\n/g' file1.txt file2.txt ... filen.txt
JDrago
15

It turns out you can also do this:

$line =~ tr/\015//d;   # \015 is the octal code for carriage return (\r)
serenesat
Alex Wong
  • 1
    not as readable as `\r` - anyone looking at that (or yourself in a year's time) would be glad of a comment stating what it does – plusplus Jun 21 '11 at 08:12
8

Slightly unrelated, but to remove ^M from the command line using Perl, do this:

perl -p -i -e "s/\r\n/\n/g" file.name
Peter Mortensen
Roy Rico
6

I prefer a more general solution that works with either DOS or Unix input. Assuming the input comes from STDIN (or from files named on the command line, since `<>` reads both):

while (defined(my $ln = <>))
  {
    chomp($ln);
    chop($ln) if ($ln =~ m/\r$/);

    # filter and write
  }
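As an alternative to the chomp/chop pair above, Perl 5.10 and later provide the `\R` escape, which matches any common line break. This is a swapped-in technique rather than the method used in this answer, and `strip_eol` is a hypothetical helper name.

```perl
use strict;
use warnings;
use 5.010;   # \R needs Perl 5.10 or later

# Hypothetical helper: drop one trailing line break of any style.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\R\z//;   # \R matches "\r\n", "\n", or a lone "\r"
    return $line;
}
```

The same call then handles DOS, Unix, and old Mac line endings without checking which one is present.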
KillerRabbit
3

This one-liner converts DOS line endings to Unix ones, which removes all the ^M characters:

dos2unix <file-name>

You can call this from inside Perl or directly at your Unix prompt.

Peter Mortensen
Akhil
2

To convert DOS-style to Unix-style line endings:

while (my $line = <FILEHANDLE>) {
   $line =~ s/\r\n$/\n/;
}

Or, to remove Unix- and/or DOS-style line endings:

while (my $line = <FILEHANDLE>) {
   $line =~ s/\r?\n$//;
}
spoulson
1

This is what solved my problem. ^M is a carriage return, and it is easy to strip in a Perl script.

while(<INPUTFILE>)
{
     chomp;
     chop($_) if ($_ =~ m/\r$/);
}
Peter Mortensen
user3274263
0

Here is a little script I have for that. A modified version of it helped to filter out some other non-printable characters in cross-platform legacy files.

#!/usr/bin/perl
# run this as
# convert_dos2unix.pl < input_file > output_file
undef $/;
$_ = <>;
s/\r//g;
print;
0

A Perl command to convert DOS line endings to Unix line endings, keeping a backup of the original file:

perl -pi.bak -e 's/\r\n/\n/g' filename

This command rewrites filename with Unix line endings and leaves the original content in filename.bak.
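The same in-place edit with a backup can be done from inside a script by setting `$^I`, the variable behind the `-i` switch. This is a minimal sketch, assuming a Unix-like system where files are read without newline translation; `crlf_to_lf_in_place` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: convert CRLF to LF in place, keeping FILE.bak,
# mirroring what `perl -pi.bak -e 's/\r\n/\n/g'` does.
sub crlf_to_lf_in_place {
    my (@files) = @_;
    return unless @files;    # with no files, <> would read STDIN instead
    local $^I   = '.bak';    # same effect as the -i.bak switch
    local @ARGV = @files;    # the files that <> will iterate over
    while (my $line = <>) {
        $line =~ s/\r\n/\n/g;
        print $line;         # with $^I set, print goes to the rewritten file
    }
    return;
}
```

Calling `crlf_to_lf_in_place('filename')` should leave the converted data in `filename` and the original bytes in `filename.bak`.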

Brian Tompsett - 汤莱恩
-1

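In vi, press : to get a command prompt.

Then type s/Control-V Control-M//g (use %s instead of s to apply it to every line rather than only the current one).

Control-V and Control-M are the actual key presses. Don't type the words out.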

Peter Mortensen
  • 1
    It's a bad idea to include non-printing characters like carriage return verbatim in source code like this. Far better to use the \r escape that is (a) easy to see and (b) won't get lost if the source is reformatted. – Denis Howe Dec 02 '15 at 17:16