6

I need to remove the end-of-line sequence, which looks like CR LF.

Encoding: Windows-1250, on Windows 7 EN.

I have tried chomp, replacing \R with nothing, replacing \r\n, etc., but nothing works...

Thank you in advance

use strict;
$/ = "\r\n";
open FILE , "<", "file.txt" or die $!;
while (<FILE>) {
    my @line = split /,/ , $_;

    foreach my $l (@line) {
        print $l;
    }
    sleep(1);
}
ruhungry
  • http://stackoverflow.com/questions/881779/neatest-way-to-remove-linebreaks-in-perl – M. Suleiman Mar 31 '13 at 17:54
  • You should write *how* you have been trying things. Apparently you are not doing something right. Show the code of what you have tried, including input and output. – TLP Mar 31 '13 at 17:59
  • I have added $/ but it blocks my script. I updated my post – ruhungry Mar 31 '13 at 18:01
  • I can't hear `chomp`ing. (In your code...) – Alois Mahdal Mar 31 '13 at 19:27
  • Note that you should normally use `local $/ = "\r\n";` in as limited a scope as feasible, rather than resetting it globally for everything. In this sample code, it is not critical, but as a good Perl coding practice, avoid modifying most of the global special variables without using `local`. – Jonathan Leffler Mar 31 '13 at 19:41

5 Answers

14

First of all, you don't even try to change the CRLF to LF. You just print back out what you got.

On a Windows system, Perl adds the :crlf layer to your file handles. That means that CRLF gets changed to LF on read, and LF gets changed to CRLF on write.

That last bit is the problem. By default, Perl assumes you're creating a text file, but what you're creating doesn't match the definition of a text file on Windows. As such, you need to switch your output to binmode.
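One way to check which layers are active on a handle is `PerlIO::get_layers`. The following is a minimal sketch, not part of the original answer; the layer names it prints depend on the platform and the Perl build, so no expected output is shown.

```perl
use strict;
use warnings;

# Record the layers on STDOUT before and after switching it to binmode.
# On Windows the first list is expected to include "crlf"; elsewhere it
# usually does not, but the exact names are platform-dependent.
my @before = PerlIO::get_layers(\*STDOUT);
binmode(STDOUT);
my @after = PerlIO::get_layers(\*STDOUT);

print "before binmode: @before\n";
print "after binmode:  @after\n";
```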

Solution that only works on a Windows system:

use strict;
use warnings;

binmode(STDOUT);

open(my $fh, '<', 'file.txt') or die $!;
print while <$fh>;

Or if you want it to work on any system,

use strict;
use warnings;

binmode(STDOUT);

open(my $fh, '<', 'file.txt') or die $!;
while (<$fh>) { 
   s/\r?\n\z//;
   print "$_\n";
}

Without binmode on the input,

  • You'll get CRLF for CRLF on a non-Windows system.
  • You'll get LF for CRLF on a Windows system.
  • You'll get LF for LF on all systems.

s/\r?\n\z// handles all of those.
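As a quick check of that claim, here is a small sketch that applies the same substitution to the three possible inputs (CRLF, LF, and no line ending). The helper name `strip_eol` is hypothetical and only used for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: removes one trailing LF or CRLF, if present.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

for my $raw ("onion\r\n", "onion\n", "onion") {
    my $clean = strip_eol($raw);
    # Show the length before and after so the invisible bytes are accounted for.
    printf "%d -> %d: [%s]\n", length($raw), length($clean), $clean;
}
# prints:
# 7 -> 5: [onion]
# 6 -> 5: [onion]
# 5 -> 5: [onion]
```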

ikegami
  • it is working, but under linux i see \z as unrecognized escape. – Znik May 22 '20 at 11:25
  • @Znik, I don't know what you mean by "under linux". This question was about Perl, and `\z` matches end of string in Perl (regardless of OS). `$` matches end of string in some other languages with Perl-like regex syntax, but `$` doesn't match (just) end of string in Perl. – ikegami May 22 '20 at 12:35
  • I tried to test this code under some Linux. Perl is not independent of the OS, and must respect OS behavior. On some Linux distributions /z works, on some it does not. It is usually fixed in fresh Perl versions, but on older versions installed on older Linux distros it does not work. – Znik Sep 21 '20 at 18:56
  • @Znik, It's `\z`, not `/z`. And you are quite wrong. Perl's regex engine is completely independent of OS. It is also independent of the locale's language. Furthermore, `\z` has existed since at least 5.6 (20 years and 13 versions ago), when I first started using Perl. I suspect it's been in Perl for far longer, though. I don't know what you tried (because you didn't show anything), but what you are saying is just plain untrue. Are you actually using `perl`, or something that has support for "Perl-like" regex? The latter doesn't mean `perl` is used, and this question was tagged `perl`. – ikegami Sep 21 '20 at 19:40
2

If you are on a Unix-like command line, either of the following at the shell prompt will do the trick:

  • perl -pe 's/^M//g' file.txt # ^M means Control-M (the CR character); type it by pressing Control-V, then Control-M
  • perl -pe 's#\r\n$#\n#g' file.txt
user1587276
    • thanks for posting this, can you explain what the `#` character is doing here? I am not very familiar with Perl, only using it for this use-case – user5359531 Nov 01 '18 at 18:12
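On the `#` question in the comment above: Perl lets `s` take delimiters other than `/`, so `s#...#...#` is the same operator as `s/.../.../` written with a different delimiter character. A minimal sketch (the variable names are illustrative only) showing that both spellings give the same result:

```perl
use strict;
use warnings;

my $with_slashes = "onion\r\n";
my $with_hashes  = "onion\r\n";

# Same substitution, written with two different delimiter characters.
$with_slashes =~ s/\r\n$/\n/g;
$with_hashes  =~ s#\r\n$#\n#g;

print $with_slashes eq $with_hashes ? "same result\n" : "different\n";
# prints: same result
```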
    1

    This works for me on a Mac (Mac OS X 10.7.5, Perl 5.16.2):

    #!/usr/bin/env perl
    use strict;
    use warnings;
    
    while (<>)
    {
        print "1: [$_]\n";
        {
            local $/ = "\r\n";
            chomp;
        }
        print "2: [$_]\n";
    }
    

    Sample output:

    $  odx x3.txt
    0x0000: 6F 6E 69 6F 6E 0D 0A 73 74 61 74 65 0D 0A 6D 69   onion..state..mi
    0x0010: 73 68 6D 61 73 68 0D 0A                           shmash..
    0x0018:
    $ perl x3.pl < x3.txt | vis -c
    1: [onion^M
    ]
    2: [onion]
    1: [state^M
    ]
    2: [state]
    1: [mishmash^M
    ]
    2: [mishmash]
    $
    

    The odx program gives me a hex dump of the data file; you can see that there are 0D 0A (CRLF) line endings. The vis -c program shows control characters (other than newline and tab) as ^M (for example). You can see that the raw input includes the ^M (lines starting 1:) but the chomp'd lines are missing both the newline and the carriage return.

    The only issue will be whether the input on Windows is a text file or a binary file. If it is a text file, the I/O system should do the CRLF mapping automatically. If it is a binary file, it won't. (Unix doesn't have a meaningful distinction between text and binary files.) On Windows, you may need to investigate binmode, as discussed on the open page.

    Jonathan Leffler
    • @ikegami: would you like to elaborate on why it won't work on Windows, other than Windows being different from Unix? – Jonathan Leffler Mar 31 '13 at 22:32
    • `$_` doesn't contain CRLF, so the `chomp` won't do anything. `print "\n";` to a text file puts CR LF in the text file. See my answer for more details. – ikegami Mar 31 '13 at 23:05
    0

    That would be a one-liner in Perl... Try the following under Linux:

    perl -0pe 's/[\r\n]//g' < file.txt
    sleep 1
    

    and the following under Windows:

    perl.exe -0pe "s/\015\012|\015|\012//g" < file.txt
    ping 1.1.1.1 -n 1 -w 1000 > nul
    
    Alois Mahdal
    Michael
    0

    I think \s* should work.

    use strict;
    use warnings;
    
    open FILE , "<", "file.txt" or die $!;
    
    while ( my $line = <FILE> ) {
    
        $line =~ s{ \s* \z}{}xms;  # trim trailing whitespace of any kind
    
        my @columns = split /,/ , $line;
    
        for my $column (@columns) {
    
            print "$column ";
        }
        sleep(1);
    
        print "\n";
    }
    
    ddoxey