-2

I have a list of items in a file:

foobar
barfoo
bar
faaboo
foo
boofar
fo
b

Using Perl, I'm after a script that will go through the file and delete all items of 3 characters or fewer, overwriting the existing file (without creating a new or temporary file). Thus the list will become:

foobar
barfoo
faaboo
boofar
cHao
user349418
  • 5
    What have you tried? Is there a specific point that is creating a problem or do you just want someone to give you a ready-made script? – Sinan Ünür Jan 01 '11 at 22:48
  • I suppose using perl to call out to the system `ed` program oughtn’t count. ☺ – tchrist Jan 02 '11 at 17:17
  • @tchrist - `ed` is for wusses. You call out to emacs and tell emacs to do this (i am kind of hoping emacs has such non-interactive capabilities) – DVK Jan 04 '11 at 16:43

2 Answers

8

One-liner:

perl -i -ne '{print if /.{4}/}' filename

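Note that `-i` must be separated from `-ne`: written as `-ine`, Perl reads `ne` as the backup-file extension and then treats the next argument as a script file name. If you prefer, you can use `length` instead of a regex, as Jonathan Leffler noted in the comments; add 1 to the threshold because `length` also counts the trailing newline. It is probably faster on very large files. Here's a Windows version (note the double quotes required by `cmd` instead of single quotes):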

perl.exe -i.bak -n -e "{print if length > 4}" filename
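To make the "add 1 for the newline" point concrete, here is a small sketch comparing the regex test, the raw `length` test, and a `length` test on a chomped copy. The three helper names are made up for illustration and do not appear in the answer.

```perl
use strict;
use warnings;

# Hypothetical helpers: three ways to ask "does this line have at least
# 4 characters?" for a line as read by -n (newline still attached).
sub keep_by_regex  { my ($line) = @_; return $line =~ /.{4}/ ? 1 : 0 }
sub keep_by_length { my ($line) = @_; return length($line) > 4 ? 1 : 0 }
sub keep_chomped   { my ($line) = @_; chomp $line; return length($line) > 3 ? 1 : 0 }

# The last sample has no trailing newline, like an unterminated final line.
for my $line ("foo\n", "faab\n", "faab") {
    (my $shown = $line) =~ s/\n/\\n/;    # make the newline visible when printing
    printf "%-8s regex=%d length=%d chomped=%d\n",
        qq("$shown"),
        keep_by_regex($line),
        keep_by_length($line),
        keep_chomped($line);
}
```

The three tests agree on newline-terminated lines. They differ only on a 4-character final line with no trailing newline, which the raw `length > 4` test drops while the other two keep it.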

Also, to answer your comment: unfortunately, you cannot do in-place `-i` edits on Windows without a backup file. Please refer to this SO post for a detailed explanation (again, this is a Windows limitation, not Perl's) as well as a workaround.
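If a backup file is unacceptable, one possible alternative is to skip `-i` and have a short script read the whole file into memory and then rewrite the same path. This is not necessarily the workaround from the linked post, only a minimal sketch that assumes the file fits in memory. The name `drop_short_lines` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only lines longer than $max characters,
# rewriting $path itself. No backup or temporary file is created.
sub drop_short_lines {
    my ($path, $max) = @_;

    # Read every line into memory and close the handle before rewriting.
    open my $in, '<', $path or die "Cannot read $path: $!";
    my @lines = <$in>;
    close $in;

    # Test a chomped copy so the trailing newline is not counted.
    my @kept = grep { my $l = $_; chomp $l; length($l) > $max } @lines;

    # Reopening the same path for writing truncates it.
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} @kept;
    close $out or die "Cannot close $path: $!";

    return scalar @kept;    # number of lines kept
}

# Run as a script: perl drop_short.pl filename
drop_short_lines($ARGV[0], 3) if @ARGV;
```

The trade-off is safety: if the script is interrupted between truncating and finishing the write, the original contents are lost, which is exactly what the `-i.bak` backup protects against.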

Community
DVK
  • 1
    C:\Users\user>perl -ine '{/.{4}/}' list.txt Can't open perl script "'{/.{4}/}'": No such file or directory – user349418 Jan 01 '11 at 22:55
  • 1
    If it's Windows, you MUST use double quotes instead of single (`cmd` shell issue, nothing to do with Perl): `perl -i.bak -n -e "{print if /.{4}/}" x.txt` - please note that this version actually just ran on my ActiveState Perl successfully :) – DVK Jan 01 '11 at 22:57
  • 2
    @user You should be familiar enough with your operating system to realize that whereas standard Unix shells are perfectly happy with single quotation marks in this context, with `cmd.exe`, you must use double quotation marks as in: `perl -i.bak -ne "{print if /.{4}/}" filename` – Sinan Ünür Jan 01 '11 at 22:59
  • Anything wrong with `length` instead of a regex? One possible objection: `length` would include the newline. But `perl -i -ne 'print if length > 4'` does the required job, as does `perl -i -ne "print if length > 3 + 1"` (avoiding the problems on DOS, and leaving the 3 visible). – Jonathan Leffler Jan 01 '11 at 23:37
  • is it possible to avoid the -i.bak backup file? – user349418 Jan 01 '11 at 23:43
  • @user349418 It is indeed, just leave out the -i.bak bit. That way your original file will be overwritten (which seems to be what you want). Btw, that's something you could have easily tried out yourself! – canavanin Jan 01 '11 at 23:55
  • 2
    @user349418 - it's needed. I updated the answer with the link to complete explanation as well as actual Windows command – DVK Jan 02 '11 at 00:00
  • 1
    @canavanin - removing `-i.bak` will NOT do in-place edit like the OP wanted, and instead print the resulting text to standard out. – DVK Jan 02 '11 at 00:02
  • The `length` version will be faster, though not noticeably so unless your file is gigantic. – Rob N Jan 02 '11 at 00:06
  • @DVK Oh, yes, you are right! Still you can stop backup file generation if you just put -i instead of -i.bak. (ok, just read your edit... If you say it cannot be done in Windows then I'll just have to believe you, as I can't try that out :) ) – canavanin Jan 02 '11 at 00:09
  • @canavanin I did remove it, but it just printed the file to the screen without making any changes to the file – user349418 Jan 02 '11 at 00:18
4

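Use Tie::File, which presents the lines of a file as an array and writes any changes back to that same file:

use warnings;
use strict;
use Tie::File;

my $file = shift;    # file name from the command line
tie my @array, 'Tie::File', $file or die "Cannot tie $file: $!";
# Tie::File strips line endings by default, so no +1 is needed here
@array = grep { length > 3 } @array;
untie @array;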
toolic