4

How do I use map with the split function to trim the fields of $line ($a, $b, $c and $d)?

my ($a, $b, $c, $d, $e) = split(/\t/, $line);

# Perl trim function to remove whitespace from the start and end of the string
sub trim($)
{
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}
syker

6 Answers

4

Don't use prototypes (the ($) on your function) unless you need them.

my ( $a, $b, $c, $d, $e ) =
  map { s/^\s+|\s+$//g; $_ }    ## Notice the trailing `; $_`, this is common
  split(/\t/, $line, 5)
;

Don't forget that in the above, s/// returns the number of substitutions made, not $_. So we return $_ explicitly as the last expression in the block.

or more simply:

my @values = map { s/^\s+|\s+$//g; $_ } split(/\t/, $line, 5);
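To make the point about the return value of s/// concrete, here is a minimal sketch. The sample `$line` and the names `@counts` and `@fields` are assumptions made up for illustration, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical sample input: three tab-separated fields with stray spaces.
my $line = " a \t b\t c ";

# Without a trailing $_, map collects what s///g returns for each field,
# which is the number of substitutions made.
my @counts = map { s/^\s+|\s+$//g } split(/\t/, $line);

# With a trailing $_, map collects the trimmed strings themselves.
my @fields = map { s/^\s+|\s+$//g; $_ } split(/\t/, $line);

print "counts: @counts\n";    # counts: 2 1 2
print "fields: @fields\n";    # fields: a b c
```

Each call to split produces a fresh list, so modifying $_ inside the block does not touch `$line`.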
Evan Carroll
  • I don't know why the downvote, but you've forgotten the g at the end in the final line. –  Jul 02 '10 at 03:44
  • Why do you recommend against prototypes? – mleykamp Jul 02 '10 at 16:07
  • Read this: http://stackoverflow.com/questions/297034/why-are-perl-function-prototypes-bad . On top of that, hardly anyone else uses them, and they add line noise. You only need them if you're trying to create a level of sugar that gives you a different, non-Perlish look or feel. They don't really handle much else, and they don't work at all on methods. – Evan Carroll Jul 02 '10 at 16:18
  • 1
    I suggest `map { s/^\s+|\s+$//gr } split('\t', $line)` usable from [perl 5.14](http://www.perl.com/pub/2011/05/new-features-of-perl-514-non-destructive-substitution.html) – user2291758 Mar 05 '15 at 17:11
3

map takes two inputs:

  • an expression or block: this would be the trim expression (you don't have to write your own -- it's on CPAN)
  • and a list to operate on: this should be split's output:
use String::Util 'trim';
my @values = map { trim($_) } split /\t/, $line;
Ether
  • 1
    I'm nervous about introducing a dependence on a module which says "Final version. As of this version String::Util is no longer under development or being supported." –  Jul 02 '10 at 03:05
  • If we are going to install a CPAN module, we might as well use the one that does the job the best: [`String::Strip`](http://p3rl.org/String::Strip). See http://www.illusori.co.uk/perl/2010/03/05/advanced_benchmark_analysis_1.html – daxim Jul 02 '10 at 09:23
  • I haven't tried it, but the acid test for these kinds of modules is whether they strip out things like Unicode 0x3000 from the string. If not then maybe it is not a good replacement. Glancing at the source code, String::Strip uses the C function `isspace` to strip spaces and has no awareness of unicode, so it will behave differently from the above. –  Jul 02 '10 at 09:58
2

This should work:

my ($a, $b, $c, $d, $e) = map {trim ($_)} (split(/\t/, $line));

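By the way, it's a minor point, but you should not use $a and $b as variable names. They are special package variables that sort uses for its comparisons, and declaring them with my hides them from any sort block in the same scope.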
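As a rough sketch of the same assignment without $a and $b, the following uses ordinary lexical names and leaves $a and $b free for sort. The sample `$line`, the inline substitution (standing in for the question's trim sub), and the names `$first`, `$second`, `$third` are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input.
my $line = "pear \t apple\t fig ";

# Descriptive lexical names instead of $a, $b, $c.
my ($first, $second, $third) =
    map { s/^\s+|\s+$//g; $_ } split(/\t/, $line);

# sort sets the package variables $a and $b for every comparison.
# A `my $a` declared in this scope would make this block fail to compile.
my @sorted = sort { $a cmp $b } ($first, $second, $third);

print "@sorted\n";    # apple fig pear
```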

1

You can also use "foreach" here.

foreach my $i ($a, $b, $c, $d, $e) {
  $i=trim($i);
}
Alexandr Ciornii
0

Just for variety:

my @trimmed = grep { s/^\s*|\s*$//g } split /\t/, $line;

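grep acts as a filter on lists: it keeps only the items for which the block returns a true value, and here the block returns the substitution count from s///. This is why each \s+ is changed to \s* inside the regex. Matching zero or more spaces means the substitution always succeeds, which prevents grep from filtering out items that have no leading or trailing spaces.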
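The difference between the two quantifiers can be seen in a small sketch. The sample `$line` and the names `@filtered` and `@kept` are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input; the first field has no surrounding whitespace.
my $line = "1\t 2\t3 \t 4 ";

# With \s+, the substitution finds nothing to remove in "1", returns a
# false value, and grep drops that field.
my @filtered = grep { s/^\s+|\s+$//g } split /\t/, $line;

# With \s*, the pattern also matches the empty string, so the count is
# always at least 1 and every field is kept.
my @kept = grep { s/^\s*|\s*$//g } split /\t/, $line;

print "with \\s+: @filtered\n";    # with \s+: 2 3 4
print "with \\s*: @kept\n";        # with \s*: 1 2 3 4
```

This relies on grep aliasing $_ to each list element, so the in-place edits show up in the returned list. map with an explicit `$_` says the same thing more directly.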

Zaid
  • But it wouldn't include segments that were surrounded by tabs with no spaces. `"\tspoon\t"` would be omitted. – Axeman Jul 02 '10 at 13:01
  • @Axeman : From [`perlretut`](http://perldoc.perl.org/perlretut.html): "`\s` matches a whitespace character, the set `[\ \t\r\n\f]` and others." Besides, aren't we splitting on `\t` here ;)? – Zaid Jul 02 '10 at 13:24
  • yes--but never mind, my eyes replaced `\s*` with my usual `\s+`. So the subst always matches and I don't know what I'm talking about. :D – Axeman Jul 02 '10 at 16:16
0

When I trim a string, I don't often want to keep the original. It would be nice to have the abstraction of a sub but also not have to fuss with temporary values.

It turns out that we can do just this, as perlsub explains:

Any arguments passed in show up in the array @_. Therefore, if you called a function with two arguments, those would be stored in $_[0] and $_[1]. The array @_ is a local array, but its elements are aliases for the actual scalar parameters. In particular, if an element $_[0] is updated, the corresponding argument is updated (or an error occurs if it is not updatable).

In your case, trim becomes

sub trim {
  for (@_) {
    s/^ \s+  //x;
    s/  \s+ $//x;
  }
  wantarray ? @_ : $_[0];
}

Remember that map and for are cousins, so with the loop inside trim, you no longer need map. For example:

my $line = "1\t 2\t3 \t 4 \t  5  \n";    
my ($a, $b, $c, $d, $e) = split(/\t/, $line);    

print "BEFORE: [", join("] [" => $a, $b, $c, $d), "]\n";
trim $a, $b, $c, $d;
print "AFTER:  [", join("] [" => $a, $b, $c, $d), "]\n";

Output:

BEFORE: [1] [ 2] [3 ] [ 4 ]
AFTER:  [1] [2] [3] [4]
Greg Bacon