-3

I have 2 different text formats here.

"Submitted on Oct 1st, 2013"
"Not started" 

I want to extract the status and the date.

Expected result is:

$status = "Submitted" or "Not started"
$date = "Oct 1st, 2013"

How can I do this in Perl? Many thanks.

井R3Naiz0
novo
  • How would `$date` get initialized if the status is "Not Started"? That is, isn't there a date only if `$status` is "Submitted"? – Kenosis Oct 05 '13 at 01:12
  • I see you included the `html` tag with your question. If you are planning to do much in the way of parsing HTML with Perl, you might want to read [this answer](http://stackoverflow.com/a/13249455/1497596). – DavidRR Oct 05 '13 at 02:59
  • 1
    If you have *no idea at all* how to approach this problem, then you need a from-basics Perl tutorial. Use Google to find one on the internet; there are many. If you know enough Perl to attempt it, then please do so and show your code. We will help you fix it. Stack Overflow exists to help experienced programmers when they are struggling, not as a teaching service or a source of free code. – Borodin Oct 05 '13 at 07:53
  • You misunderstood my question. I wanted to ask which tool to look up so that I could do it myself; that is a more efficient way to learn. I didn't ask for code specifically. There are many different ways to approach a problem. Don't judge other people. I think you would have done the same thing to learn a new programming language by yourself. Thanks. – novo Oct 05 '13 at 20:18

2 Answers

1

If you can assume that the word "on" always precedes the date, the following code will work.

#!/usr/bin/perl

use strict;
use warnings;

chomp(my $input = <STDIN>);

my $status = "Not started";
my $date;

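# Capture the text after " on " rather than reading $', which imposed a
# performance penalty on every regex match in Perls before 5.20.
if ($input =~ / on (.*)/) {
    $date = $1;
    $status = "Submitted";
}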

print "Status: $status\n";
if (defined $date) {
    print "Date: $date\n";
}
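
A variation on the same idea, as a minimal sketch: `split` with a limit of 2 takes the status from the input itself instead of hard-coding "Submitted". The helper name `parse_status` is hypothetical, and the sketch assumes the input carries no surrounding quotes and that " on " appears only between the status and the date.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# parse_status is a hypothetical helper name used only for this sketch.
# Assumption: " on " separates the status from the date, and the input
# has no surrounding double quotes.
sub parse_status {
    my ($input) = @_;

    # A limit of 2 keeps any later " on " inside the date part.
    my ($status, $date) = split / on /, $input, 2;

    # $date stays undef when the input contains no " on ".
    return ($status, $date);
}

for my $line ("Submitted on Oct 1st, 2013", "Not started") {
    my ($status, $date) = parse_status($line);
    print "Status: $status\n";
    print "Date: $date\n" if defined $date;
}
```

Leaving `$date` undefined for "Not started" also covers the case raised in the comments, where there is no date to report.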
NigoroJr
1

This approach uses a single regex and also reports inputs that match neither format.

#!/usr/bin/perl -w

use strict;
use warnings;

my ($match, $status, $date);
foreach (<DATA>) {

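    # Both alternatives sit inside one anchored group, so extra text
    # before or after a status cannot slip through as a match.
    $_ =~ /^"(?:(Submitted) on (.*)|(Not started))"$/;

    #           ^^^^^^^^^^^    ^^^^ ^^^^^^^^^^^^^
    #               $1          $2       $3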

    if (defined $1) {
        ($match, $status, $date) = ("Y", $1, $2);
    } elsif (defined $3) {
        ($match, $status, $date) = ("Y", $3, "-");
    } else {
        ($match, $status, $date) = ("N", "-", "-");
    }

    print "[", join("][", ($match, $status, $date)), "]\n";
}

__DATA__
"Submitted on Oct 1st, 2013"
"Not a match!"
"Not started"

This program produces the output:

[Y][Submitted][Oct 1st, 2013]
[N][-][-]
[Y][Not started][-]
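
The same matching can also be sketched with named captures (available since Perl 5.10), which avoids keeping track of `$1`, `$2` and `$3`. The helper name `parse_line` is hypothetical, and the sketch assumes each line is wrapped in double quotes as in the `__DATA__` section above.

```perl
#!/usr/bin/perl

use 5.010;    # named captures, %+ and the // operator need Perl 5.10+
use strict;
use warnings;

# parse_line is a hypothetical helper name used only for this sketch.
# Assumption: the line is wrapped in double quotes, as in __DATA__ above.
sub parse_line {
    my ($line) = @_;

    # Return an empty list when the line matches neither format.
    return unless $line =~ /^"(?:(?<status>Submitted) on (?<date>.*)|(?<status>Not started))"$/;

    # $+{date} is undef when the "Not started" branch matched.
    return ($+{status}, $+{date});
}

for my $line ('"Submitted on Oct 1st, 2013"', '"Not a match!"', '"Not started"') {
    my ($status, $date) = parse_line($line);
    if (defined $status) {
        printf "[Y][%s][%s]\n", $status, $date // "-";
    } else {
        print "[N][-][-]\n";
    }
}
```

Both alternatives use the group name `status`; Perl allows this, and `$+{status}` refers to whichever of the two groups took part in the match.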
DavidRR