4

I'm just starting to learn Perl. I need to parse a JavaScript file, and I came up with the following subroutine to do it:

sub __settings {
    my ($_s) = @_;
    my $f = $config_directory . "/authentic-theme/settings.js";
    if ( -r $f ) {
        for (
            split(
                '\n',
                $s = do {
                    local $/ = undef;
                    open my $fh, "<", $f;
                    <$fh>;
                    }
            )
            )
        {
            if ( index( $_, '//' ) == -1
                && ( my @m = $_ =~ /(?:$_s\s*=\s*(.*))/g ) )
            {
                my $m = join( '\n', @m );
                $m =~ s/[\'\;]//g;
                return $m;
            }
        }
    }
}

I have the following regex that removes ' and ; from the string:

s/[\'\;]//g;

It works, but any ' and ; characters inside the string are removed as well. This is undesirable, and this is where I'm stuck: it gets a bit more complicated for me, and I'm not sure how to change the regex above so that it only does the following:

  1. Remove only the first ' in the string
  2. Remove only the last ' in the string
  3. Remove only the last ; in the string, if it exists

Any help, please?

Ilia Ross
  • The matching expressions are: 1. `^[^']*(')` 2. `(')[^']*$` 3. `;$`. I'm not familiar enough with Perl syntax to help further, though. I imagine that for #1 and #2 this will cause the whole match to be removed, so you may need something that removes only matched group 1; for #3 it will work. – Mystic Odin May 11 '15 at 09:27
  • Single-character identifiers make your code much more difficult to follow – Borodin May 11 '15 at 10:00
  • That's the annoying Sublime Text formatter for Perl. Is there another one you could recommend, please? – Ilia Ross May 11 '15 at 10:01

4 Answers

3

You can use the following to match:

^'|';?$|;$

And replace each match with '' (the empty string).
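
In Perl that could look something like the following. This is only a minimal sketch, assuming the value is already in a scalar; the helper name `strip_wrapping` and the sample strings are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: removes a leading ', a trailing ' and a trailing ;
# while leaving any quotes or semicolons inside the value alone.
sub strip_wrapping {
    my ($value) = @_;
    $value =~ s/^'|';?$|;$//g;
    return $value;
}

print strip_wrapping(q{'This is the test's string';}), "\n";
print strip_wrapping(q{true;}), "\n";
```

Because every alternative is tied to `^` or `$`, nothing in the middle of the string can match.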

See DEMO

karthik manchala
2

Remove only first ' in string

Remove only last ' in string

^[^']*\K'|'(?=[^']*$)

Try this. See demo.

https://regex101.com/r/oF9hR9/8

Remove only last ; in string if exists

;(?=[^;]*$)

Try this. See demo.

https://regex101.com/r/oF9hR9/9

All three in one

^[^']*\K'|'(?=[^']*$)|;(?=[^;]*$)

See Here
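
As a rough Perl sketch of the combined pattern (the helper name `strip_first_last` and the sample input are assumptions for illustration, and `\K` needs Perl 5.10 or later):

```perl
use strict;
use warnings;

# Hypothetical helper: deletes the first ', the last ' and the last ;
# wherever they occur. \K requires Perl 5.10 or later.
sub strip_first_last {
    my ($value) = @_;
    $value =~ s/^[^']*\K'|'(?=[^']*$)|;(?=[^;]*$)//g;
    return $value;
}

print strip_first_last(q{'It's here';}), "\n";
```

Unlike a pattern anchored only with `^` and `$`, this one finds the first and last quote even when they are not at the very ends of the string.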

vks
  • And what if I later want to replace `'` inside the string with `\'`, unless it is already `\'`? This one doesn't work for me: `s/'/\'/g`. – Ilia Ross May 11 '15 at 10:02
  • No. This one works `s/\'/\\'/g` but it replaces those at the beginning and at the end. – Ilia Ross May 11 '15 at 10:03
  • `a = 'This is the test's string';`. Now on the output I need the final regex that would replace all unescaped `'` with `\'` but not at the beginning or end.. – Ilia Ross May 11 '15 at 10:05
  • This is totally awesome! :) Thanks, pal! – Ilia Ross May 11 '15 at 10:09
  • One remark: already escaped quotes will also get escaped: https://regex101.com/r/yW3oJ9/1. – Wiktor Stribiżew May 11 '15 at 10:13
  • @stribizhev cool!! ..but they are pre-processed already and there should be no such cases but better safe than sorry! :) – Ilia Ross May 11 '15 at 10:29
  • @vks awesome fix, really! :D – Ilia Ross May 11 '15 at 10:36
  • Do you know how to make this one `s/^[^']*'(*SKIP)(*F)|'[^']*$(*SKIP)(*F)|(?<!\\)'/\\'/gim` work on older versions of Perl, like 5.8? It fails on version < 5.10. I think `(SKIP)` part or `(*F)` makes it fail. – Ilia Ross May 25 '15 at 17:40
  • Returned error in Perl 5.8.8 is `Quantifier follows nothing in regex; marked by <-- HERE in m/^[^']'( <-- HERE SKIP)(F)|'[^']$(SKIP)(F)|(?<!\)'/` – Ilia Ross May 25 '15 at 17:42
  • Could you please try, cause I'm not fluent with Regex on this level? There are just few groups. – Ilia Ross May 25 '15 at 17:46
  • @IliaRostovtsev i would suggest you put up a new question...as this is complex and would need advanced features of perl :) – vks May 25 '15 at 17:53
  • ok! php :) It was Perl really.. But it's irrelevant I think.. I will do, in case I don't figure this out! Thanks, pal! – Ilia Ross May 25 '15 at 17:55
  • @IliaRostovtsev ahh sry....... actually it would require conditional replace....m a python guy...so i know in python...some perl guy can definitly do it.......do put up a question in case you dont find a solution :) – vks May 25 '15 at 17:56
2

You can use this code:

#!/usr/bin/perl
use strict;
use warnings;

my $str = "'string; 'inside' another;";
$str =~ s/^'|'?;?$//g;    # strip a leading ', and a trailing ' and/or ;
print "$str\n";

IDEONE demo

The main idea is to use anchors: ^ matches at the beginning of the string and $ at the end. ;? matches the ";" at the end only if it is present (the ? quantifier makes the pattern preceding it optional), and '? does the same for the closing quote.
EDIT: Also, ; will get removed even if there is no preceding '.

Wiktor Stribiżew
  • Thanks, I knew about those anchors. I just wasn't sure how to combine it all in one. – Ilia Ross May 11 '15 at 09:30
  • Ilya, you can just use the [alternation operator](http://www.regular-expressions.info/alternation.html) with 2 parts. Also, look-arounds (as used in another answer) are resource-consuming. My suggestion does not use look-arounds. – Wiktor Stribiżew May 11 '15 at 09:30
  • Well, I tested it and it seems that last `;` is not removed properly. Спасибо! – Ilia Ross May 11 '15 at 09:33
  • I've updated the code. Now, it will remove the `;` even if there is no `'`. – Wiktor Stribiżew May 11 '15 at 09:35
  • Yes, true. Because _Boolean_ values in JavaScript are now wrapped in `'`. That is cool! It seems to work the same way as _vks_'s example: `^[^']*\K'|'(?=[^']*$)|;(?=[^;]*$)` – Ilia Ross May 11 '15 at 09:40
  • Yes, it is true. There is one thing about `\K`: if you plan to migrate to another engine in future, it might stop working. My suggestion is more universal. See [\K regex support](http://stackoverflow.com/questions/13542950/support-of-k-in-regex) for more details. – Wiktor Stribiżew May 11 '15 at 09:42
  • `\K` thing seems a bit difficult for me to understand at the moment! _vks_'s answer seems more universal though! I just need to work more on regex in general! Thank you again! – Ilia Ross May 11 '15 at 09:45
  • @karthikmanchala: Yeah, I tested, a tiny bit (like ~10 steps). – Wiktor Stribiżew May 11 '15 at 09:47
  • @stribizhev .. a tiny bit.. yes.. but I don't think we can specify a number of steps here.. because the steps are proportional to the length of the string.. take a big string.. and the tiny bit will be ~100 steps.. :) – karthik manchala May 11 '15 at 09:50
  • @karthikmanchala: Anyway, it seems to me vks' solution is better than ours :) – Wiktor Stribiżew May 11 '15 at 09:56
  • @stribizhev .. no ours is better.. check it again.. :D – karthik manchala May 11 '15 at 10:01
  • I agree that `^[^']*\K'|'(?=[^']*$)|;(?=[^;]*$)` will not always remove *final* `'`: https://regex101.com/r/fU9qK4/1 – Wiktor Stribiżew May 11 '15 at 10:11
1

I suggest that your original code should look more like this. It is much more idiomatic Perl and, I think, more straightforward to follow.

sub __settings {
    my ($_s) = @_;
    my $file = "$config_directory/authentic-theme/settings.js";
    return unless -r $file;

    open my $fh, '<', $file or die qq{Unable to open "$file" for input: $!};
    my @file = <$fh>;
    chomp @file;

    for ( @file ) {
        next if m{//};
        if ( my @matches = $_ =~ /(?:$_s\s*=\s*(.*))/g ) {
            my $matches = join "\n", @matches;
            $matches =~ tr/';//d;
            return $matches;
        }
    }
}
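
Note that `tr/';//d`, like the original `s/[\'\;]//g`, still deletes every quote and semicolon. If the inner ones should survive, an anchored substitution such as the one from another answer here could be swapped in. The following is only a sketch under assumptions: the sub name `setting_from_lines`, the `settings_title` key and the sample lines are hypothetical, it takes the lines as arguments instead of reading the file, and the `\Q...\E` quoting of the name is an extra precaution that is not in the code above.

```perl
use strict;
use warnings;

# Hypothetical variant: takes the setting name and the file's lines directly.
sub setting_from_lines {
    my ($name, @lines) = @_;
    for my $line (@lines) {
        next if $line =~ m{//};                 # skip lines containing a comment
        if ( $line =~ /\Q$name\E\s*=\s*(.*)/ ) {
            my $value = $1;
            $value =~ s/^'|'?;?$//g;            # anchored, so inner quotes survive
            return $value;
        }
    }
    return;
}

# Made-up sample data for illustration only.
my @lines = (
    q{// settings_title = 'ignored';},
    q{settings_title = 'Ilia's theme';},
);
print setting_from_lines('settings_title', @lines), "\n";
```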
Borodin
  • Thanks and yes, I guess, but at the moment I'm doing Perl with my previous knowledge of other programming languages - interesting thing, that it works! :) I find your formatting easier to read! How can I format Perl in Sublime Text 3 like this? – Ilia Ross May 11 '15 at 10:14
  • @IliaRostovtsev: The main advantages are the meaningful variable names, the reduced nesting of blocks, and the additional whitespace. But the indentation can be achieved with `perltidy`, which should have been included as part of your Perl installation – Borodin May 11 '15 at 10:16