I want to extract part of a string: the text between the last slash and the end of .partX.rar or .rar, with that extension included.

First I tried to write one pattern for the end of the file name and another for its beginning. When I merged the two patterns, I got empty results.

String:

http://hosting.xyz/1234/15-game.part4.rar.html

http://hosting.xyz/1234/16-game.rar.html

Regex:

Begin: (([^/]*)$) - start from the last /

End: (.*(?=.part[0-9]+.rar|.rar)) - stop before .partX.rar or .rar

As you can see, if I merge these patterns I don't get any result. What is more, the "End" pattern only selects up to .partX instead of up to .partX.rar.

All I want is: 15-game.part4.rar and 16-game.rar


What I tried:

(([^/]*)$)(.*(?=.part[0-9]+.rar|.rar))

(([^/]*)$)

(.*(?=.part[0-9]+.rar|.rar))

I also tried

/[a-zA-Z0-9]+

but I do not know how to select symbols. This could be the easiest way, but it only selects letters and numbers, not - or _. If only I could select symbols...
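
For reference, the behaviour described above can be reproduced with a minimal sketch. Python's re module is used here only as an assumption, since the question does not name a language. The patterns are copied from above, and the last line shows a character class extended with symbols, which is one possible way to go beyond letters and digits.

```python
import re

url = "http://hosting.xyz/1234/15-game.part4.rar.html"

# Merged pattern: the first group is anchored with $, so it already ends at
# the end of the string and the lookahead has nothing left to find.
merged = r"(([^/]*)$)(.*(?=.part[0-9]+.rar|.rar))"
print(re.search(merged, url))  # None

# "End" alone: a lookahead only checks for ".rar" without consuming it,
# so the match stops right before it.
end = r"(.*(?=.part[0-9]+.rar|.rar))"
print(re.search(end, url).group(1))  # http://hosting.xyz/1234/15-game.part4

# A character class can list symbols too: \w covers letters, digits and _,
# while . and - are added as literal characters.
print(re.findall(r"/[\w.-]+", url))
```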

anubhava
deadfish
  • He posted (at least) something that he tried. – Sergio Tulentsev Jan 04 '12 at 19:23
  • As I said, I tried `(([^/]*)$)(.*(?=.part[0-9]+.rar|.rar))`, and I also tried `(([^/]*)$)` and `(.*(?=.part[0-9]+.rar|.rar))`. I also tried `/[a-zA-Z0-9]+`, but I do not know how to select symbols. This could be the easiest way, but it only selects letters and numbers, not `-`. Do you know how to select up to the `.` (dot)? – deadfish Jan 04 '12 at 19:24
  • What language are you working within? Java/Javascript/C#? – ΩmegaMan Jan 04 '12 at 19:38

4 Answers

Nothing could be simpler! :-)

Use this:

new Regex(@"^.*/(.*)\.html$")

You'll find your file name in the first captured group. (I don't have a C# compiler at hand, so I can't give you a working sample, but you have a working regex now! :-) )

See a demo here: http://rubular.com/r/UxFNtJenyF
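
As a rough illustration of the capture-group idea, here is a minimal sketch in Python (an assumption, since no language was given; the function name file_name is hypothetical). The greedy ^.*/ runs through the last slash, so the group holds only the file name.

```python
import re

# Greedy ^.*/ consumes everything through the last slash; the group then
# captures the file name, and \.html$ pins the match to the trailing ".html".
PATTERN = re.compile(r"^.*/(.*)\.html$")

def file_name(url):
    """Return the text between the last '/' and a trailing '.html', or None."""
    match = PATTERN.match(url)
    return match.group(1) if match else None

for url in ("http://hosting.xyz/1234/15-game.part4.rar.html",
            "http://hosting.xyz/1234/16-game.rar.html"):
    print(file_name(url))
# prints:
# 15-game.part4.rar
# 16-game.rar
```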

Sergio Tulentsev

You don't really need a regex for this: you can simply split the URL on / and then grab the part of the file name that you need. Since you didn't mention a language, here's an implementation in Perl:

use strict;
use warnings;

my $str1="http://hosting.xyz/1234/15-game.part4.rar.html";

my $str2="http://hosting.xyz/1234/16-game.rar.html";

my $file1=(split(/\//,$str1))[-1]; #last element of the resulting array from splitting on slash

my $file2=(split(/\//,$str2))[-1];

foreach($file1,$file2)
{
  s/\.html$//; #for each file name, if it ends in ".html", get rid of that ending.
  print "$_\n";
}

The output is:

15-game.part4.rar
16-game.rar
I'm not a C# coder, so I can't write full code here, but I think you'll need negative lookahead support, like this:

new Regex(@"/(?!.*/)(.+?)\.html$");

Captured group 1 will hold your string, i.e. "16-game.rar" or "15-game.part4.rar".
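
To show the lookahead in action, here is a minimal sketch in Python (chosen only for illustration; the function name file_name is hypothetical). It has to use a search rather than a match anchored at the start, because the pattern begins at the last slash.

```python
import re

# /(?!.*/) matches a slash only if no other slash follows it, i.e. the last one.
# (.+?) is lazy, but \.html$ forces it to extend up to the final ".html".
PATTERN = re.compile(r"/(?!.*/)(.+?)\.html$")

def file_name(url):
    """Return the file name after the last '/', minus a trailing '.html'."""
    match = PATTERN.search(url)
    return match.group(1) if match else None

print(file_name("http://hosting.xyz/1234/15-game.part4.rar.html"))  # 15-game.part4.rar
print(file_name("http://hosting.xyz/1234/16-game.rar.html"))        # 16-game.rar
```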

anubhava
  • Could I ask you to explain this regex to me? I would appreciate it. Thanks in advance! – deadfish Jan 05 '12 at 14:59
  • Sure, I will try. This regex has a negative lookahead, `/(?!.*/)`, which means: match a `/` that is **NOT followed** by some text and another `/`. This makes sure that we **match only the last /**. The rest is straightforward: `(.+?)\.html` matches text that has `.html` at the end. – anubhava Jan 05 '12 at 15:05
Use two regexes:

  • first substitute .*/ with nothing (it is greedy, so it removes everything up to and including the last /);
  • then substitute \.html$ with nothing.

Job done!
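
The two substitutions above can be sketched as follows. This is a minimal Python example under the assumption that any language will do; the function name strip_to_file_name is hypothetical, and the second pattern is anchored with $ so that only a trailing .html is removed.

```python
import re

def strip_to_file_name(url):
    """Apply the two substitutions in order and return what is left."""
    # Step 1: greedy .*/ removes everything up to and including the last slash.
    name = re.sub(r".*/", "", url)
    # Step 2: drop a trailing ".html" if there is one.
    return re.sub(r"\.html$", "", name)

print(strip_to_file_name("http://hosting.xyz/1234/15-game.part4.rar.html"))  # 15-game.part4.rar
print(strip_to_file_name("http://hosting.xyz/1234/16-game.rar.html"))        # 16-game.rar
```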

fge