0

I have a list of URLs in a file, url.data, like this:

http://site1.org/info.php
http://site2.com/info/index.php
http://site3.edu/

I load them into a string array with a LINQ query:

string[] asUrlData = File.ReadAllLines("url.data").Where(s => !string.IsNullOrEmpty(s))
                                                  .Distinct()
                                                  .ToArray();

I want to get the left part of each URI in the array, like this:

http://site1.org/
http://site2.com/info/
http://site3.edu/

Is there any way to do this using LINQ?

LeMoussel

2 Answers

2

You can use the Uri class. Use Uri.IsWellFormedUriString to check that each line is a well-formed absolute URI, then strUri.Substring(0, strUri.LastIndexOf('/') + 1) to get the authority plus the path without the file name.

String[] uris = File.ReadLines(path)
            .Where(u => Uri.IsWellFormedUriString(u, UriKind.Absolute))
            .Select(u => { 
                var p = new Uri(u).ToString();
                return p.Substring(0, p.LastIndexOf('/') +1); 
            })
            .Distinct()
            .ToArray();

Console.Write(String.Join(Environment.NewLine, uris));

Edit: Here's a demo: http://ideone.com/UckoV
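
One caveat with the Substring approach is that a '/' inside a query string (for example ?next=/a/b) would shift the cut point. A minimal sketch of a variant that lets Uri do the trimming instead, by resolving the relative reference "." against each URL. The class name UrlDirectories, the method GetDirectories, and the in-memory sample lines are assumptions for illustration, not part of the original answer:

```csharp
using System;
using System.Linq;

public static class UrlDirectories
{
    // Hypothetical helper: returns the directory part of each well-formed absolute URL.
    public static string[] GetDirectories(string[] lines)
    {
        return lines
            // Skip blank lines and anything that is not a well-formed absolute URI.
            .Where(u => Uri.IsWellFormedUriString(u, UriKind.Absolute))
            // Resolving "." against the URL drops the last path segment, query, and fragment.
            .Select(u => new Uri(new Uri(u), ".").AbsoluteUri)
            .Distinct()
            .ToArray();
    }

    public static void Main()
    {
        // Sample lines standing in for the contents of url.data (assumed data).
        string[] lines =
        {
            "http://site1.org/info.php",
            "http://site2.com/info/index.php",
            "http://site3.edu/",
            "",
            "not a url"
        };

        Console.WriteLine(string.Join(Environment.NewLine, GetDirectories(lines)));
    }
}
```

To read from the file instead, pass File.ReadAllLines("url.data") to GetDirectories.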

Tim Schmelter
  • It's OK. Thank you for your help. Subsidiary question: is it better to use `var uris` than `string[] uris`? – LeMoussel Aug 22 '12 at 12:40
  • @PapyRef: I've used var here to be able to change the query easily. Normally it's better to state explicitly what the query returns (for readability). http://stackoverflow.com/questions/41479/use-of-var-keyword-in-c-sharp Remember to accept answers :) – Tim Schmelter Aug 22 '12 at 12:48
  • I am a new user (and French ;) ) What do you mean by _Remember to accept answers_ ? – LeMoussel Aug 22 '12 at 14:13
0

Tim Schmelter posted a good solution, but I came up with another one that uses a regex.

It might be better if you want to control the form of the output URL easily.

string[] urls2 = urls
                .Select(s => Regex.Match(s, @"(http://){0,1}[a-z0-9\-\.]{1,}\.[a-z]{2,5}", RegexOptions.IgnoreCase).ToString())
                .Where(s => !string.IsNullOrEmpty(s))
                .ToArray();

If the regex is a string taken from a config file or similar, you can change it easily without touching the code.

DEMO : http://ideone.com/nRR0m
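
As an illustration of swapping the pattern, here is a minimal sketch with a pattern that also keeps the directory part of the path, so the results have the form asked for in the question. The pattern in DefaultPattern, the class name UrlRegexDirectories, and the method GetDirectories are assumptions for this example, not the pattern used above; in practice the pattern string could come from configuration:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class UrlRegexDirectories
{
    // Assumed pattern: scheme, host, then every path segment that ends with '/'.
    public const string DefaultPattern = @"^https?://[^/?#]+/(?:[^/?#]+/)*";

    // Hypothetical helper: applies the given pattern to each URL and keeps the non-empty matches.
    public static string[] GetDirectories(string[] urls, string pattern)
    {
        return urls
            .Select(s => Regex.Match(s, pattern, RegexOptions.IgnoreCase).Value)
            .Where(s => !string.IsNullOrEmpty(s))
            .Distinct()
            .ToArray();
    }

    public static void Main()
    {
        // Sample data taken from the question.
        string[] urls =
        {
            "http://site1.org/info.php",
            "http://site2.com/info/index.php",
            "http://site3.edu/"
        };

        Console.WriteLine(string.Join(Environment.NewLine, GetDirectories(urls, DefaultPattern)));
    }
}
```

This pattern requires a '/' after the host, so a URL such as http://site3.edu without a trailing slash would be dropped rather than matched.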

P.S. @Tim Schmelter: very nice page for those demos, added to favorites ;)

FixMyJava
T.G