-2

Suppose I have an array of strings as follows:

string[] array = new string[6];

array[0] = "http://www.s8wministries.org/general.php?id=35";
array[1] = "http://www.s8wministries.org/general.php?id=52";
array[2] = "http://www.ecogybiofuels.com/general.php?id=6";
array[3] = "http://www.stjohnsheriff.com/general.php?id=186";
array[4] = "http://www.stjohnsheriff.com/general.php?id=7";
array[5] = "http://www.bickellawfirm.com/general.php?id=1048";

Now I want to keep only one string per domain. For example, I want to keep http://www.s8wministries.org/general.php?id=35, discard any other string that starts with http://www.s8wministries.org, and store the kept strings in another array.

How do I go about this?

My attempt is as follows:

//remove similar string from array storing only one similar in another array

        foreach (var olu in array)
        {

            string findThisString = olu.ToString();
            string firstTen = findThisString.Substring(0, 15); 

            // See if substring is in the table.
            int index1 = Array.IndexOf(array, firstTen);  //substring is not in table

        }
  • I assume you've looked up how to use [substring](http://stackoverflow.com/questions/2902394/how-to-get-the-substring-in-c)? Can you show us what attempt you've made and where you've got stuck? – Krease Oct 05 '14 at 17:07
  • Using Substring is not working. Take a look at my attempt: //remove similar from array foreach (var olu in array) { string findThisString = olu.ToString(); string firstTen = findThisString.Substring(0, 15); // See if string is in the table. int index1 = Array.IndexOf(array, firstTen); } – james base Oct 05 '14 at 17:12
  • Best to add your code to the question rather than a comment - makes it much easier to read :) – Krease Oct 05 '14 at 17:15
  • Is the part that you need to check only the sub-domain/domain? Everything after the top-level domain (.com in your example) should be ignored in the comparison? – Guy Passy Oct 05 '14 at 17:19
  • I am only checking the sub-domain/domain for the purpose of comparison. Once one domain is found, I will store that URL in another array and discard every other URL in the array with the same domain. – james base Oct 05 '14 at 17:24
  • What is the basis for the term "Similar" in your question? – vikas Oct 05 '14 at 17:49

5 Answers

0

Here is how I would approach this

  1. Initialize a hashtable or a dictionary for holding domain names
  2. Loop through each item
  3. Split the string on delimiters such as '.' and '/', then work out the domain from the resulting parts.
  4. Check if the domain name exists in the hashtable. If it does, discard the current entry. If it doesn't exist, insert into the hashtable and also add the current entry to a new list of your selected entries.
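The steps above can be sketched roughly as follows. This is a minimal sketch, not the answerer's exact code: the class and method names (DomainFilter, FirstPerHost) are hypothetical, and it uses Uri.Host in place of manual string splitting, on the assumption that every entry is an absolute URL.

```csharp
using System;
using System.Collections.Generic;

public static class DomainFilter
{
    // Hypothetical helper: keeps the first URL seen for each host, in input order.
    public static List<string> FirstPerHost(IEnumerable<string> urls)
    {
        // Hosts already selected; host names are compared case-insensitively.
        var seenHosts = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var selected = new List<string>();

        foreach (string url in urls)
        {
            // Assumption: entries that are not well-formed absolute URLs are skipped.
            Uri uri;
            if (!Uri.TryCreate(url, UriKind.Absolute, out uri))
            {
                continue;
            }

            // HashSet.Add returns false when the host is already present,
            // so only the first URL for each host is kept.
            if (seenHosts.Add(uri.Host))
            {
                selected.Add(url);
            }
        }

        return selected;
    }
}
```

Calling DomainFilter.FirstPerHost(array) on the six URLs from the question should keep the entries at indexes 0, 2, 3 and 5; .ToArray() on the result gives the second array.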

Another option would be to sort the entries alphabetically. Go through them one at a time. Select an entry with the domain name. Skip all the next entries with the same domain name. Select the next entry when the domain name changes again.
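The sort-and-skip idea might look like the sketch below. The names SortedDomainFilter and FirstPerHostSorted are hypothetical, and the sketch sorts by host (via Uri.Host) rather than by the whole string, which is an assumption made to keep entries of one domain adjacent. Note that the output comes back ordered by host, not in the original order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedDomainFilter
{
    // Hypothetical helper: sorts by host, then keeps one entry each time the host changes.
    public static List<string> FirstPerHostSorted(IEnumerable<string> urls)
    {
        var selected = new List<string>();
        string previousHost = null;

        // OrderBy is a stable sort, so URLs with the same host keep their original relative order.
        foreach (string url in urls.OrderBy(u => new Uri(u).Host, StringComparer.OrdinalIgnoreCase))
        {
            string host = new Uri(url).Host;

            // Select the entry only when the host differs from the previous one.
            if (!string.Equals(host, previousHost, StringComparison.OrdinalIgnoreCase))
            {
                selected.Add(url);
                previousHost = host;
            }
        }

        return selected;
    }
}
```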

ArunGeorge
0

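Let's say the result is to be stored in an array called unique_array and that your current array is called array. A sketch that handles one known domain follows:

// Requires: using System; using System.Collections.Generic;
const string prefix = "http://www.s8wministries.org";
var uniqueList = new List<string>();
bool found = false;
foreach (string url in array)
{
    if (url.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
    {
        if (found) continue;   // a URL for this domain was already kept
        found = true;
    }
    uniqueList.Add(url);       // first match, or a URL from another domain
}
string[] unique_array = uniqueList.ToArray();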
mrk
0

Try this with a list of strings containing the URLs; you can use the Uri class to compare domains:

var strList = new List<string>(array);
for (int i = 0; i < strList.Count; i++)
{
    Uri uriToCompare = new Uri(strList[i]);
    // Walk backwards so that RemoveAt does not shift items still to be checked.
    for (int j = strList.Count - 1; j > i; j--)
    {
        Uri uri = new Uri(strList[j]);
        if (uriToCompare.Host == uri.Host)
        {
            strList.RemoveAt(j);   // same host as an earlier entry: discard
        }
    }
}
Zaheer Ahmed
  • Will the method you suggest distinguish the URL http://www.s8wministries.org/general.php?id=35 from http://www.s8wministries.org/general.php?id=52, storing the former and discarding the latter? – james base Oct 05 '14 at 17:40
  • The Host property will compare `www.s8wministries.org`. Give it a try and read the linked MSDN page. – Zaheer Ahmed Oct 05 '14 at 18:24
0

I would go for slightly more automation by creating a class that implements IEqualityComparer<T> (utilizing the great answer to this question):

public class PropertyComparer<T> : IEqualityComparer<T>
{
    Func<T, T, bool> comparer;

    public PropertyComparer(Func<T, T, bool> comparer)
    {
        this.comparer = comparer;
    }

    public bool Equals(T a, T b)
    {
        return comparer(a, b);
    }

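    public int GetHashCode(T a)
    {
        // Return a constant so that Distinct always falls back to Equals.
        // a.GetHashCode() differs for URLs that share a host, which would hide the duplicates.
        return 0;
    }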
}

Once you have that class - you can use Distinct like this:

var distinctArray = array.Select(s => new Uri(s)).Distinct(new PropertyComparer<Uri>((a, b) => a.Host == b.Host));

That leaves you with a sequence containing one Uri per distinct host. It's an IEnumerable<Uri>, so you may want to call .ToList() or .ToArray() on it, or convert the Uris back to strings. But I think this method makes for much more readable code.
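For comparison, a similar result can be reached without a custom comparer by grouping on the host. This is an alternative sketch, not the approach described in this answer; the names GroupByExample and FirstPerHost are hypothetical, and it assumes every entry is a valid absolute URL (new Uri throws otherwise). It returns the original strings directly.

```csharp
using System;
using System.Linq;

public static class GroupByExample
{
    // Hypothetical helper: groups URLs by host and keeps the first URL of each group.
    public static string[] FirstPerHost(string[] urls)
    {
        return urls
            // Groups come out in order of first appearance of each host.
            .GroupBy(u => new Uri(u).Host, StringComparer.OrdinalIgnoreCase)
            // Elements within a group keep their original order, so First() is the earliest URL.
            .Select(g => g.First())
            .ToArray();
    }
}
```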

Guy Passy
0

Please try the code below:

    string[] array = new string[6];
    array[0] = "http://www.s8wministries.org/general.php?id=35";
    array[1] = "http://www.s8wministries.org/general.php?id=52";
    array[2] = "http://www.ecogybiofuels.com/general.php?id=6";
    array[3] = "http://www.stjohnsheriff.com/general.php?id=186";
    array[4] = "http://www.stjohnsheriff.com/general.php?id=7";
    array[5] = "http://www.bickellawfirm.com/general.php?id=1048";
    var regex = @"http://www\.[\w-]+\.[\w]+";  // dots escaped so they match a literal '.'
    var distList = new List<string>();
    var finalList = new List<string>();
    foreach (string str in array)
    {
        Match match = Regex.Match(str, regex, RegexOptions.IgnoreCase);
        if (match.Success)
        {
            var uniqueUrl = match.Groups[0].Value;
            if (!distList.Contains(uniqueUrl))
            {
                distList.Add(uniqueUrl);
                finalList.Add(str);
            }
        }
    }

Here finalList contains the required list of complete URLs, and distList holds the matched domain prefixes.

vikas
  • Your method works, but I need the complete URL, as in http://www.s8wministries.org/general.php?id=35, instead of www.s8wministries.org. Could you please help me modify the code to show the complete URL? Thanks in advance. – james base Oct 06 '14 at 02:22