
I'm French, sorry for my bad English. `search` is an input field.

IEnumerator GetHtml()
{
    WWW www = new WWW("https://www.random-site.com/" + search.text); // Get the html of the site
    yield return www; // Wait for the end of the operation
    if (www.text.Contains("No search results were found for"))
    {
        Debug.LogError("Aucun résultat pour \"" + search.text + "\".");
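        StopCoroutine(AllMusic()); // Note: this builds a new enumerator, so it does not stop an AllMusic coroutine that is already running
        yield break; // Leave GetHtml here instead of parsing the "no results" page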
    }
    string[] str;
    str = Extract(www.text, "<li class=\"a-class\">", "</li>"); // Extract string between "<li class="a-class">"  and "</li>"
    File.WriteAllLines(@"azaaac.txt", str); // Debug only
}  

string[] Extract(string data, string startString, string endString)
{
    try
    {
        Regex regex = new Regex("(?<=" + startString + ")(.*?)(?=" + endString + ")"); // Regex pattern: lazily capture the text between the two delimiters
        MatchCollection matches = regex.Matches(data); // Apply the pattern

        List<string> res = new List<string>();
        foreach(Match m in matches) // Convert a match collection to a list of strings
            res.Add(m.ToString());
        return res.ToArray();
    } catch (Exception e)
    {
        Debug.LogError("Erreur lors de l'extraction : " + e); // A sentence in french ^^
        return new string[] { };
    }
}

The Extract method doesn't work on the HTML code of a real page, although it works on a simple string. For example, if

data="<li class=\"a-class\">aaaa</li> ab <li class=\"a-class\">aaaa</li>" 

Extract returns { "aaaa", "aaaa" }, as expected. So the problem seems to come from the HTML code itself. I tried applying HttpUtility.HtmlDecode, but it doesn't help.
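
One plausible cause, which is an assumption and is not confirmed anywhere in this thread: in .NET, `.` does not match a newline by default, so `(.*?)` finds nothing when the `<li>` content spans several lines, as it often does in a real page. The delimiters are also inserted into the pattern unescaped. The sketch below is a hypothetical variant of `Extract` (the class name `ExtractSketch` and the sample HTML are made up for illustration) that adds `RegexOptions.Singleline` and `Regex.Escape`. Unlike the original, it also trims whitespace around each result.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ExtractSketch
{
    // Returns every substring found between startString and endString.
    public static string[] Extract(string data, string startString, string endString)
    {
        // Regex.Escape keeps characters such as '.', '(' or '?' in the delimiters literal.
        string pattern = "(?<=" + Regex.Escape(startString) + ")(.*?)(?=" + Regex.Escape(endString) + ")";

        // RegexOptions.Singleline makes '.' match '\n' as well, so multi-line items are found.
        Regex regex = new Regex(pattern, RegexOptions.Singleline);

        // Trim() is an addition to the original behavior: it drops surrounding whitespace and line breaks.
        return regex.Matches(data).Cast<Match>().Select(m => m.Value.Trim()).ToArray();
    }

    public static void Main()
    {
        // Made-up sample: the first item spans several lines, the second does not.
        string html = "<ul>\n<li class=\"a-class\">\n  first\n</li>\n<li class=\"a-class\">second</li>\n</ul>";

        foreach (string item in Extract(html, "<li class=\"a-class\">", "</li>"))
            Console.WriteLine(item);
    }
}
```

With the default options, only the single-line item in this sample would be matched. If the real page still gives no matches, the markup probably differs from the start string (extra attributes, single quotes, different spacing), which a regex cannot handle reliably.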

Aiixu
  • Isn't that exactly what it is supposed to return? What should the outcome be given a certain input? Please give a couple of examples. – rene Jan 02 '18 at 13:01
  • You can check this solution; it may help you: [https://stackoverflow.com/questions/5066517/regex-starts-with-and-ending-with](https://stackoverflow.com/questions/5066517/regex-starts-with-and-ending-with) – mohammed besher Jan 02 '18 at 13:09
  • Rene > Yes, but when it's the HTML code of a page, the MatchCollection returns null, and there isn't any error – Aiixu Jan 02 '18 at 13:23
  • Mohammed Besher > It doesn't work :/ – Aiixu Jan 02 '18 at 13:23
  • Don't use regex with HTML. HTML is not a regular language. Use HtmlAgilityPack instead to parse the HTML. –  Jan 02 '18 at 14:06
  • Amy, could you explain to me how it works? – Aiixu Jan 03 '18 at 13:02
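
As a rough illustration of the HtmlAgilityPack suggestion in the comments, here is a minimal sketch. It assumes the HtmlAgilityPack library (a third-party package, not part of .NET or Unity) has been added to the project. The class name `HtmlExtractSketch` and the method `ExtractListItems` are hypothetical. The idea is to parse the page into a document tree and select the `<li>` nodes with an XPath query instead of matching the raw text.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // Third-party package; assumed to be installed in the project

public static class HtmlExtractSketch
{
    // Returns the trimmed inner text of every <li> whose class attribute equals cssClass.
    public static List<string> ExtractListItems(string html, string cssClass)
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html); // Parse the HTML string into a node tree

        // XPath query: every <li> element whose class attribute is exactly cssClass.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//li[@class='" + cssClass + "']");

        List<string> result = new List<string>();
        if (nodes == null) // SelectNodes returns null, not an empty collection, when nothing matches
            return result;

        foreach (HtmlNode node in nodes)
            result.Add(node.InnerText.Trim()); // InnerText drops nested tags and keeps only the text

        return result;
    }
}
```

In the question's coroutine, this would be called as `ExtractListItems(www.text, "a-class")` in place of `Extract(...)`. Note that the query only matches when the class attribute is exactly `a-class`; an element with several classes would need a different XPath expression.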

0 Answers