I have a string like this:

<div class="fsxl fwb">Myname<br />

How can I get the string Myname?
Here is my code:

public string name(string link)
    {
        WebClient client = new WebClient();
        string htmlCode = client.DownloadString(link);


        var output = htmlCode.Split("<div class="fsxl fwb">","<br />");

        return output.ToString();
    }

The problem is that "<div class="fsxl fwb">" is parsed as two strings, "<div class=" and ">", with fsxl fwb left between them. How can I fix this?

dandan78

5 Answers

You can solve this by parsing the HTML, which is often the best option.

A quick solution would be to use regex to get the string out. This one will do:

<div class="fsxl fwb">(.*?)<br \/>

It will capture the input between the div and the first following <br />.

This will be the C# code to get the answer:

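// The input is the raw HTML; the pattern wraps the target in (.*) on both
// sides so that the whole input is replaced by capture group 2.
string s = Regex.Replace
           ( "<div class=\"fsxl fwb\">Myname<br />"
           , "(.*)<div class=\"fsxl fwb\">(.*?)<br \\/>(.*)"
           , "$2"
           );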
Console.WriteLine(s);
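The lazy quantifier in the pattern above matters once the input contains more than one <br />. As a minimal sketch (the RegexDemo class, the ExtractName helper, and the two-break sample string are illustrative assumptions, not part of the original answer), the following compares a lazy and a greedy capture using Regex.Match:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexDemo
{
    // Hypothetical helper: captures the text between the div tag and a <br />.
    // The quantifier argument is ".*?" (lazy) or ".*" (greedy).
    public static string ExtractName(string html, string quantifier)
    {
        string pattern = "<div class=\"fsxl fwb\">(" + quantifier + ")<br />";
        Match match = Regex.Match(html, pattern);

        // Return an empty string when the pattern does not match at all.
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // Assumed sample input with two <br /> tags on one line.
        string html = "<div class=\"fsxl fwb\">Myname<br />Other<br />";

        Console.WriteLine(ExtractName(html, ".*?")); // stops at the first <br />
        Console.WriteLine(ExtractName(html, ".*"));  // runs on to the last <br />
    }
}
```

With the lazy form the capture is Myname; with the greedy form it is Myname<br />Other, which is usually not what is wanted when scraping a full page.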
Patrick Hofman

Here is a quick fix to your code:

var output = htmlCode.Split(
    new [] { "<div class=\"fsxl fwb\">", "<br />"},
    StringSplitOptions.RemoveEmptyEntries);

return output[0];

It escapes the quotes correctly and uses a valid overload of the Split method.
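One caveat, shown here as a rough sketch rather than a drop-in replacement: if the downloaded page has any text before the div, output[0] will be that leading text instead of the name. The version below (SplitDemo and ExtractName are hypothetical names, and the sample page is made up) splits in two steps so that it always takes what follows the opening tag. It also shows the two ways of writing a double quote inside a C# string literal.

```csharp
using System;

public static class SplitDemo
{
    // Hypothetical helper: returns the text between the div tag and the
    // next <br />, or an empty string if the div tag is not found.
    public static string ExtractName(string htmlCode)
    {
        // Regular literal: escape each inner quote with a backslash.
        string openTag = "<div class=\"fsxl fwb\">";
        // Verbatim literal: double each inner quote. Same resulting string.
        string sameTag = @"<div class=""fsxl fwb"">";
        if (openTag != sameTag) throw new InvalidOperationException();

        // Step 1: element [1] is everything after the first opening tag.
        string[] afterTag = htmlCode.Split(new[] { openTag }, StringSplitOptions.None);
        if (afterTag.Length < 2) return string.Empty;

        // Step 2: element [0] is everything before the next <br />.
        string[] beforeBreak = afterTag[1].Split(new[] { "<br />" }, StringSplitOptions.None);
        return beforeBreak[0];
    }

    public static void Main()
    {
        // Assumed sample page with markup before and after the target.
        string page = "<html><body><div class=\"fsxl fwb\">Myname<br />more</body></html>";
        Console.WriteLine(ExtractName(page));
    }
}
```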

Yacoub Massad
var a = @"<div class='fsxl fwb'>Myname<br />";
var b = Regex.Match(a, "(?<=>)(.*)(?=<)");
Console.WriteLine(b.Value);

Code based on: C# Get string between 2 HTML-tags

Tony Silva
  • :o http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Oliver Feb 25 '16 at 12:48
  • You changed the actual HTML to make it compile in C#? How do you think that would ever match? – Patrick Hofman Feb 25 '16 at 12:52
  • If you identified the question as a duplicate you shouldn't duplicate an answer. I can understand that due to your low rep you cannot act by commenting or casting a close-as-duplicate vote. – Filburt Feb 25 '16 at 12:55

Using regular expressions:

public string name(string link)
{
    WebClient client = new WebClient();
    string htmlCode = client.DownloadString(link);

    // Lazy (.*?) stops at the first <br /> after the div
    Regex regex = new Regex("<div class=\"fsxl fwb\">(.*?)<br />");
    Match match = regex.Match(htmlCode);

    // Groups[1] holds the captured text; it is empty if nothing matched
    return match.Groups[1].Value;
}
Boomit

If you want to avoid regex, you can use this extension method to grab the text between two other strings:

public static string ExtractBetween(this string str, string startTag, string endTag, bool inclusive)
    {
        string rtn = null;
        var s = str.IndexOf(startTag);
        if (s >= 0)
        {
            if (!inclusive)
            {
                s += startTag.Length;
            }

            var e = str.IndexOf(endTag, s);
            if (e > s)
            {
                if (inclusive)
                {
                    // Extend the end index to include the end tag itself
                    e += endTag.Length;
                }
                rtn = str.Substring(s, e - s);
            }
        }
        return rtn;
    }

Example usage (note that the double quotes inside the string literals must be escaped):

var s = "<div class=\"fsxl fwb\">Myname<br />";
var r = s.ExtractBetween("<div class=\"fsxl fwb\">", "<br />", false);
Console.WriteLine(r);
Jay