-2

I have a small problem getting items from webbrowser.document.

The part of the document that I need is this:

primary-text,7.gm2-body-2">**ineedthis.se**</div> <div jstcache="194" 

I need to extract the "ineedthis.se" part, which will be different every time.

My code is this:

if (webBrowser1.ReadyState == WebBrowserReadyState.Complete)
{
    System.IO.StreamReader sr = new System.IO.StreamReader(webBrowser1.DocumentStream.ToString());
    string rssourcecode = sr.ReadToEnd();
    Regex r = new Regex("7.gm2-body-2 > </ div > < div ", RegexOptions.Multiline);
    MatchCollection matches = r.Matches(rssourcecode);
    // Dim r As New System.Text.RegularExpressions.Regex("here need splitersorsomething", RegexOptions.Multiline)
    foreach (Match itemcode in matches)
    {
        listBox1.Items.Add(itemcode.Value.Split(here need splites).GetValue(2));
    }
}

So, can you please help me with the right splitters? Thanks a lot.

  • Welcome to Stack Overflow. Please take the [tour] to learn how Stack Overflow works and read [ask] on how to improve the quality of your question. Then [edit] your question to include the source code you have as a [mcve], which can be compiled and tested by others. Also check the [help/on-topic] to see what questions you can ask. Please see: [Why is “Can someone help me?” not an actual question?](http://meta.stackoverflow.com/q/284236). – Progman Sep 06 '20 at 09:50
  • Also read [this](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) answer. – Luuk Sep 06 '20 at 09:51
  • Did you read the page from Luuk's comment? – Enigmativity Sep 06 '20 at 10:55

2 Answers

-2

Your question was not very clear, but what I understand is that you want to get part of a string (a substring) out of a larger string.

Assuming you have a string value saved in variable:

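string stringValue = "primary-text,7.gm2-body-2\">**ineedthis.se**</div> <div jstcache=\"194\"";

And what you want to extract is: ineedthis.se

So you can try stringValue.Substring(29, 12);. The first argument is the start index and the second is the length, not the end index; for the literal above, "ineedthis.se" starts at index 29 and is 12 characters long.

For reference, see: https://www.c-sharpcorner.com/UploadFile/mahesh/substring-in-C-Sharp/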
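
Since the text changes every time and its position in the page is unknown, fixed offsets will not hold up in practice. A minimal sketch of the "find the index first, then take the substring" idea follows. The names `MarkerExtractor` and `ExtractBetween` are hypothetical, the sample HTML is made up for illustration, and it assumes the `**` in the question is emphasis markup rather than part of the page.

```csharp
using System;

static class MarkerExtractor
{
    // Returns the text between the first occurrence of startMarker and the
    // next occurrence of endMarker, or null when either marker is missing.
    public static string ExtractBetween(string source, string startMarker, string endMarker)
    {
        int markerIndex = source.IndexOf(startMarker, StringComparison.Ordinal);
        if (markerIndex < 0) return null;

        // The wanted text begins right after the marker.
        int textStart = markerIndex + startMarker.Length;
        int textEnd = source.IndexOf(endMarker, textStart, StringComparison.Ordinal);
        if (textEnd < 0) return null;

        return source.Substring(textStart, textEnd - textStart);
    }

    static void Main()
    {
        // Hypothetical page fragment: one unrelated div, then the div of interest.
        string html = "<div class=\"other\">skip me</div> " +
                      "<div class=\"primary-text,7.gm2-body-2\">ineedthis.se</div> <div jstcache=\"194\"";

        // prints "ineedthis.se"
        Console.WriteLine(ExtractBetween(html, "7.gm2-body-2\">", "</div>"));
    }
}
```

Searching for the marker that precedes the text, rather than for the text itself, is what makes this work when the value is different on every page load.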

Sandeep Pandey
  • 1
    It would be better to get the index of ineedthis.se – Kaveesh Sep 06 '20 at 10:17
  • Get the index and then take a substring from it – Sandeep Pandey Sep 06 '20 at 10:19
  • Thanks all for the responses, but this is not a solution. Maybe I didn't explain my issue well. In the webbrowser.document (HTML) I have a lot of code, and in this case we don't know where this string is located: "primary-text,7.gm2-body-2">**ineedthis.se**
    – Replica Team Sep 06 '20 at 10:41
-2

Let's say your string is the following:

var str = "<div primary-text,7.gm2-body-2\">**some random text**</div> <div jstcache=\"194\""; 

The code below is a very rudimentary solution, but it should help you get to your answer:

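var foundStart = false;   // true once an opening "<div" has been seen
var foundEnd = false;     // true once the '>' that closes the opening tag has been seen
var startIndex = -1;
var length = 0;
for (var index = 0; index < str.Length; index++)
{
    if (!foundStart)
    {
        // The bounds check keeps Substring from throwing near the end of the string.
        if (index + 4 <= str.Length && str.Substring(index, 4).Equals("<div"))
        {
            foundStart = true;
            continue;
        }
    }

    if (foundStart && !foundEnd)
    {
        if (str[index].Equals('>'))
        {
            foundEnd = true;
            continue;
        }
    }

    if (foundStart && foundEnd)
    {
        if (startIndex == -1)
            startIndex = index;

        // Count every character of the text, including the first one.
        length++;

        // Stop at the end of the string or just before the next tag.
        if (index + 1 >= str.Length || str[index + 1].Equals('<'))
        {
            foundStart = false;
            foundEnd = false;
            break;
        }
    }
}

// This is your answer
Console.WriteLine(str.Substring(startIndex, length));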
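
The loop above stops at the first `div` it meets, whatever that `div` is. To pick out only the `div` whose opening tag ends with the `7.gm2-body-2` marker, one option is a regex anchored on that marker. This is only a sketch: `DivTextFinder` and `FindTexts` are hypothetical names, the sample HTML is invented, and it assumes the wanted text contains no nested tags.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class DivTextFinder
{
    // Collects the inner text of every div whose opening tag ends with
    // classMarker followed by "> (closing quote of the attribute, then '>').
    public static List<string> FindTexts(string html, string classMarker)
    {
        // Regex.Escape makes the '.' in the marker match literally.
        string pattern = Regex.Escape(classMarker) + "\">(?<text>[^<]*)</div>";

        var results = new List<string>();
        foreach (Match match in Regex.Matches(html, pattern))
        {
            results.Add(match.Groups["text"].Value.Trim());
        }
        return results;
    }

    static void Main()
    {
        // Hypothetical page fragment with one unrelated div and two matching ones.
        string html = "<div class=\"header\">ignore</div>" +
                      "<div class=\"primary-text,7.gm2-body-2\">first.se</div>" +
                      "<div jstcache=\"194\"></div>" +
                      "<div class=\"primary-text,7.gm2-body-2\"> second.se </div>";

        foreach (string text in FindTexts(html, "7.gm2-body-2"))
        {
            Console.WriteLine(text);
        }
    }
}
```

Each returned string could then be added to `listBox1.Items` in place of the `Split` call from the question. Regex matching on HTML is fragile, as the answer linked in the comments points out, so walking `webBrowser1.Document` with `GetElementsByTagName("div")` may be the sturdier route when the markup varies.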
Kaveesh
  • The OP has some huge bit of HTML, probably with many `div` elements. He's specifically looking for this one `div`. You're getting any `div`. – Enigmativity Sep 06 '20 at 11:17
  • @Enigmativity Yes, that's right, thank you. I have many divs in the HTML body but need to take out only that one. – Replica Team Sep 06 '20 at 13:29