
First of all: Sorry for my bad English!

I know the title isn't the best English, but I don't really know how to format this question...
What I'm trying to do is read an HTML source line by line, so that when it sees a given word (like http://) it copies the entire line, and I can then strip the rest and keep only the URL.

This is what I've tried:

using (var source = new StreamReader(TempFile))
{
    string line;
    while ((line = source.ReadLine()) != null)
    {
        if (line.Contains("http://"))
        {
            Console.WriteLine(line);
        }
    }
}

This works perfectly when I read from an external file, but it doesn't work when I want to read from a string or StringBuilder. How do you read those line by line?

puretppc
Yuki Kutsuya

5 Answers


You can use new StringReader(theString) to do that with a string, but I question your overall strategy. That would be better done with a tool like HTML Agility Pack.

For example, here is HTML Agility Pack extracting all hyperlinks:

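using System;
using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(theString);
// SelectNodes returns null when no node matches the XPath expression.
HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        HtmlAttribute att = link.Attributes["href"];
        Console.WriteLine(att.Value);
    }
}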
Marc Gravell

Well, a string is just a string; it has no built-in notion of lines.

You can use something like String.Split to break it apart on line-break characters such as \r or \n.

MSDN: String.Split()

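string words = "This is a list of words, with: a bit of punctuation" +
               "\rand a newline character.";

// Split on every common line terminator, not only '\r'.
// "\r\n" is listed first so a Windows line ending is not split twice.
string[] split = words.Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);

foreach (string s in split)
{
    // Skip empty or whitespace-only lines.
    if (s.Trim() != "")
        Console.WriteLine(s);
}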
Only Bolivian Here

Firstly, you can use a StringReader.
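As a minimal sketch of the StringReader approach, the loop from the question can stay as it is, with only the reader type swapped. The class name LineScanner, the helper LinesContaining, and the sample input are made up for illustration and are not from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineScanner
{
    // Hypothetical helper: returns every line of `text` that contains `marker`.
    public static List<string> LinesContaining(string text, string marker)
    {
        var matches = new List<string>();
        using (var reader = new StringReader(text))
        {
            string line;
            // ReadLine treats "\r\n", "\n" and "\r" as line terminators
            // and returns null once the end of the string is reached.
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Contains(marker))
                {
                    matches.Add(line);
                }
            }
        }
        return matches;
    }

    public static void Main()
    {
        // Sample input (an assumption); a StringBuilder works the same way
        // once it is converted with ToString().
        string html = "<p>intro</p>\r\n<a href=\"http://example.com\">link</a>\nplain text";
        foreach (string line in LinesContaining(html, "http://"))
        {
            Console.WriteLine(line);
        }
    }
}
```

For a StringBuilder, passing myStringBuilder.ToString() as the first argument should be enough.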

Another option is to create a MemoryStream from the string by first converting the string to a byte array, as described in https://stackoverflow.com/a/10380166/396583

vines

I think you can tokenize the input and check each entry for the required content.

string[] info = myStringBuilder.ToString().Split(' ');
foreach (var item in info)
{
    if (item.Contains("http://"))
    {
        // work with it
    }
}
vishakvkt

You can use a memory stream to read from.
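A rough sketch of what that could look like, assuming the text is encoded as UTF-8 (any encoding works as long as the same one is used for writing and reading). The class name MemoryStreamLines and the helper ReadLines are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class MemoryStreamLines
{
    // Hypothetical helper: reads `text` line by line through a MemoryStream.
    public static List<string> ReadLines(string text)
    {
        var lines = new List<string>();
        // Encode the string to bytes so it can back a stream (UTF-8 is an assumption).
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }

    public static void Main()
    {
        // Sample input made up for illustration.
        foreach (string line in ReadLines("first\r\nhttp://example.com\nlast"))
        {
            if (line.Contains("http://"))
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

This route copies the text into a byte array first, so it is probably only worth it when the consuming code requires a Stream; otherwise a StringReader avoids the extra encoding step.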

SargeATM