I found this code online and it mostly works: it returns the div I am searching for, but all of the text inside it is missing. Any idea why? Here is an example of what it outputs:

<div class="marquee"><img src="logo.png" /></div>
<div id="joke">
    <div id="setup" class="exit-left"></div>

    <div id="punchline">
        <div class="question"></div>
        <div id="zing" class="exit-right"></div>
    </div>
</div>
<div id="buttons">
    <input id="tell-me" class="exit-bottom no-select" type="button" value="Tell Me!" />
    <!--<input id="another" class="exit-bottom" type="button" value="Another!" />-->
    <table class="another exit-bottom no-select">
        <tr>
            <td class="another" colspan="3">Another</td>
            <td class="share"><div class="share-img"></div>Share</td>
        </tr>
    </table>
</div>  

The inner text is not shown at all. Here is my code in VS:

var doc = new HtmlAgilityPack.HtmlDocument();
HtmlAgilityPack.HtmlNode.ElementsFlags["br"] = HtmlAgilityPack.HtmlElementFlag.Empty;
doc.OptionWriteEmptyNodes = true;

try
{
    var webRequest = HttpWebRequest.Create("http://dadjokegenerator.com/");
    Stream stream = webRequest.GetResponse().GetResponseStream();
    doc.Load(stream);
    stream.Close();
}
catch (System.UriFormatException)
{
    throw;
}
catch (System.Net.WebException)
{
    throw;
}

// Get the div by id and then read its inner HTML
var divString = doc.GetElementbyId("content").InnerHtml;
await e.Channel.SendMessage("test " + divString);
  • Also, I want to make the "test " clear so nobody asks about it. All I'm using it for right now is testing, because the program I'm sending output to will crash if it tries to send a message with no text. – LogandadLoga Dec 10 '16 at 02:07
  • Can you change the html example to show the div with id='content'? Include some parents, hopefully from the tag – Visual Micro Dec 10 '16 at 02:12
  • That is with the id='content'. – LogandadLoga Dec 10 '16 at 02:17
  • Oh, I see your example is broken. Can you edit it so the code is correct and can be read more easily? Are you creating your HtmlDocument from the result of the WebRequest, or is it an empty document? Please be clearer if possible. – Visual Micro Dec 10 '16 at 02:22
  • Also, you can explore the site yourself and look at its divs. I'm trying to get the joke and put it in a string. The site is http://dadjokegenerator.com/ – LogandadLoga Dec 10 '16 at 02:28
  • I changed the answer. Sorry, originally I thought your code was just a snippet/example, but you just need to make sure you are getting the HTML page properly. – Visual Micro Dec 10 '16 at 02:58

2 Answers

Although your code correctly downloads the content of the page http://dadjokegenerator.com/, InnerHtml is empty because the page does not actually contain the joke you are looking for (you can see this by viewing the page source in your web browser, e.g. by pressing CTRL+U in Firefox). The joke is added to the page later by JavaScript. If you look at the source of that JavaScript at http://dadjokegenerator.com/js/main.js, you can see that individual jokes are downloaded from the URL http://dadjokegenerator.com/api/api.php?a=j&lt=r&vj=0

Here is a minimal sample that downloads a joke from this URL. I omitted all error checks for simplicity, and I used the free Json.NET library for JSON deserialization:

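using System.IO;
using System.Linq;
using System.Net;
using Newtonsoft.Json;

// Holds one joke from the API response
public class Joke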
{
    public int Id;
    public string Setup;
    public string Punchline;

    public override string ToString()
    {
        return Setup + " " + Punchline;
    }
}

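// Downloads the jokes from the API and returns the first one, or null if the array is empty.
// Place this method inside any class.
public static Joke GetJoke()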
{
    var request = HttpWebRequest.Create("http://dadjokegenerator.com/api/api.php?a=j&lt=r&vj=0");
    using (var response = request.GetResponse())
    {
        using (var stream = response.GetResponseStream())
        {
            using (var reader = new StreamReader(stream))
            {
                var jokeString = reader.ReadToEnd();

                Joke[] jokes = JsonConvert.DeserializeObject<Joke[]>(jokeString);
                return jokes.FirstOrDefault();
            }
        }
    }
}

Example usage:

GetJoke().ToString();
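
Since the sample above omits error checks, here is a rough sketch of how the parsing step could be separated from the download and made tolerant of bad input. It uses System.Text.Json (assuming .NET Core 3.0 or later) rather than Json.NET, so it needs no extra package. `ParsedJoke`, `JokeParser`, `ParseFirstJoke` and `SampleJson` are names made up for this illustration, and the sample payload is invented, so the real API's field names and shape may differ.

```csharp
using System;
using System.Text.Json;

// Hypothetical DTO mirroring the Joke class above, with properties so that
// System.Text.Json can populate it.
public class ParsedJoke
{
    public int Id { get; set; }
    public string Setup { get; set; }
    public string Punchline { get; set; }

    public override string ToString()
    {
        return Setup + " " + Punchline;
    }
}

public static class JokeParser
{
    // Invented payload for illustration only; the real API response may look different.
    public const string SampleJson =
        "[{\"id\": 1, \"setup\": \"Sample setup?\", \"punchline\": \"Sample punchline.\"}]";

    // Returns the first joke in the array, or null when the payload is empty,
    // is not valid JSON, or does not match the expected array-of-objects shape.
    public static ParsedJoke ParseFirstJoke(string json)
    {
        if (string.IsNullOrWhiteSpace(json))
        {
            return null;
        }

        // Match "setup" and "Setup" alike, since the real casing is an assumption.
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        try
        {
            ParsedJoke[] jokes = JsonSerializer.Deserialize<ParsedJoke[]>(json, options);
            return (jokes != null && jokes.Length > 0) ? jokes[0] : null;
        }
        catch (JsonException)
        {
            return null;
        }
    }
}
```

For example, `JokeParser.ParseFirstJoke(JokeParser.SampleJson)?.ToString()` gives the setup and punchline joined by a space. Because the method returns null instead of throwing on an empty or malformed payload, the caller can decide what to send when no joke is available.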
Ňuf

These links show how to read a web page.

Html Agility Pack. Load and scrape webpage

Get HTML code from website in C#

Visual Micro
  • That's not what I need; that's a simpler version. Everything is working. The only problem is that when I search for any of the HTML elements, the text in the divs won't show at all. – LogandadLoga Dec 10 '16 at 02:22