I am building an app that needs to fetch some content from an iframe. The markup inside the iframe looks like this:

<div class="abc">

<a class="abc" href="example.com" data-ctorig="example2.com" > **** </a>

</div>

I want to extract the value "example2.com" (the data-ctorig attribute) for my app. I am using ASP.NET (C#) and HtmlAgilityPack. How can I do this?

I opened the iframe's "src" URL directly, but again I found nothing. Here is the link: click here. On that page, I want to parse the following links: 1. How to Find True North Without a Compass 2. How to Find True North Without a Compass.

Sagar Kadam

2 Answers

1

Point HtmlAgilityPack at the iframe URL, not the hosting page.

Clarified...

If I understand you correctly, you can fetch the HTML of the IFrame using a WebClient and HtmlAgilityPack.

First you need to use a WebClient to fetch the HTML of the host page. You'll then want to use HtmlAgilityPack to parse the host page HTML and extract the IFrame URL. Next you'll want to use another WebClient to get the HTML from the IFrame URL, and again, use HtmlAgilityPack to parse the response, which should give you what you're after.
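The steps above can be sketched roughly as follows. This is only a minimal sketch and has not been run against your page. The class and method names (IframeScraper, FindIframeSrc, FindCtOrig, Scrape) are made up for illustration, and it assumes that the iframe is present in the raw HTML of the host page and that the link carries a data-ctorig attribute as in the snippet from the question.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

// Hypothetical helper class; the names are illustrative, not part of HtmlAgilityPack.
public static class IframeScraper
{
    // Returns the src of the first <iframe> that has one, or null if none is found.
    public static string FindIframeSrc(string hostHtml)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(hostHtml);
        HtmlNode iframe = doc.DocumentNode.SelectSingleNode("//iframe[@src]");
        if (iframe == null) return null;
        // Decode entities such as &amp; so the value can be used as a URL.
        return HtmlEntity.DeEntitize(iframe.GetAttributeValue("src", ""));
    }

    // Returns the data-ctorig value of the first <a> that has one, or null.
    public static string FindCtOrig(string iframeHtml)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(iframeHtml);
        HtmlNode link = doc.DocumentNode.SelectSingleNode("//a[@data-ctorig]");
        if (link == null) return null;
        return link.GetAttributeValue("data-ctorig", "");
    }

    // Downloads the host page, follows the iframe src, and extracts data-ctorig.
    public static string Scrape(string hostUrl)
    {
        using (var client = new WebClient())
        {
            string hostHtml = client.DownloadString(hostUrl);
            string src = FindIframeSrc(hostHtml);
            if (src == null) return null;

            // The src may be relative, so resolve it against the host page URL.
            var iframeUrl = new Uri(new Uri(hostUrl), src);
            string iframeHtml = client.DownloadString(iframeUrl);
            return FindCtOrig(iframeHtml);
        }
    }
}
```

Scrape returns null when the downloaded host HTML contains no iframe with a src attribute, so a null result is a hint that the markup you see in the browser is not what the server actually sends.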

Of course, your question is very vague, so I'm not entirely sure this is what you're after. Either way, the following links should help you.

HtmlAgilityPack Tutorial

Download HTML Using WebClient

ctorx
  • Use a WebRequest to get the HTML from the host page, then parse it with HtmlAgilityPack. Use HtmlAgilityPack to extract the iframe from the HTML. Use the src attribute of that iframe to perform another WebRequest, and parse the response with HtmlAgilityPack. – ctorx Jan 25 '12 at 21:49
  • Hi @Matthew, can you give an example? I tried to extract the iframe from the HTML using HtmlAgilityPack, but I failed. Please give an example. – Sagar Kadam Jan 29 '12 at 19:19
  • Aside from actually writing the code for you, I'd start by looking here: http://stackoverflow.com/questions/2422762/html-agility-pack. You're going to need to get familiar with how to take an HTML chunk, parse it with HtmlAgilityPack, and extract the information you want using the HtmlAgilityPack API. SO and Google should be more than adequate in helping you get there now that the high level solution hath been provided. – ctorx Jan 30 '12 at 20:53
  • Hey @ctorx, I am using only HtmlAgilityPack (HAP). I fetch the HTML from the host page using HAP, but when I look at the source code of the host page, there is no iframe. So what should I do? – Sagar Kadam Feb 01 '12 at 19:26
  • It is possible that the host page is injecting the iframe into the DOM using JavaScript, in which case HtmlAgilityPack isn't going to help you. If that is the case, you're going to need something that can parse JavaScript and maintains a DOM that you can query. – ctorx Feb 05 '12 at 06:55
0

Assuming you're talking about doing this from a page served and rendered in a client browser, you would need to do so in JavaScript, not C#. The iframe is rendered on the client browser and so your server side code would have no access to it.

CodingGorilla
  • Maybe they’re screen scraping? – Douglas Jan 25 '12 at 19:52
  • Even if they're screen scraping, at a minimum it would require some javascript that scrapes the iframe content and posts it back to the server. But his question was how to get the content, not how to parse it. He can't get it using C#. – CodingGorilla Jan 25 '12 at 19:54
  • Maybe _they_ are the client; they need to parse the content of a third-party’s page. – Douglas Jan 25 '12 at 19:55
  • It doesn't matter *who* they are, if this is happening inside of a browser (as evidenced by his comment that his app is "example2.com") it still has to be done in javascript, not C#. Server side asp.net code has no idea what content, if any, is ever loaded into an iframe. That's completely the function of the browser. – CodingGorilla Jan 25 '12 at 19:59
  • I think they said that "example2.com" is the string they want to extract from the page markup. I assume they are using `WebClient` within a C# client app to download pages from a third-party site, and need to parse the downloaded content. – Douglas Jan 25 '12 at 20:19
  • Sorry, I forgot to mention that I don't want to use JavaScript. My app runs on the server, not in the browser, and JavaScript needs a browser to run. – Sagar Kadam Jan 25 '12 at 20:20