
I am trying to make a program that can search for elements in a web page using its source code. For example, I enter a web page address such as www.google.com into the application and then search for type="text"; the program should search the source code of google.com and show me the number of type="text" elements it finds.

I am using an external tool provided by iWEBTool to get the source code. The code is:

<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>My App</title>
    <link rel="icon" type="image/x-icon" href="favicon.ico" />

    <style type="text/css">
        #PageSource {
            height: 367px;
            width: 62%;
        }
        .auto-style1 {
            width: 287px;
        }
        .auto-style2 {
            width: 389px;
        }
    </style>
</head>
<body>

  <h1 style="text-align:center">My App</h1>
<form method="get" name="pageform" action="http://www.iwebtool.com/tool/tools/code_viewer/code_viewer.php"  target="pageframe" onsubmit="return validate(this);"><table border="0" style="border-collapse: collapse" width="100%"><tr>
<td width="956" height="91" valign="top"><br />
<table style="border-collapse: collapse" width="100%" class="tooltop" height="76"><tr>
<td><br />
<table border="0" style="border-collapse: collapse" width="100%" cellspacing="5"><tr>
<td height="28" class="auto-style1">Enter the website address :</td>
<td height="28" class="auto-style2"><br />
<font size="1">http://</font><input type="text" name="domain" size="26"></td>
<td height="28" width="391"><br />
<input type="submit" value="View!" style="float: left"></td>
</tr>
<tr>
<td height="21" class="auto-style1">&nbsp;</td>
<td width="691" colspan="2" height="21" align="top"><font size="1"></font></td>
</tr>
</table></td>
</tr>
</table></td>
</tr>
<tr>
<td width="956"><br />
<iframe name="pageframe" class="toolbot" frameborder="0" id="PageSource"><br />
</iframe></td>
</tr>
</table>
</form>
    <script language="JavaScript">
        function validate(theform) {
            if (theform.domain.value == "") { alert("No Domain"); return false; }
            return true;
        }
    </script>
</body>
</html>

The source code is now displayed successfully in the web app, inside an iframe. I want to find a way to search for tags in the source code that the iframe contains.

Should I load the content inside the iframe to a div to make the searching possible? Or is there any other way to accomplish searching?

Please help!

Ikhlak S.
Phoenix.1993
  • If the iframe domain is the same as your application domain, it's possible; otherwise you somehow have to get the content into your own domain, and only then can you read the DOM – Rajshekar Reddy Apr 13 '16 at 11:46
  • Any suggestions how to do that? – Phoenix.1993 Apr 13 '16 at 11:49
  • It means that you will only be able to access elements in an iframe that's in your own domain, so if the iframe *src* attribute is not in your same domain it will fail to be "queried" because it would be a cross-domain request. – Jesus Gonzalez Apr 13 '16 at 11:53
  • I am using an online tool to load the source of a webpage in another domain to my webpage. Does that mean that the iframe content is in domain of the iWebTool? – Phoenix.1993 Apr 13 '16 at 11:57
  • You said you are using a tool to get the contents right? so just append the content into some div and then you can use jquery – Rajshekar Reddy Apr 13 '16 at 11:57
  • Can you write the code to append it to a div? Sorry, but I'm in a learning phase and need an example to understand. This is just a demo project I'm working on to learn more. – Phoenix.1993 Apr 13 '16 at 11:59
  • There are libraries like HTML Agility Pack which can get the HTML contents of any site into your C# code. Then you can use XPath to work with the DOM. – Rajshekar Reddy Apr 13 '16 at 12:00
  • @MidhunT `can you write the code to append it to a div?` I can write a code to append to div.. but what to append?? where do I get the contents from?? if you can show that code may be we can continue from there – Rajshekar Reddy Apr 13 '16 at 12:00
  • @Reddy - I have pasted my whole code above. – Phoenix.1993 Apr 13 '16 at 12:02
  • @MidhunT in your code what happens after you submit the form... – Rajshekar Reddy Apr 13 '16 at 12:36
  • @Reddy - after i submit the form, an iframe object appears in the same page itself with the source code of the website inside it. – Phoenix.1993 Apr 14 '16 at 05:06
  • Ok But what is the Source of the iframe.. Can you check and tell? – Rajshekar Reddy Apr 14 '16 at 06:25
  • @MidhunT check the source of the iframe; you can get it using jQuery. See if it is the same as your application domain name. – Rajshekar Reddy Apr 14 '16 at 06:32
  • @Reddy- When I rightclick on the loaded page and click on 'load iframe source', I get the following link : "view-source:http://www.iwebtool.com/tool/tools/code_viewer/code_viewer.php?domain=www.google.com" – Phoenix.1993 Apr 15 '16 at 08:09

1 Answer


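Consider loading the content inside a div, as you suggest. As far as I know, you cannot query elements inside an iframe whose content comes from a different domain than your page; the browser's same-origin policy blocks access to its DOM.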

jQuery/JavaScript: accessing contents of an iframe

Get HTML inside iframe using jQuery

How to access the content of an iframe with jQuery?
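
Whether an iframe can be queried comes down to whether its address and the page's address share an origin (scheme, host, and port). A minimal JavaScript sketch of that comparison follows. `isSameOrigin` is a hypothetical helper name, not part of jQuery or the browser API, and the localhost page address in the usage note is an assumption made for illustration.

```javascript
// Hypothetical helper: true when two absolute URLs share scheme, host, and port.
function isSameOrigin(urlA, urlB) {
  const originA = new URL(urlA).origin;
  const originB = new URL(urlB).origin;
  // Opaque origins (for example file: URLs) serialize to the string "null"
  // and should never be treated as matching anything.
  if (originA === 'null' || originB === 'null') return false;
  return originA === originB;
}
```

For example, `isSameOrigin('http://localhost/app/index.html', 'http://www.iwebtool.com/tool/tools/code_viewer/code_viewer.php?domain=www.google.com')` returns `false`, which is why the iframe content in the question cannot be read from the page's own script.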

-

As requested by the question author:

I would create a proxy PHP/Python/Ruby/C#/whatever script in order to fetch the pages from my own domain:

Note: I have not tested this myself, since I am not able to right now, but you can try it quickly. Consider replacing the tables with div structures as well. In the HTML code, replace the /path/to/my-script.php string with the path to your PHP script.

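Security notice: The following code is potentially dangerous because it fetches whatever address the user submits. Do not deploy it on a production site without validating the input.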

Sample PHP:

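<?php

// The form sends a bare host (e.g. www.google.com), so add the scheme;
// without it, file_get_contents() would look for a local file of that name.
$url = 'http://' . preg_replace('#^https?://#i', '', trim($_POST['url']));
echo file_get_contents($url);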
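
One way to reduce the risk mentioned in the security notice is to accept only http and https addresses before anything is fetched. The sketch below shows the idea in JavaScript; the same logic could be ported to the proxy script. `normalizeTarget` is a hypothetical helper name, and the localhost check is only an illustrative, incomplete example of blocking internal addresses.

```javascript
// Hypothetical helper: turn user input such as "www.google.com" into an
// http(s) URL string, or return null when the input should be rejected.
function normalizeTarget(input) {
  const trimmed = String(input).trim();
  if (trimmed === '') return null;

  // Add a scheme when the user typed a bare host name.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const candidate = hasScheme ? trimmed : 'http://' + trimmed;

  let url;
  try {
    url = new URL(candidate);
  } catch (e) {
    return null; // not parseable as a URL
  }

  // Reject file:, ftp:, and every other non-web scheme.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;

  // Illustrative only: a real deployment would need a fuller block list.
  if (url.hostname === 'localhost' || url.hostname === '127.0.0.1') return null;

  return url.href;
}
```

With this helper, `normalizeTarget('www.google.com')` yields an http URL for that host, while `normalizeTarget('file:///etc/passwd')` yields `null`.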

Then I would just AJAX my own service:

<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
  <title>My App</title>
  <link rel="icon" type="image/x-icon" href="favicon.ico" />

  <style type="text/css">
    #PageSource {
      height: 367px;
      width: 62%;
    }
    .auto-style1 {
      width: 287px;
    }
    .auto-style2 {
      width: 389px;
    }
  </style>
</head>
<body>

  <h1 style="text-align:center">My App</h1>
  <form method="post" id="pageform" name="pageform" action="/path/to/my-script.php" ><table border="0" style="border-collapse: collapse" width="100%"><tr>
    <td width="956" height="91" valign="top"><br />
      <table style="border-collapse: collapse" width="100%" class="tooltop" height="76"><tr>
        <td><br />
          <table border="0" style="border-collapse: collapse" width="100%" cellspacing="5"><tr>
            <td height="28" class="auto-style1">Enter the website address :</td>
            <td height="28" class="auto-style2"><br />
              <font size="1">http://</font><input type="text" name="domain" size="26"></td>
              <td height="28" width="391"><br />
                <input type="submit" value="View!" style="float: left"></td>
              </tr>
              <tr>
                <td height="21" class="auto-style1">&nbsp;</td>
                <td width="691" colspan="2" height="21" align="top"><font size="1"></font></td>
              </tr>
            </table></td>
          </tr>
        </table></td>
      </tr>
      <tr>
        <td width="956"><br />
          <div style="width:956px;" id="loaded-div"></div></td>
        </tr>
      </table>
    </form>

    <script src="https://code.jquery.com/jquery-1.12.0.min.js"></script>
    <script src="https://code.jquery.com/jquery-migrate-1.2.1.min.js"></script>
    <script>
      // Reject an empty address before sending anything
      function validate(theform) {
        if (theform.domain.value == "") { alert("No Domain"); return false; }
        return true;
      }

      $('#pageform').on('submit', function () {
        if (!validate(this))
          return false;

        // Send the address to the proxy script on our own domain
        $.ajax({
          url: this.action,
          type: this.method,
          data: { url: this.domain.value }
        })
        .done(function (data) {
          console.log("It worked, so we load the data into the div:");

          $('#loaded-div').html(data);

          // If I'm correct, you should now be able to access the elements in the div, e.g. the text boxes in the document
          var $textbox = $('#loaded-div').find('input[type="text"]');

          // doSomethingWithText($textbox.val());
          // etc
        })
        .fail(function (jqXHR, textStatus, errorThrown) {
          // Log the failed request so the cause can be inspected
          console.log(jqXHR.status, textStatus, errorThrown);
          console.log("Something wrong happened. Alert user.");
        });

        // Returning false cancels the normal form submission
        return false;
      });
  </script>
</body>
</html>
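
Since the original goal was to show how many times something like type="text" appears in the fetched source, the count can also be taken directly from the returned text without inserting it into the page. The following is a minimal sketch under that assumption. `countOccurrences` is a hypothetical helper name, and treating the match as case-insensitive is a design choice made here, not something the question specifies.

```javascript
// Hypothetical helper: count non-overlapping, case-insensitive occurrences
// of `needle` in `source`.
function countOccurrences(source, needle) {
  if (needle === '') return 0; // an empty search string matches nothing useful

  const haystack = source.toLowerCase();
  const target = needle.toLowerCase();

  let count = 0;
  let index = haystack.indexOf(target);
  while (index !== -1) {
    count += 1;
    // Continue searching after the end of the current match.
    index = haystack.indexOf(target, index + target.length);
  }
  return count;
}
```

Inside the `.done` callback this could be called as `countOccurrences(data, 'type="text"')`. It is a plain text search, so it also counts matches that appear inside comments or scripts in the fetched source.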
Community
Jesus Gonzalez
  • Can you help me with the code to load the content in the iframe in to a div? – Phoenix.1993 Apr 13 '16 at 11:53
  • I will be able to post the code in about 1-2 hours. I am busy right now. – Jesus Gonzalez Apr 13 '16 at 12:07
  • Updated, give it a try. – Jesus Gonzalez Apr 13 '16 at 12:53
  • Thank You so much for your help. I was busy with something else. I will try it out today and will let you know the result... Tnx again – Phoenix.1993 Apr 14 '16 at 05:02
  • @MidhunT, be careful, do not post this into a production site, as the PHP script will just fetch contents from any resource, such as your local files in your system. I have also updated the HTML document, adding an ID attribute to the form. – Jesus Gonzalez Apr 14 '16 at 05:18
  • Okay sir, tnx again. Ill update the status as soon as i get time to give it a shot :) – Phoenix.1993 Apr 14 '16 at 05:25
  • Remember to mark the answer as correct if it solves your problem. Thank you. – Jesus Gonzalez Apr 14 '16 at 08:35
  • Sure, I will. tnx for your patience. I am travelling and I dont have my machine with me now to check. sorry and tnx again – Phoenix.1993 Apr 14 '16 at 09:12
  • Sorry for the late update. I tried running the application with your code and I changed the path pointing to my script file. But while clicking the submit button, nothing is happening. So I went to inspect window and found some errors. They are : Failed to load resource: the server responded with a status of 404 (Not Found) jquery-1.12.0.min.js:4 XMLHttpRequest cannot load file:///C:/Users/admin/Desktop/Script.php. Cross origin requests are only supported for protocol schemes: http, data, chrome, chrome-extension, https, chrome-extension-resource. – Phoenix.1993 Apr 15 '16 at 08:04
  • I have updated the JS to log the data returned with the error. Please post that output here. – Jesus Gonzalez Apr 15 '16 at 08:35