1

This all goes back to some of my original questions about trying to "index" a webpage. I was originally trying to do it specifically in Java, but now I'm opening it up to any language.

Before, I tried using HtmlUnit and other methods in Java to get the information I needed, but wasn't successful.

The information I need from the webpage is very easy to find with Firebug, and I was wondering if there is any way to duplicate what Firebug is doing for my specific needs. When I open Firebug I go to the Net tab, then to the XHR tab, and it shows a constantly updating list of requests with the information the server is sending. When I click on a request and look at the response, it has the information I need, all without ever refreshing the webpage, which is what I am trying to do (not to mention that the variables it outputs do not show up in the HTML of the webpage).

So can anyone point me in the right direction on how they would go about this? (I will be putting this information into a MySQL database, which is why I added it as a tag. I still don't know what language would be best to use, though.)

Edit: These requests to the server are somewhat random, and although Firebug shows the URL they come from, when I try to visit that URL in Firefox it tries to open something called application/jos

Jon Storm

3 Answers

3

Jon, I am fairly certain that you are confusing several technologies here, and the simple answer is that it doesn't work like that. Firebug works specifically because it runs as part of the browser, and (as far as I am aware) runs with more permissive privileges than a JavaScript script embedded in a page.

JavaScript is, for the record, different from Java.

If you are trying to log AJAX calls, your best bet is for the serverside application to log the invoking IP, useragent, cookies, and complete URI to your database on receipt. It will be far better than any clientside solution.

On a note more related to your question, it is not good practice to assume that everyone has read other questions you have posted. Generally speaking, "we" have not. "We" is in quotes because, well, you know. :) It also wouldn't hurt for you to go back and accept a few answers to questions you've asked.

Winfield Trail
  • Yes, I'm aware JavaScript is different than Java; when did I mention JavaScript in this? And I was aware that Firebug runs as part of the browser, which is why I thought my best bet would be to try to emulate a browser, which I attempted to do in Java with both HtmlUnit and Selenium. However, I didn't really have success with either, which is why I'm taking a step back and asking what others would do. But thank you for the answer – Jon Storm Jul 07 '11 at 03:41
  • In the context, and in the context of your other questions, I was concerned there might be some confusion. – Winfield Trail Jul 07 '11 at 03:45
  • I understand that probably no one has read my other posts, but I included the reference just in case someone who has thinks I am asking the same question again (which I don't believe I am). It was more of a short intro to where this question was coming from. @ the above response: well, thanks again for trying to help. As you can probably tell, I am a beginning programmer and am probably trying to do some things that are over my head – Jon Storm Jul 07 '11 at 03:47
1

If you're using a library such as jQuery, you may have an option such as the jQuery ajaxSend and ajaxComplete callbacks. These could post requests to your server to log these events (being careful not to end up in an infinite loop).
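A minimal sketch of that idea, assuming the page already loads jQuery and that you control a logging endpoint. `LOG_URL`, `shouldLog`, and `buildLogRecord` are hypothetical names invented for this illustration, and a cross-origin endpoint would also need to allow the request (for example via CORS headers). Passing `global: false` keeps the logging request itself from firing the global callbacks, which together with the URL check guards against the infinite loop mentioned above.

```javascript
// Hypothetical endpoint on a server you control; not from the original post.
const LOG_URL = 'http://localhost/ajax-log.php';

// Decide whether a completed request should be logged. Skipping our own
// logging endpoint is one guard against an infinite loop.
function shouldLog(requestUrl) {
  return typeof requestUrl === 'string' && requestUrl.indexOf(LOG_URL) !== 0;
}

// Build the record that is sent to the logging endpoint.
function buildLogRecord(xhr, settings) {
  return {
    url: settings.url,
    method: settings.type,
    status: xhr.status,
    body: xhr.responseText
  };
}

// Register the global callback only when jQuery is present on the page.
if (typeof jQuery !== 'undefined') {
  jQuery(document).ajaxComplete(function (event, xhr, settings) {
    if (!shouldLog(settings.url)) {
      return;
    }
    jQuery.ajax({
      url: LOG_URL,
      type: 'POST',
      contentType: 'application/json',
      data: JSON.stringify(buildLogRecord(xhr, settings)),
      global: false // second guard: no ajaxSend/ajaxComplete for this request
    });
  });
}
```

This only sees requests that the page makes through jQuery; requests made with a raw XMLHttpRequest would not trigger the callback.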

Michael Mior
  • You could potentially do this as a browser plugin. Or you could use a bookmarklet that injects some code into the page you're viewing. It would require that the page is using a library which supports such callbacks however, as there's no way to automatically intercept AJAX requests. Another completely different option would be to monitor network traffic leaving the machine in question for AJAX requests and responses, although this is significantly more complicated. – Michael Mior Jul 07 '11 at 04:00
1

So, the problem is?:

  1. With someone else's web-page, hosted on someone else's server, you want to extract select information?
  2. Using cURL, Python, Java, etc. is too painful because the data is continually updating via AJAX (requires a JS interpreter)?
  3. Plain jQuery or iframe intercepts will not work because of the browser's cross-site security restrictions (the same-origin policy).
  4. Ditto, a bookmarklet -- which has the added disadvantage of needing to be manually triggered every time.


If that's all correct, then there are 3 other approaches:

  1. Develop a browser plugin... More difficult, but has the power to do everything in one package.

  2. Develop a userscript. This is much easier to do, and technologies such as Greasemonkey deal with the cross-site (same-origin) restriction.

  3. Use a browser macro technology such as Chickenfoot. These all have plusses and minuses -- which I won't get into.

Using Greasemonkey:
Depending on the site, this can be quite easy. The big drawback, if you want to record data, is that you need your own web-server and web-application. But this server can be locally hosted on an XAMPP stack, or whatever web-application technology you're comfortable with.
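As a rough illustration of how small the receiving side can be, here is a sketch of a request handler that accepts posted JSON and turns it into a parameterized MySQL insert. It uses Node.js rather than XAMPP purely to keep the example in one language. The table and column names (`ajax_log`, `url`, `body`) and the `save` callback are assumptions made up for this example; `save` stands in for whatever database client you actually use.

```javascript
// Turn a posted log record into a parameterized MySQL INSERT.
// The table and column names (ajax_log, url, body) are hypothetical.
function toInsert(record) {
  return {
    sql: 'INSERT INTO ajax_log (url, body) VALUES (?, ?)',
    params: [String(record.url), String(record.body)]
  };
}

// Build a request handler compatible with Node's http.createServer().
// `save` is a callback you supply that runs the statement on your database.
function makeLogHandler(save) {
  return function handleLog(req, res) {
    // Allow a script running on another origin to post here.
    res.setHeader('Access-Control-Allow-Origin', '*');
    res.setHeader('Access-Control-Allow-Headers', 'Content-Type');
    if (req.method !== 'POST') {
      // Answer CORS preflight requests; reject everything else.
      res.statusCode = req.method === 'OPTIONS' ? 204 : 405;
      res.end();
      return;
    }
    let raw = '';
    req.on('data', function (chunk) { raw += chunk; });
    req.on('end', function () {
      try {
        save(toInsert(JSON.parse(raw)));
        res.statusCode = 204;
      } catch (err) {
        res.statusCode = 400; // malformed body or failed save
      }
      res.end();
    });
  };
}

// Usage (Node.js), logging statements to the console instead of a database:
//   require('http').createServer(makeLogHandler(console.log)).listen(8080);
```

Keeping the SQL parameterized matters here because the logged body comes from a third-party page and should never be concatenated into a query string.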

Sample code that intercepts a page's AJAX data is at: Using Greasemonkey and jQuery to intercept JSON/AJAX data from a page, and process it.

Note that if the target page does NOT use jQuery, the library in use (if any) usually has similar intercept capabilities. Or, listening for DOMSubtreeModified works too, provided the AJAX data ends up being written into the page's DOM.
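For a page that uses no library at all, one possible approach is to wrap the browser's XMLHttpRequest methods so that each completed response is handed to a callback. This is only a sketch under assumptions: `interceptXhr` is a hypothetical helper written for this example, it does not cover requests made through `fetch`, and inside a Greasemonkey sandbox the page's own constructor may have to be reached through `unsafeWindow` rather than the script's global scope.

```javascript
// Wrap an XMLHttpRequest-style constructor so every completed response is
// passed to onResponse(url, body). Returns a function that undoes the wrap.
function interceptXhr(XhrCtor, onResponse) {
  const originalOpen = XhrCtor.prototype.open;
  const originalSend = XhrCtor.prototype.send;

  XhrCtor.prototype.open = function (method, url) {
    this.__loggedUrl = url; // remember the URL for the load handler
    return originalOpen.apply(this, arguments);
  };

  XhrCtor.prototype.send = function () {
    const xhr = this;
    xhr.addEventListener('load', function () {
      // responseText is only readable for text responses; fall back otherwise.
      const isText = !xhr.responseType || xhr.responseType === 'text';
      onResponse(xhr.__loggedUrl, isText ? xhr.responseText : xhr.response);
    });
    return originalSend.apply(xhr, arguments);
  };

  return function restore() {
    XhrCtor.prototype.open = originalOpen;
    XhrCtor.prototype.send = originalSend;
  };
}

// In a page or userscript context where XMLHttpRequest is available:
if (typeof XMLHttpRequest !== 'undefined') {
  interceptXhr(XMLHttpRequest, function (url, body) {
    console.log('AJAX response from', url, body);
  });
}
```

The callback is where the data would be forwarded to your own server for storage; the wrap must be installed before the page issues the requests you care about.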

Brock Adams
  • I actually am already working with XAMPP for a local MySQL database. But thanks for the answer, and I will look more into what you're suggesting. – Jon Storm Jul 07 '11 at 13:12
  • Can I use a bookmarklet to inject jQuery into the page? Like so? http://blog.reybango.com/2010/09/02/how-to-easily-inject-jquery-into-any-web-page/ Then do the exact same thing that you suggested? – Jon Storm Jul 07 '11 at 21:35
  • You could, but it would be a pain, because you'd have to click it with every load, and THEN start whatever additional script. Fortunately Greasemonkey can do anything a bookmarklet can do -- and do it automatically. – Brock Adams Jul 07 '11 at 21:42
  • Is there any way you can help me out a little further? I'd be willing to pay if you could write the portion that I need (it would probably only take 15-20 minutes of your time). It would be mainly for learning purposes and I would try to edit it further, but I honestly have no idea where to start... If you're interested you can contact me on MSN at stormpsu10000@yahoo.com, or if you have a preferred method of contact I'd be willing to use whatever. Thanks – Jon Storm Jul 08 '11 at 02:31
  • Open another question and: (1) Clearly restate what you are trying to do, using as much relevant detail as possible. Before-and-After code/pics work best.   (2) Link to the actual target page. If that is not possible, save the page's source to pastebin.com and link to that.   (3) Appease the moderators by showing what you have done so far (code snippets). ... If you do all that, and I have enough details to work with, then I will give the issue a reasonable amount of my StackExchange time. You can also sometimes get help writing GM scripts at [userscripts.org](http://userscripts.org/forums/2). – Brock Adams Jul 08 '11 at 02:48
  • Yeah, that's probably what I should have done to begin with. However, I didn't really want people to know what I was trying to do, which is why I have been so vague in all of my questions. I'll try to learn how to make GM scripts first, because that's the part I have no idea where to start with. I think the links you provided should be useful; I just didn't spend very much time learning what they do and tried to throw them into a script. Thanks for your advice/help though. – Jon Storm Jul 08 '11 at 02:53