
I would like to write some JavaScript that grabs this data from another website and puts it all into a .TXT file; converting it to an XML file would be even better, if that's possible.

If not JavaScript, anything else will be fine.

[Image: Grabbing Photo]

I wish to grab the price and the item name, and I'm not completely sure how to do that.

The website is http://www.bigw.com.au/electronics/computers-office/computer-accessories/webcams if you need to read its source to help.

Terrii
  • possible duplicate of [Options for HTML scraping?](http://stackoverflow.com/questions/2861/options-for-html-scraping) – Michael Petrotta Apr 20 '13 at 10:08
  • Try saving the page – PSR Apr 20 '13 at 10:09
  • I don't get why you sent that link; I would, but there are a lot of those pages and I want to grab the data automatically. – Terrii Apr 20 '13 at 10:12
  • Terrii, it's standard practice on this site to consolidate questions. The post I linked to provides help for the portion of your question that is on-topic here. – Michael Petrotta Apr 20 '13 at 10:14
  • You can't do it with client-side JavaScript; if you want to do it in JavaScript you have to use Node.js or PhantomJS and do it on the server. – supernova Apr 20 '13 at 10:16
  • Why not simply use the browser? Why make it so difficult? I thought HTML was meant to build apps regardless of OS (at least that is what was promised over a decade ago...) :P – GitaarLAB Apr 20 '13 at 10:21

2 Answers


Rip a website client-side with a browser and JavaScript? No problem.

Yahoo YQL... (instead of a PHP proxy server-side script).

I have a sneaking suspicion you do not own/control the external site you linked, so getting content from a different site would fall under cross-domain security restrictions (in a modern browser).

So in order to regain 'power to the user', just use http://query.yahooapis.com/.

EXAMPLE 1:
Using the SQL-like command:

select * from html 
where url="http://stackoverflow.com" 
and xpath='//div/h3/a'

The following link will scrape SO for the newest questions (bypassing cross-domain security bull$#!7):
http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20%0Awhere%20url%3D%22http%3A%2F%2Fstackoverflow.com%22%20%0Aand%20xpath%3D'%2F%2Fdiv%2Fh3%2Fa'%3B&format=json&callback=cbfunc

As you can see, this will return a JSON array (one can also choose XML) and call the callback function cbfunc.
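For illustration, here is a minimal sketch (my addition, not the original answer's code) of how one might call that endpoint from the browser with a plain JSONP script tag. The callback name cbfunc matches the &callback=cbfunc parameter in the URL above; the exact shape of query.results depends on the XPath, and for '//div/h3/a' the matched anchors typically come back under results.a:

    // Callback invoked by the JSONP response from the YQL endpoint.
    function cbfunc(data) {
        // With xpath='//div/h3/a' the matched anchors are (typically) under results.a.
        var links = (data.query.results && data.query.results.a) || [];
        for (var i = 0; i < links.length; i++) {
            console.log(links[i].content, links[i].href);
        }
    }

    // Build the YQL query and the request URL.
    var yql = 'select * from html where url="http://stackoverflow.com" and xpath=\'//div/h3/a\'';
    var url = 'http://query.yahooapis.com/v1/public/yql' +
              '?q=' + encodeURIComponent(yql) +
              '&format=json&callback=cbfunc';

    // JSONP: inject a <script> tag so the response calls cbfunc(...) with the data.
    var script = document.createElement('script');
    script.src = url;
    document.body.appendChild(script);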

Indeed, as a 'bonus', you also save a kitten every time you don't need to regex data out of 'tag soup', and you don't have to mess with Lord Cthulhu.

Do you hear your little mad scientist inside yourself starting to giggle?

Then see this answer for more info (and don't forget its comments for more examples).

Once you have the data, you can always AJAX it back to your server, so repeating this 1000 times is no problem (as long as there is space on your server).
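As a rough sketch of that step (my addition; the /save endpoint below is hypothetical), the data could be posted back with a plain XMLHttpRequest, and your server could then append it to a .txt file or convert it to XML:

    // Hypothetical example: send the scraped items to your own server.
    // Your server is same-origin, so there is no cross-domain restriction here.
    function saveItems(items) {
        var xhr = new XMLHttpRequest();
        xhr.open('POST', '/save', true);   // '/save' is a made-up endpoint name
        xhr.setRequestHeader('Content-Type', 'application/json');
        xhr.onload = function () {
            if (xhr.status === 200) {
                console.log('Saved ' + items.length + ' items');
            }
        };
        xhr.send(JSON.stringify(items));   // server-side, write this to .txt or XML
    }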

Good Luck!

GitaarLAB
  • Ok, so I tried that; have a look at this: http://jsfiddle.net/7tC3L/11/ (it doubles the photo; why?) – Terrii Apr 20 '13 at 11:14
  • Might I suggest you `fork` a fiddle from another question's example? Also, you can hit `run` to test changes instead of adding versions through `update`. As far as I can tell you only changed the URL in the fiddle; that is not enough, you'd need to make your own function/query in(stead of) that 'fetchEbayStore' function in my fiddle. Look at the YQL documentation! – GitaarLAB Apr 20 '13 at 11:27
  • Yeah, thanks. That's all I did; I also changed the xpath in the fetchEbayStore function, but it still multiplies. Anyway, thanks for all your help :) – Terrii Apr 20 '13 at 11:39
  • I figured out why there are two pictures in the one div :/ – Terrii Apr 20 '13 at 11:41

You can get the source code of a page by saving the page,

or you can use:

   Right-click on the web page -> View Source
PSR
  • Yes, I know that, but there are over 1000 pages that I will need to save, and I don't want the whole page, just those two bits of data automatically saved into a .txt file. – Terrii Apr 20 '13 at 10:13
  • See here: http://howto.cnet.com/8301-11310_39-20111396-285/five-ways-to-save-a-web-page/ . This might help you, but I am not sure. – PSR Apr 20 '13 at 10:14