
I am working on a project to create a job search portal like www.jobseeker.com.au.

If you search for 'woolworths' you get a list of jobs available at Woolworths. However, when you visit the link, it seems that every request is completed via a hidden iframe containing chunks of data, which are parsed using JavaScript to display the content.

You can see the list of all jobs here: https://woolworths.taleo.net/careersection/10060/joblist.ftl

My question is: if all the work is done by parsing data chunks in the hidden iframe, how is a site like www.jobseeker.com.au able to get the list of all jobs from that page? Is there a hidden URL that returns the list of all jobs, which can then simply be scraped?

Dante May Code
Amit Kumar
  • Amit, did you ever find the answer to this question? I'm quite interested in this myself! – Antonio2011a Apr 16 '12 at 03:48
  • Hello Antonio. I haven't found the solution yet. It is really hard to understand how they are processing their data chunks. – Amit Kumar Apr 17 '12 at 06:39
  • Hi, Amit. Have you found any solution? :) I think this question is applicable to all *.taleo.net sites, so I do not understand why it is closed as too localized... – LA_ Mar 11 '14 at 11:36
  • I doubt there's any one hidden URL for every Taleo site. Some have RSS turned on, which you could parse. The big job boards may have permission to use [Taleo's API](http://www.oracle.com/technetwork/fusion-apps/tcwsfp12a-userguide-enus-1648971.pdf). For what's left, scraping is an option, though you have to deal with the iframe, the XHR, cookies, and about 200 POST fields Taleo uses for session tracking. I haven't found any FOSS solutions to this. – duozmo Apr 20 '14 at 23:18

1 Answer


Just change the org= value and you will get that organisation's job postings as an RSS feed:

ch.tbe.taleo.net/CH02/ats/servlet/Rss?org=CALYPSO&cws=1&WebVersion=0&_rss_version=2
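
As a rough sketch of how the feed above could be consumed, the following builds the URL for a given org value and pulls the title and link out of each RSS item. The https scheme and the function names `build_rss_url` and `parse_jobs` are assumptions made for illustration; the host, path and query parameters are copied from the URL above. Whether a given Taleo site still serves this feed without a login is not guaranteed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Host and path copied from the URL in the answer; the https scheme is an assumption.
RSS_BASE = "https://ch.tbe.taleo.net/CH02/ats/servlet/Rss"


def build_rss_url(org, base=RSS_BASE):
    """Return the feed URL for one organisation code, e.g. "CALYPSO"."""
    params = {"org": org, "cws": 1, "WebVersion": 0, "_rss_version": 2}
    return f"{base}?{urlencode(params)}"


def parse_jobs(rss_text):
    """Extract (title, link) pairs from the text of an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]
```

The feed text itself can be fetched with any HTTP client (for example `urllib.request.urlopen(build_rss_url("CALYPSO")).read()`) and passed to `parse_jobs`.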
Vikdor
wozza.xing
  • This URL now redirects to a login page :( – Akshay Raje Sep 02 '15 at 19:12
  • This answer works for some versions of Taleo. If the site uses the FTL format (e.g. joblist.ftl), append "feed/joblist.rss?lang=en&portal=101430233&searchtype=3&f=null&s=3|D&a=null&multiline=false" to a base such as "ircareers.taleo.net/careersection", where "ircareers" and the portal ID are site-specific values that need to be swapped out. Tested by importing the customized feed directly into a feed reader like Feedly. – Adam Jul 11 '19 at 13:00
  • Alternatively, if "portal" value is not available in page source code, look for "csarray" – Adam Jul 11 '19 at 13:43
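
The FTL-style feed URL described in the comments above could be assembled along these lines. This is only a sketch: the function name `build_ftl_feed_url`, the https scheme, and the "/" joining "careersection" and "feed" are assumptions, while the query parameters are taken verbatim from the comment. The subdomain and portal ID must be replaced with the values for the site in question.

```python
from urllib.parse import urlencode


def build_ftl_feed_url(subdomain, portal, lang="en"):
    """Build the joblist.rss feed URL for a Taleo site that uses the FTL format.

    subdomain: site-specific prefix, e.g. "ircareers"
    portal:    site-specific portal ID found in the page source
    """
    params = {
        "lang": lang,
        "portal": portal,
        "searchtype": 3,
        "f": "null",
        "s": "3|D",
        "a": "null",
        "multiline": "false",
    }
    # Keep "|" unescaped so the query matches the form quoted in the comment.
    query = urlencode(params, safe="|")
    return f"https://{subdomain}.taleo.net/careersection/feed/joblist.rss?{query}"
```

The resulting URL can be pasted into a feed reader or fetched and parsed like any other RSS 2.0 feed.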