I am trying to scrape some data off an enterprise website that was made using JSF and IceFaces. I am using C# and the RestSharp library.
I have no experience with JSP, JSF, or IceFaces, so I am trying to replicate what the site does using plain HTTP requests, so far without much success. The site has no concept of routing at all (and if you accidentally press the browser's back button, you are logged out).
What I have managed to do so far:
- Make a POST request with credentials to the /login resource in order to log in
- Retrieve the JSESSIONID cookie after login and store it in my CookieContainer
- Use regexes to get the ice.session and ice.view values
- Replicate a POST request to the block/send-receive-updates resource
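A minimal sketch of the regex step, for illustration only. It assumes the two values appear in the page as input fields with a name attribute followed by a value attribute; that markup shape is an assumption and the real page may differ. The class and method names (IceFields, Extract) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IceFields
{
    // Assumption: the field is rendered as
    //   <input ... name="ice.session" ... value="...">
    // with name before value inside the same tag. Adjust the pattern
    // if the real markup differs.
    public static string Extract(string html, string fieldName)
    {
        // Regex.Escape keeps the dot in "ice.session" literal.
        var pattern = "name=\"" + Regex.Escape(fieldName) + "\"[^>]*?value=\"([^\"]*)\"";
        var match = Regex.Match(html, pattern);

        // Return null when the field is not found so the caller can fail loudly.
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Calling IceFields.Extract(html, "ice.session") and IceFields.Extract(html, "ice.view") on the page body yields the two values, or null if the assumed markup does not match.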
When the original POST request is made by the site's own JS code (while I am clicking around), it returns an XML response like this:
<updates>
<update address="some form id" tag="table"> ... </update>
<update address="content" tag="div"> ... </update>
<update address="The ice.session id, the ice.view number separated with : followed by the string 'dynamic-code'" tag="script"> ... </update>
</updates>
However, when I take all the encoded POST parameters that the site sends with this request and replicate them in my C# code, the response contains only the last update (the script tag), like this:
<updates>
<update address="The ice.session id, the ice.view number separated with : followed by the string 'dynamic-code'" tag="script"> ... </update>
</updates>
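To pin down exactly which updates are missing, the two responses can be parsed and compared by their update elements. A minimal sketch, assuming both responses are well-formed XML (real payloads may embed markup that needs extra handling before parsing); the class and method names (UpdateDiff, UpdateTags, MissingTags) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class UpdateDiff
{
    // Returns the "tag" attribute of every <update> element, in document order.
    public static List<string> UpdateTags(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("update")
            .Select(u => (string)u.Attribute("tag") ?? "")
            .ToList();
    }

    // Returns the tags present in the browser's response but absent
    // from the replicated one.
    public static List<string> MissingTags(string browserXml, string replicatedXml)
    {
        var received = UpdateTags(replicatedXml);
        return UpdateTags(browserXml)
            .Where(tag => !received.Contains(tag))
            .ToList();
    }
}
```

For the two responses shown above, MissingTags would report the table and div updates, which are the ones that never come back for the replicated request.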
Does anyone have experience scraping or testing these technologies and can help me figure out what I am doing wrong?
Thanks.