
I am trying to find a high-level Clojure library for making HTTP and HTTPS requests, parsing out forms and links from responses and then POST-ing updated forms or following links. Ideally something that would automatically handle redirects and cookies (i.e. sessions). That is, I'd like to find something whereby my code can as closely as possible mimic a user driving a webapp from a browser, without the browser.

A number of years ago we used Hpricot and Ruby for a similar task, but I'd prefer to do this in Clojure if at all possible. From memory (and I haven't used Hpricot for years) we were able to do all this with minimal effort: we could concentrate on the 'what' of driving the application, not the 'how'.

I found clj-http https://github.com/dakrone/clj-http but this seems to be one step lower-level than I'm looking for (no form parsing) - although it is based on Apache HttpComponents http://hc.apache.org/httpcomponents-client-ga/ which does seem to expose a nice, fluent, API for forms http://hc.apache.org/httpcomponents-client-ga/tutorial/html/fluent.html.

The question "Screen scraping in clojure" has several good suggestions for screen-scraping, but nothing that really addresses the above.

HTTP Kit http://www.http-kit.org/client.html looks like it would be a great foundation for the above but doesn't do form parsing or session management (as far as I can see).

Currently I'm veering toward using the Apache HttpComponents Java library directly from Clojure. Can anyone suggest a better, perhaps more idiomatically Clojure, alternative? Or anything that they found worked well in similar circumstances? My goal is to write the minimal amount of code quickly to investigate a problem with a web service. This is not production code. Saving time, rather than getting an 'ideal' solution, is my main concern.

[The background is that I am trying to mimic certain forms of user behaviour in order to first reproduce and then track down an intermittent bug in a large body of legacy Java/EJB code. However, the problem only seems to occur about once per several thousand POSTs. (The suspicion is some form of caching issue.) The existence of the problem is easy to detect after the fact, however.]

Paul
  • Have you considered selenium? There is a clojure wrapper for selenium : https://github.com/semperos/clj-webdriver – Viktor K. Sep 18 '15 at 10:10
  • I've been reading the Javadoc for Selenium. I don't see any way to pull out the form(s) in a request, update them and submit the change. – Paul Sep 18 '15 at 13:14

1 Answer


Have you looked at the Enlive library yet? Here is a good tutorial on it.

You seem to really have 2 parts here. The first part is (1) a Selenium-like client, which drives (2) a webserver.

For part (1), either Selenium, Enlive, or something similar will allow you to simulate a browser to submit data, read the responses, and respond from there. For part (2), it seems you just need a regular Clojure web framework such as Ring/Compojure (older & simpler) or Pedestal (newer & more powerful).
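As a rough illustration of part (1) without a browser, here is a minimal sketch that combines clj-http (mentioned in the question) with Enlive. It assumes clj-http 3.x and Enlive are on the classpath; the namespace, the function names `form-fields` and `submit-form`, and the URLs are made up for the example. It only reads `<input>` elements of the first form, so `<select>` and `<textarea>` fields would need extra handling.

```clojure
(ns scratch.form-client
  (:require [clj-http.client :as http]
            [clj-http.cookies :as cookies]
            [net.cgrand.enlive-html :as html]))

(defn form-fields
  "Returns a map of input name -> value for the first <form> in an HTML string.
   Inputs without a name attribute (e.g. unnamed submit buttons) are skipped."
  [html-str]
  (let [form (first (html/select (html/html-snippet html-str) [:form]))]
    (into {}
          (for [{:keys [attrs]} (html/select form [:input])
                :when (:name attrs)]
            [(:name attrs) (:value attrs "")]))))

(defn submit-form
  "GETs page-url, merges overrides into the form's current field values,
   and POSTs the result to action-url. The shared cookie store cs carries
   the session between the two requests."
  [cs page-url action-url overrides]
  (let [page   (http/get page-url {:cookie-store cs})
        fields (merge (form-fields (:body page)) overrides)]
    (http/post action-url
               {:form-params       fields
                :cookie-store      cs
                ;; assumption: clj-http 3.x; :lax also follows redirects after a POST
                :redirect-strategy :lax})))

(comment
  ;; Hypothetical usage; the host and paths are placeholders.
  (let [cs (cookies/cookie-store)]
    (submit-form cs
                 "http://localhost:8080/app/step1"
                 "http://localhost:8080/app/step1"
                 {"quantity" "3"})))
```

Each simulated user gets its own cookie store, so running several 'clients' from one process should just be a matter of creating several stores.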

Alan Thompson
  • For 1) Enlive looked promising. But what I ideally wanted was something very high-level. I want to be able to do things like: make a GET request, 'extract' the form (or forms) from the response, update some of the fields, and get the next response as if I had (for example) clicked the button labelled 'Next'. BTW, 2) is an existing legacy Java/EJB/JBoss system with a sporadic issue we are trying to track down. Selenium looked useful, but I really want my code to act directly as the client (not via driving a browser), as I expect I might later have to run multiple 'clients' from my code. – Paul Sep 21 '15 at 11:14