I am trying to retrieve data from a site so that I can apply some models to it. Although I was able to log in with jsoup, it didn't get the content, because the content is loaded as JSON through AJAX.

Using Firefox (Ubuntu), I got this curl command after inspecting the XHR:

curl 'https://target.helpshift.com/xhr/view/issue-details/?publish_id=540672&viewing=1' -H 'Host: target.helpshift.com' -H 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:41.0) Gecko/20100101 Firefox/41.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'X-Requested-With: XMLHttpRequest' -H 'Referer: https://target.helpshift.com/admin/issue/540672/' -H 'Cookie: _ga=GA1.3.1855716730.1445436566; _csrf_token=4mMuJX5jieMAdq_WN1elKiBh0415w-0TDxN_R6kx6SQ; _dc_gtm_UA-33692972-1=1; __hs=zHb6Mr4Ds9mIaFuKbWXE9XDkDkuSzSGmmz9PgRmtmKR4Dnu1fZM4BXqys%2Bw%2FSF6cDvLv%2FCUrrG4alZsYZtMx57Qe4RU8aKKCIM6%2FSKY0PyRp8zJPJsZug7Ec1x%2F2o%2BbgGkOhqi0vi4G7Z2tYxPBAyrdJJNSjszJS6GgTTB051uMbaoSJLyQww11EKn0yU3W4uzjfmTsf%2BHo30bj6hjOdlRKY68dSVXGHIA31jZNAM%3D--3Z

However, when running with curl directly in the terminal, I get:

curl: (6) Could not resolve host: target.helpshift.com

If I edit and resend the request through Firefox, it works. But that won't be practical for each of the 10000 pages I am researching.

How can I get this JSON through curl? Maybe the problem is HTTPS? Can I make Firefox send those calls without editing them one by one?

EDIT:

  • I am able to ping target.helpshift.com from the terminal without a proxy
  • Removing -H 'Host: target.helpshift.com' didn't change the result
DeMarco
  • No, the problem is not HTTPS – it’s that cURL can’t resolve the host name to an IP address. – CBroe Jan 19 '16 at 19:42
  • Does Firefox use a proxy? If yes, configure the same for curl. Are you able to ping that host? – Marged Jan 19 '16 at 19:43
  • Why are you passing a `Host` header? That exact host name is already in the URL. Have you tried it without this header? – JimmyJames Jan 19 '16 at 19:52
  • @JimmyJames I just copied it from the Firefox network inspector. I thought the best way to try curl was to use that command (since it was generated by Firefox, which was able to send those AJAX requests) – DeMarco Jan 19 '16 at 20:10
  • Try without it and see if you can ping the host as Marged suggested. – JimmyJames Jan 19 '16 at 20:17
  • @Marged see the updated question. – DeMarco Jan 19 '16 at 21:13
  • Have you tried forcing IPv4 by adding -4? What is the result when run with -v? – Marged Jan 20 '16 at 04:47
  • @Marged thanks, I cannot believe that the problem was that simple. – DeMarco Jan 20 '16 at 05:19
  • I found the solution by searching on stack overflow ;-) – Marged Jan 20 '16 at 05:47
  • Possible duplicate of [CURL and HTTPS, "Cannot resolve host"](http://stackoverflow.com/questions/1341644/curl-and-https-cannot-resolve-host) – Marged Jan 20 '16 at 05:48
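
Going by the comments, forcing IPv4 with -4 appears to be what resolved the "Could not resolve host" error. A minimal sketch of how that could be combined with a loop over many pages follows. The function names `build_url` and `fetch_issue`, the `COOKIE` variable, and the trimmed-down header set are assumptions made for illustration; the URL pattern is the one from the question, and the server may well require more of the original headers.

```shell
#!/usr/bin/env bash
# Sketch only: build_url, fetch_issue and COOKIE are hypothetical names,
# not part of curl or of the site being queried.

build_url() {
  # Build the XHR URL for one publish ID (pattern copied from the question).
  printf 'https://target.helpshift.com/xhr/view/issue-details/?publish_id=%s&viewing=1' "$1"
}

fetch_issue() {
  # -4           resolve the host name to IPv4 addresses only
  # -sS          hide the progress meter but still print errors
  # --compressed request a compressed response and decode it
  local id="$1"
  curl -4 -sS --compressed \
    -H 'X-Requested-With: XMLHttpRequest' \
    -H "Referer: https://target.helpshift.com/admin/issue/${id}/" \
    -H "Cookie: ${COOKIE:-}" \
    "$(build_url "$id")"
}
```

Usage might look like this, with the session cookie copied from the browser and an ID range chosen by you (both placeholders here):

    COOKIE='<cookie string from the browser>'
    for id in 540672 540673 540674; do
      fetch_issue "$id" > "issue_${id}.json"
      sleep 1
    done

Adding -v to the curl call prints the name resolution and connection steps, which helps confirm whether IPv4 is actually being used.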
