I am trying to retrieve data from a site so that I can apply some models to it. I was able to log in with jsoup, but I didn't get the content, because it is loaded as JSON through AJAX.
Using Firefox (on Ubuntu), I got this curl command after inspecting the XHR:
curl 'https://target.helpshift.com/xhr/view/issue-details/?publish_id=540672&viewing=1' -H 'Host: target.helpshift.com' -H 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:41.0) Gecko/20100101 Firefox/41.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'X-Requested-With: XMLHttpRequest' -H 'Referer: https://target.helpshift.com/admin/issue/540672/' -H 'Cookie: _ga=GA1.3.1855716730.1445436566; _csrf_token=4mMuJX5jieMAdq_WN1elKiBh0415w-0TDxN_R6kx6SQ; _dc_gtm_UA-33692972-1=1; __hs=zHb6Mr4Ds9mIaFuKbWXE9XDkDkuSzSGmmz9PgRmtmKR4Dnu1fZM4BXqys%2Bw%2FSF6cDvLv%2FCUrrG4alZsYZtMx57Qe4RU8aKKCIM6%2FSKY0PyRp8zJPJsZug7Ec1x%2F2o%2BbgGkOhqi0vi4G7Z2tYxPBAyrdJJNSjszJS6GgTTB051uMbaoSJLyQww11EKn0yU3W4uzjfmTsf%2BHo30bj6hjOdlRKY68dSVXGHIA31jZNAM%3D--3Z
However, when I run this command directly in the terminal, I get:
curl: (6) Could not resolve host: target.helpshift.com
If I use "Edit and Resend" in Firefox, the request works. But it won't be practical to do that by hand for each of the 10000 pages I am researching.
How can I get this JSON through curl? Could the problem be HTTPS? Alternatively, can I make Firefox send those requests without editing them one by one?
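For reference, this is roughly what I would like to end up with once curl can reach the host. It is only a sketch under some assumptions: the file ids.txt (one issue number per line) and the HS_COOKIE variable (holding the full Cookie header value copied from Firefox) are placeholders of mine, and I am assuming that publish_id always matches the issue number in the Referer, as it does in the request above.

```shell
#!/usr/bin/env bash
# Sketch only: HS_COOKIE and the ids file are hypothetical placeholders,
# not something the site provides.

BASE='https://target.helpshift.com/xhr/view/issue-details/'

# Build the XHR URL for one issue, following the pattern of the captured request.
issue_url() {
  printf '%s?publish_id=%s&viewing=1' "$BASE" "$1"
}

# Fetch the JSON for one issue and save it as issue-<id>.json.
fetch_issue() {
  curl --silent --show-error --fail --compressed \
    -H 'X-Requested-With: XMLHttpRequest' \
    -H "Referer: https://target.helpshift.com/admin/issue/$1/" \
    -H "Cookie: $HS_COOKIE" \
    -o "issue-$1.json" \
    "$(issue_url "$1")"
}

# Read issue numbers from the file given as $1 and fetch them one by one.
fetch_all() {
  while IFS= read -r id; do
    [ -n "$id" ] || continue                      # skip blank lines
    fetch_issue "$id" || echo "failed: $id" >&2   # report and keep going
    sleep 1                                       # pause between requests
  done < "$1"
}
```

After sourcing the script and exporting HS_COOKIE, the idea would be to call fetch_all ids.txt. The script itself only defines functions, so nothing is requested until fetch_all is called.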
EDIT:
- I am able to ping target.helpshift.com from the terminal without a proxy.
- Removing -H 'Host: target.helpshift.com' didn't change the result.