I use curl in C++ to download an HTML page from a website and save it to disk. Another program then has to read the saved file into a string.

The page makes some POST requests via AJAX (JSON). If I open it in a browser, I see the right content. If I open it in a text editor, the content is missing because the POST requests are never made.

So how can I save the page with the content obtained after the AJAX requests?

Luke
Carme
  • This is a job for a [headless browser](https://github.com/dhamaniasad/HeadlessBrowsers)! – Juan Tomas Jul 07 '16 at 19:46
  • I've searched on Google but I can't find a headless browser that is easy to use from C++. Can you suggest one? I only need to save the content after the browser has processed it. – Carme Jul 08 '16 at 02:07
  • Sorry, I haven't worked with a headless browser in ages. Poking around on SO, I do find some people having success with ajax requests in curl. One way to find out what a web page is doing is to install a plugin in your regular browser that lets you see all HTTP traffic ("live headers"). Once you can see what the ajax is doing, it's easier to duplicate the behavior using curl. But automating scrapes of pages where most of the content comes via ajax is a hassle, no matter how you approach it. Good luck! – Juan Tomas Jul 08 '16 at 15:16

1 Answer

Curl downloads the HTML source of the page and nothing more; it does not run the page's JavaScript. When you open the HTML file in a web browser, the browser executes the scripts and sends whatever POST requests they make.

You need to find out what the POST request contains (i.e., the URL it targets, the data, and how that data is obtained), then send that request yourself and save the response.

You might want to look into this question: "How do you make a HTTP request with C++?"

Community