I am trying to save some website data to a text file, but the data I need is not loaded until about 10 seconds after the page opens, once the video feed has loaded. When I run curl, it grabs the initial page and exits, which is not the data I need. I need the data that appears 10-15 seconds after the initial page load.

  curl -L  http://mywebsite.com > output.txt

Is there a way to accomplish this? Thank you

matrixebiz
  • If this data is loaded via JavaScript, you will not be able to get it via cURL. Consider using something like a headless browser. Possible duplicate of [How to get webcontent that is loaded by JavaScript using cURL?](https://stackoverflow.com/questions/20554113/how-to-get-webcontent-that-is-loaded-by-javascript-using-curl) – My Head Hurts Nov 20 '18 at 15:05
  • Hello, no, it is just loading an m3u8 video, but it takes a bit to start after the initial page has loaded – matrixebiz Nov 20 '18 at 15:07
  • If you turn off JavaScript in your browser and visit the webpage, does everything work as you expect? – My Head Hurts Nov 20 '18 at 15:13
  • Ah, no, it doesn't. It doesn't get past the initial page load to the point where the video plays. Do I have to use something else? – matrixebiz Nov 20 '18 at 15:21
  • I would look into using a headless browser. It might sound intimidating, but they are not too scary once you get into them. They are available in a few different languages; there are some examples here: https://github.com/dhamaniasad/HeadlessBrowsers – My Head Hurts Nov 20 '18 at 15:29
  • Okay, so which one would I use, and how would it work from the command line? As a workaround, I am currently using a Chrome command to load the browser and export all the data to a text file, then killing the Chrome process after 20 seconds. The text file output from Chrome is a lot bigger, but it now has all the info I am looking for. I want to get away from this browser-loading batch file and pull the data in a less intrusive way. – matrixebiz Nov 20 '18 at 16:01
  • I'll try PhantomJS and see if it will do what I want. – matrixebiz Nov 20 '18 at 16:12
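
The headless-browser approach suggested in the comments could be sketched from the command line roughly as follows. This is only a sketch under assumptions, not a tested solution for this site: it assumes a Chrome or Chromium build with headless support is installed, and `CHROME_BIN` and `fetch_rendered` are hypothetical names introduced here for illustration. Unlike curl, it lets the page's JavaScript run for a while before the DOM is written out.

```shell
#!/bin/sh
# Sketch: dump the page's DOM after its JavaScript has had time to run.
# CHROME_BIN is a hypothetical override; the binary name varies by system
# (google-chrome, chromium, chromium-browser, chrome.exe, ...).
CHROME_BIN="${CHROME_BIN:-google-chrome}"

# fetch_rendered URL OUTFILE [WAIT_MS]
#   URL      page to load
#   OUTFILE  file that receives the serialized DOM
#   WAIT_MS  virtual-time budget in milliseconds (default: 15000)
fetch_rendered() {
    url=$1
    outfile=$2
    wait_ms=${3:-15000}

    # --headless            run without opening a browser window
    # --virtual-time-budget let page scripts run for up to WAIT_MS of virtual time
    # --dump-dom            print the resulting DOM to stdout
    "$CHROME_BIN" --headless --disable-gpu \
        --virtual-time-budget="$wait_ms" \
        --dump-dom "$url" > "$outfile"
}
```

It would be called as `fetch_rendered http://mywebsite.com output.txt 15000`. One caveat: `--dump-dom` only captures what ends up in the page markup, so if the m3u8 URL is only ever requested over the network and never written into the DOM, it may not appear in the output and a scripted headless browser would be needed instead.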
