I want to download the files that are listed with checkboxes at http://www.pse.com.ph/stockMarket/marketInfo-marketActivity.html?tab=4 but I don't know what to add to my code.

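<?php

// URL to download from.
$dlurl = 'http://www.pse.com.ph/stockMarket/marketInfo-marketActivity.html?tab=4';

$saveTo = 'C:\Users\Test\Desktop\phpfiles\datena.pdf';

// Open the target file for binary writing.
$fp = fopen($saveTo, 'wb');

if ($fp === false) {
    throw new Exception('Could not open: ' . $saveTo);
}

$ch = curl_init($dlurl);

// Write the response body straight into the file.
curl_setopt($ch, CURLOPT_FILE, $fp);

if (curl_exec($ch) === false) {
    throw new Exception('cURL error: ' . curl_error($ch));
}

fclose($fp);

?>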
Ahmed Ashour
  • The link is very slow to load. Are you possibly timing out? Do you see any curl errors in your script or other symptoms of error? – Milo LaMar Apr 04 '17 at 02:17
  • The link was OK when I tested it. The files I have to download are selected with checkboxes, and I don't know what to add to my code. – Newbieprog Apr 04 '17 at 02:40

1 Answer

That page is really a form. When you press the download button, it submits the form with the GET method, which sends the browser to a new URL like this:

http://www.pse.com.ph/stockMarket/marketInfo-marketActivity-marketReports.html?ajax=true&method=downloadMarketReports&ids=[%22PSE_DQTRT20173306%22]

The ids parameter contains the ids of one or more documents. If you select only one checkbox, the PDF is downloaded directly; if you select more than one, the server gives you a zip with all the selected documents.

In your code, you should change the URL to this download URL, with the ids of the documents you want.
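A minimal sketch of that change, assuming the endpoint and parameters behave as in the URL above. The names buildDownloadUrl and downloadTo are just illustrative helpers, and encoding several ids as a JSON array is my guess from the single-id form; I have not verified it against the server.

```php
<?php
// Endpoint copied from the download URL shown above.
const PSE_DOWNLOAD_ENDPOINT =
    'http://www.pse.com.ph/stockMarket/marketInfo-marketActivity-marketReports.html';

/**
 * Build the download URL for one or more report ids.
 * Assumption: ids are sent as a JSON array, e.g. ["PSE_DQTRT20173306"].
 */
function buildDownloadUrl(array $ids): string
{
    $query = http_build_query([
        'ajax'   => 'true',
        'method' => 'downloadMarketReports',
        'ids'    => json_encode(array_values($ids)),
    ]);

    return PSE_DOWNLOAD_ENDPOINT . '?' . $query;
}

/**
 * Download $url into the file $saveTo using cURL.
 */
function downloadTo(string $url, string $saveTo): void
{
    $fp = fopen($saveTo, 'wb');
    if ($fp === false) {
        throw new RuntimeException('Could not open: ' . $saveTo);
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);             // write the body into the file
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects, if any
    curl_setopt($ch, CURLOPT_FAILONERROR, true);     // treat HTTP >= 400 as failure

    $ok    = curl_exec($ch);
    $error = curl_error($ch);
    fclose($fp);

    if ($ok === false) {
        throw new RuntimeException('Download failed: ' . $error);
    }
}

// Usage (performs a real HTTP request, so it is left commented out):
// downloadTo(buildDownloadUrl(['PSE_DQTRT20173306']),
//            'C:\Users\Test\Desktop\phpfiles\datena.pdf');
```

http_build_query percent-encodes the brackets and quotes as well, which the server should decode to the same value as in the URL above. Remember that with more than one id you get a zip, so the target file name should end in .zip in that case.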

I assume you want to download all the documents, right?

  1. First, download the web page itself with cURL.

  2. Then parse the page, for example with regular expressions, to find the id of each document (see "PHP Parse HTML code").

  3. Once you have the ids, make a new cURL request (like the one in step 1) to the download URL, passing the desired ids as in the example URL above.
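The steps above could be sketched roughly like this. The helper names fetchPage and extractReportIds are made up for the example, and the regular expression assumes every id looks like PSE_DQTRT20173306 (the prefix PSE_, some capital letters, then digits) and actually appears in the HTML that cURL receives. If the page fills in the ids with JavaScript, this will find nothing.

```php
<?php
/**
 * Step 1: fetch a page and return its body as a string.
 */
function fetchPage(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects, if any

    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('Could not fetch page: ' . curl_error($ch));
    }

    return $html;
}

/**
 * Step 2: collect every distinct report id found in the HTML.
 * Assumption: ids match PSE_<CAPITALS><digits>, like PSE_DQTRT20173306.
 */
function extractReportIds(string $html): array
{
    preg_match_all('/PSE_[A-Z]+\d+/', $html, $matches);

    // Drop duplicates but keep the order in which the ids first appear.
    return array_values(array_unique($matches[0]));
}

// Step 3, usage (performs real HTTP requests, so it is left commented out):
// $html = fetchPage('http://www.pse.com.ph/stockMarket/marketInfo-marketActivity.html?tab=4');
// $ids  = extractReportIds($html);
// $url  = 'http://www.pse.com.ph/stockMarket/marketInfo-marketActivity-marketReports.html'
//       . '?ajax=true&method=downloadMarketReports&ids=' . rawurlencode(json_encode($ids));
// Then request $url with cURL and CURLOPT_FILE, as in the question's code.
```

A regular expression is enough here because only the id strings matter, not the surrounding markup. If you need to read other attributes of the checkboxes, a DOM parser such as DOMDocument is usually more robust.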

NetVicious
  • I'm trying to download just a single file. The values of the checkboxes are hidden, and from a little research it seems I should use jQuery to find hidden attributes. But if you have a better idea, can you please help me? – Newbieprog Apr 06 '17 at 09:51
  • I'll try this when I get home, and I'll check whether I can find the hidden value using an HTML DOM parser. Thanks in advance. – Newbieprog Apr 06 '17 at 14:48
  • As you say, the ids are hidden. They seem to be consecutive numbers: today it's PSE_DQTRT20173306 and yesterday it was PSE_DQTRT20173305. You should check tomorrow whether the id changes. If it does, you can work out the next id each time from the previous one, which you keep saved somewhere (a file or a database). – NetVicious Apr 06 '17 at 15:45
  • Thank you for giving me an idea. :) – Newbieprog Apr 06 '17 at 23:04
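
For anyone trying the consecutive-id idea from the comments, here is a small sketch. It assumes the id really does grow by exactly 1 per report, which is only a guess from two observed values, so check it over a few days (weekends and holidays may behave differently). nextReportId is a hypothetical helper name.

```php
<?php
/**
 * Return the id that follows $lastId, assuming the trailing number
 * simply increases by 1 (e.g. ...3305 -> ...3306).
 */
function nextReportId(string $lastId): string
{
    // Split the id into a non-numeric prefix and its trailing digits.
    if (!preg_match('/^(.*?)(\d+)$/', $lastId, $m)) {
        throw new InvalidArgumentException('No trailing number in id: ' . $lastId);
    }

    $width = strlen($m[2]);
    $next  = (string) ((int) $m[2] + 1);

    // Keep the original digit count in case the number has leading zeros.
    return $m[1] . str_pad($next, $width, '0', STR_PAD_LEFT);
}

// Usage, keeping the last id in a text file (file name is only an example):
// $last = trim(file_get_contents('last_id.txt'));
// $next = nextReportId($last);
// file_put_contents('last_id.txt', $next);
```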