39

I have these parameters to download an XML file:

wget --http-user=user --http-password=pass http://www.example.com/file.xml

How do I use that in PHP to open this XML file?

Donatas Veikutis
  • @nl-x i think that the selected answer (@Jack) provides an URI with authentication that is not documented (and might be not operational?) : PHP documentation tells that you have to use `stream_context_create` when authentication is needed with `file_get_contents()` – Adam Mar 22 '15 at 07:45
  • 1
    @Adam It's definitely operational - [proof](http://lxr.php.net/xref/PHP_5_6/ext/standard/http_fopen_wrapper.c#535); and I don't see that particular requirement in the documentation. – Ja͢ck Mar 23 '15 at 03:12
  • @nl-x For auth basic, it works, but not for auth-digest. – Adam Mar 23 '15 at 08:36
  • try http://www.unix.com/shell-programming-and-scripting/126490-wget-xml-isssue.html – Ravi Chauhan Mar 24 '15 at 12:46

10 Answers

60

wget

wget is a Linux command, not a PHP command, so to run it you would need to use exec, which is a PHP function for executing shell commands.

exec("wget --http-user=[user] --http-password=[pass] http://www.example.com/file.xml");

This can be useful if you are downloading a large file and would like to monitor the progress. However, when you are only interested in a page's content, there are simpler functions for doing just that.

The exec function is enabled by default, but may be disabled in some situations. The configuration option for this resides in your php.ini; to enable exec, remove it from the disable_functions config string.
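Because exec hands its string to a shell, any user name, password, or URL that comes from outside the script should be quoted before it is interpolated. The following is a minimal sketch, assuming wget is installed and on the PATH; the helper names build_wget_command and run_wget and the destination path are made up for illustration.

```php
<?php
// Hypothetical helper: builds a wget command line with every value shell-quoted.
function build_wget_command(string $user, string $pass, string $url, string $dest): string
{
    return 'wget --quiet'
        . ' --http-user=' . escapeshellarg($user)
        . ' --http-password=' . escapeshellarg($pass)
        . ' --output-document=' . escapeshellarg($dest)
        . ' ' . escapeshellarg($url);
}

// Hypothetical wrapper: runs the command and reports success via wget's exit status.
function run_wget(string $user, string $pass, string $url, string $dest): bool
{
    exec(build_wget_command($user, $pass, $url, $dest), $output, $status);
    return $status === 0; // wget exits with 0 when the download succeeded
}
```

A call such as run_wget('user', 'pass', 'http://www.example.com/file.xml', '/tmp/file.xml') would then return true only when wget reported success.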

alternative

Using file_get_contents, we can retrieve the contents of the specified URL/URI. When you just need to read the file into a variable, this is the perfect function to use as a replacement for curl. Follow the URI syntax when building your URL.

// standard url
$content = file_get_contents("http://www.example.com/file.xml");

// or with basic auth
$content = file_get_contents("http://user:pass@www.example.com/file.xml");

As noted by Sean the Bean, you may also need to set allow_url_fopen to true in your php.ini to allow the use of a URL with this function; however, it should be true by default.
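If the credentials contain characters that are awkward to embed in a URL, one alternative is to send them in an Authorization header through a stream context. This is only a sketch for HTTP Basic authentication, and it still needs allow_url_fopen; the helper names basic_auth_context and fetch_with_basic_auth are hypothetical.

```php
<?php
// Hypothetical helper: stream context that sends HTTP Basic credentials in a header.
function basic_auth_context(string $user, string $pass, int $timeout = 10)
{
    return stream_context_create([
        'http' => [
            'method'  => 'GET',
            'header'  => 'Authorization: Basic ' . base64_encode($user . ':' . $pass),
            'timeout' => $timeout, // seconds
        ],
    ]);
}

// Hypothetical helper: returns the response body, or null when the request failed.
function fetch_with_basic_auth(string $url, string $user, string $pass): ?string
{
    // @ suppresses the warning file_get_contents raises on failure;
    // the false return value is checked instead.
    $content = @file_get_contents($url, false, basic_auth_context($user, $pass));
    return $content === false ? null : $content;
}
```

With these in place, fetch_with_basic_auth('http://www.example.com/file.xml', 'user', 'pass') plays the same role as the URL-with-credentials call above.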

If you then want to store that file locally, the function file_put_contents writes it to a file. Combined with the previous call, this emulates a file download:

file_put_contents("local_file.xml", $content);
Matt Clark
  • 3
    Excellent explanation, btw. I really appreciate this response. Prevented me from doing 2 more google searches. – ihaveitnow May 23 '14 at 08:15
  • 1
    Great explanation! Answers the question and provides an alternative that may be better, with examples of both. This should be the accepted answer. Also worth noting again that `allow_url_fopen` must be enabled in your php.ini or else `file_get_contents` will not accept a URL. – Sean the Bean May 20 '16 at 20:17
  • 1
    Thank for the feedback, I have actually had this issue myself before! I updated the answer to include this fact, thanks! :D – Matt Clark May 21 '16 at 05:12
38

If the aim is to just load the contents inside your application, you don't even need to use wget:

$xmlData = file_get_contents('http://user:pass@example.com/file.xml');

Note that this function will not work if allow_url_fopen is disabled (it's enabled by default) inside either php.ini or the web server configuration (e.g. httpd.conf).

If your host explicitly disables it or if you're writing a library, it's advisable to either use cURL or a library that abstracts the functionality, such as Guzzle.

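use GuzzleHttp\Client;

// Option names below are for Guzzle 6 and later
// (Guzzle 5 used 'base_url' and a 'defaults' array instead).
$client = new Client([
  'base_uri' => 'http://example.com',
  'auth'     => ['user', 'pass'],
]);

// get() returns a response object; cast its body to get the XML string.
$xmlData = (string) $client->get('/file.xml')->getBody();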
Ja͢ck
  • 1
    I think cURL is a better Option than file_get_contents. Because allow_url_fopen is always problem. From security point of view that may be '0' . – Ranjith Siji Mar 24 '15 at 05:30
  • The better option is to have it abstracted, actually :) – Ja͢ck Mar 25 '15 at 07:26
  • In addition to this answer, please refer to this post if you are working with large files. https://stackoverflow.com/questions/27492007/file-get-contents-large-file-upload – Oliver M Grech Jun 04 '19 at 11:43
11

You can use curl both to fetch the data and to authenticate (for both "basic" and "digest" auth), without requiring extended permissions (like exec or allow_url_fopen).

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/file.xml");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
curl_setopt($ch, CURLOPT_USERPWD, "user:pass");
$result = curl_exec($ch);
curl_close($ch);

Your result will then be stored in the $result variable.
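Note that curl_exec returns false when the transfer fails, and an HTTP error such as 401 still yields a body, so it can help to check both. Here is a sketch under the same assumptions as the snippet above; the function name curl_fetch and the timeout values are illustrative choices, not part of the original answer.

```php
<?php
// Hypothetical helper: fetches a URL with HTTP authentication and fails loudly.
function curl_fetch(string $url, string $user, string $pass): string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,          // return the body instead of printing it
        CURLOPT_HTTPAUTH       => CURLAUTH_ANY,  // let cURL pick basic or digest
        CURLOPT_USERPWD        => $user . ':' . $pass,
        CURLOPT_CONNECTTIMEOUT => 5,             // seconds (arbitrary choice)
        CURLOPT_TIMEOUT        => 30,            // seconds (arbitrary choice)
    ]);

    $body   = curl_exec($ch);
    $error  = curl_error($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope.

    if ($body === false) {
        throw new RuntimeException('Transfer failed: ' . $error);
    }
    if ($status >= 400) {
        throw new RuntimeException('Server answered with HTTP status ' . $status);
    }
    return $body;
}
```

Calling curl_fetch('http://www.example.com/file.xml', 'user', 'pass') inside a try/catch block then separates network failures and rejected credentials from a successful download.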

Adam
  • Note that even though it's not affected by allow_url_fopen, it does require the cURL extension to be enabled. – Ja͢ck Mar 24 '15 at 05:56
2

There are lots of ways to read a file in PHP, such as exec, file_get_contents, curl, and fopen; which one fits depends on your requirements and file permissions.

See also: file_get_contents vs cURL.

Basically, file_get_contents will work for you:

$data = file_get_contents($file_url);
Navneet Garg
2

If you are using cURL in PHP, you can also send browser-like headers with the request:

function disguise_curl($url) 
{ 
  $curl = curl_init(); 

  // Setup headers - I used the same headers from Firefox version 2.0.0.6 
  // below was split up because php.net said the line was too long. :/ 
  $header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,"; 
  $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5"; 
  $header[] = "Cache-Control: max-age=0"; 
  $header[] = "Connection: keep-alive"; 
  $header[] = "Keep-Alive: 300"; 
  $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7"; 
  $header[] = "Accept-Language: en-us,en;q=0.5"; 
  $header[] = "Pragma: "; // browsers keep this blank. 

  curl_setopt($curl, CURLOPT_URL, $url); 
  curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)'); 
  curl_setopt($curl, CURLOPT_HTTPHEADER, $header); 
  curl_setopt($curl, CURLOPT_REFERER, 'http://www.google.com'); 
  curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate'); 
  curl_setopt($curl, CURLOPT_AUTOREFERER, true); 
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
  curl_setopt($curl, CURLOPT_TIMEOUT, 10); 

  $html = curl_exec($curl); // execute the curl command 
  curl_close($curl); // close the connection 

  return $html; // and finally, return $html 
} 

// uses the function and displays the text off the website 
$url = "http://www.example.com/file.xml"; 
$text = disguise_curl($url); 
echo $text; 
Ravi Chauhan
2

This approach is a single class and doesn't require importing other libraries.

Personally, I use this script that I made a while ago, reproduced below. It lets the developer call the static method HTTP::GET($url, $options) to make a GET request through cURL while passing custom cURL options. You can also use HTTP::POST($url, $postfields, $options), but I hardly use that method.

/**
  *  echo HTTP::POST('http://accounts.kbcomp.co',
  *      array(
  *            'user_name'=>'demo@example.com',
  *            'user_password'=>'demo1234'
  *      )
  *  );
  *  OR
  *  echo HTTP::GET('http://api.austinkregel.com/colors/E64B3B/1');
  *                  
  */

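class HTTP{
   // Perform a GET request. $options may carry extra cURL options,
   // e.g. array(CURLOPT_USERPWD => 'user:pass') for HTTP authentication.
   public static function GET($url, array $options = array()){
       $ch = curl_init();
       // The + operator keeps caller-supplied options when keys clash.
       $options = $options + array(
          CURLOPT_URL => $url,
          CURLOPT_RETURNTRANSFER => true // return the body instead of printing it
       );
       curl_setopt_array($ch, $options);
       $body = curl_exec($ch);
       curl_close($ch);
       return $body;
   }
   // Perform a POST request with the given fields.
   public static function POST($url, $postfields, array $options = array()){
       $ch = curl_init();
       $options = $options + array(
          CURLOPT_URL => $url,
          CURLOPT_RETURNTRANSFER => true,
          CURLOPT_POSTFIELDS => $postfields,
          CURLOPT_HEADER => true // include response headers in the returned string
          //CURLOPT_HTTPHEADER => array('Content-Type:application/json')
       );
       curl_setopt_array($ch, $options);
       $body = curl_exec($ch);
       curl_close($ch);
       return $body;
   }
}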
Austin Kregel
  • You don't need to set `CURLOPT_CUSTOMREQUEST` for something standard like POST; just set `CURLOPT_POSTFIELDS` and it will automatically set the right method ... also, you could use [`curl_setopt_array()`](http://php.net/curl_setopt_array) instead of loops. – Ja͢ck Mar 25 '15 at 07:29
  • That's very true, I made this when I first learned about curl and how it works. At the time, I didn't know about `curl_setopt_array()`, I'll update my answer just because I have been to lazy to update my actual script to use this method :3 – Austin Kregel Mar 25 '15 at 14:10
1

ShellWrap is a great tool for using the command line in PHP!

Your example can be written in a short, readable way:

use MrRio\ShellWrap as sh;

$xml = (string)sh::curl(['u' => 'user:pass'], 'http://example.com/file.xml');
Limon Monte
0

I understand you want to open an XML file using PHP. That is called parsing an XML file. The best reference is here:

http://php.net/manual/de/function.xml-parse.php
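For small documents, SimpleXML is often a shorter route than the event-based xml_parse function referenced above. The sketch below swaps in simplexml_load_string for that purpose; the helper name parse_xml and the sample document are made up and stand in for the downloaded file.

```php
<?php
// Hypothetical helper: parses an XML string and throws on malformed input.
function parse_xml(string $xml): SimpleXMLElement
{
    // Collect libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc      = simplexml_load_string($xml);
    $errors   = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false) {
        $message = $errors ? trim($errors[0]->message) : 'unknown error';
        throw new InvalidArgumentException('Invalid XML: ' . $message);
    }
    return $doc;
}

// Made-up sample standing in for the contents of file.xml.
$sample = '<items><item id="1">first</item><item id="2">second</item></items>';

foreach (parse_xml($sample)->item as $item) {
    // Attributes are read with array syntax, text content by string conversion.
    echo $item['id'], ': ', $item, PHP_EOL;
}
```

In practice the string passed to parse_xml would be whatever the download step returned.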

0
<?php
function wget($address,$filename)
{
  file_put_contents($filename,file_get_contents($address));
}
?>

use:

<?php
wget(URL, FileName);
?>
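Since file_get_contents reads the whole response into memory before file_put_contents writes it out, a large file may be better copied as a stream. This is a sketch of that variant using fopen and stream_copy_to_stream; the function name stream_download is hypothetical, and the optional $context parameter is assumed to be a stream context (for example one carrying an Authorization header).

```php
<?php
// Hypothetical helper: copies $source to $destination chunk by chunk
// and returns the number of bytes written.
function stream_download(string $source, string $destination, $context = null): int
{
    // Only pass the context argument when one was supplied.
    $in = $context === null
        ? @fopen($source, 'rb')
        : @fopen($source, 'rb', false, $context);
    if ($in === false) {
        throw new RuntimeException('Cannot open source: ' . $source);
    }

    $out = @fopen($destination, 'wb');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException('Cannot open destination: ' . $destination);
    }

    // Copies in internal chunks, so memory use stays small even for large files.
    $bytes = stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);

    if ($bytes === false) {
        throw new RuntimeException('Copy failed');
    }
    return $bytes;
}
```

A call like stream_download('http://www.example.com/file.xml', 'local_file.xml') still requires allow_url_fopen for remote sources.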
  • 1
    Please add some explanation of how this answer helps the OP or others fix the issue, instead of posting just code as an answer. Thanks – ρяσѕρєя K May 09 '16 at 10:33
  • The question requires the use of http authentication which this answer does not cover. Re-implementing `wget` in PHP should be avoided if `wget` already exists on the system. – apokryfos May 09 '16 at 12:18
  • Bad Practice, file_get_contents will get all the contents in memory prior of writing it with file_put_contents, thus could exhaust the server with large files – Oliver M Grech Jun 04 '19 at 11:38
0

To run the wget command from PHP, follow these steps:

1) Allow the Apache user to run wget through sudo by adding it to the sudoers list.

2) Check that the "exec" function exists and is enabled in your PHP configuration.

3) Run wget via "exec" as root, i.e. through sudo.

The code sample below is for an Ubuntu machine:

#Add apache in sudoers list to use wget command
~$ sudo nano /etc/sudoers
#add below line in the sudoers file
www-data ALL=(ALL) NOPASSWD: /usr/bin/wget


##Now in PHP file run wget command as 
exec("/usr/bin/sudo wget -P PATH_WHERE_WANT_TO_PLACE_FILE URL_OF_FILE");
vinod