
I am working on a script in which I am able to browse to a URL, but I am not able to copy the web content and download it as a file. This is what I have so far:

$url = "http://sp-fin/sites/arindam-sites/_layouts/xlviewer.aspx?listguid={05DA1D91-F934-4419-8AEF-B297DB81A31D}&itemid=4&DefaultItemOpen=1"
$ie=new-object -com internetexplorer.application
$ie.visible=$true
$ie.navigate($url)
while($ie.busy) {start-sleep 1} 

How can I copy the content of $url and save it to local drive as a file?

Update:

I got these errors:

Exception calling "DownloadFile" with "2" argument(s): "The remote server returned an error: (401) Unauthorized." At :line:6 char:47 + (New-Object system.net.webclient).DownloadFile( <<<< "$url/download-url-content", 'save.html' )

Missing ')' in method call. At :line:6 char:68 + (New-Object system.net.webclient).DownloadFile( "$url", 'save.html' <<<<

Exception calling "DownloadFile" with "2" argument(s): "The remote server returned an error: (401) Unauthorized." At :line:6 char:47 + (New-Object system.net.webclient).DownloadFile( <<<< "$url", 'save.html' )

Ok, let me explain more about what I am trying to do: I have an Excel file on our SharePoint site, and this is the file I am trying to download locally (in any format) as part of the script, so that later in the script I can compare this file with other data and produce an output.

Alternatively, if I can somehow map "My Documents" from the site and download the file that way, that will also work for me.
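
For the mapping route, something like the following sketch might work, assuming the SharePoint server exposes document libraries over WebDAV; the UNC path and file name below are guesses derived from the site URL and would need adjusting:

# Map the document library as a drive via WebDAV
# (UNC path is a guess -- adjust to your site's actual library)
net use Z: "\\sp-fin\DavWWWRoot\sites\arindam-sites\Shared Documents"

# Copy the Excel file locally like any other file
Copy-Item -Path "Z:\myfile.xlsx" -Destination "C:\temp\myfile.xlsx"

# Remove the mapping when done
net use Z: /delete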

Arindam
    Your answers to this post should not be there, that's why somebody voted them down. Instead of that you should edit the original question. – stej Jan 05 '10 at 13:28

7 Answers


Update Jan 2014: With PowerShell v3, released with Windows 8, you can do this:

 (Invoke-WebRequest -Uri "http://www.kernel.org").Content

Original Post, valid for Powershell Version 2

This solution is very similar to the other answers from stej, Jay Bazuzi and Marco Shaw. It is a bit more general, in that it installs a new module, psurl, into your module directory. The psurl module adds new commands in case you have to do a lot of HTML fetching (and POSTing) with PowerShell.

(new-object Net.WebClient).DownloadString("http://psget.net/GetPsGet.ps1") | iex

See the homepage of the code-sharing website http://psget.net/.

This nice line of PowerShell script will download GetPsGet.ps1 and send it to Invoke-Expression to install the PsGet module.

Then install PsUrl, a PowerShell module inspired by curl. To install something (in our case PsUrl) from the central directory, just type:

install-module PsUrl

get-module -name psurl

Output:

ModuleType Name                      ExportedCommands
---------- ----                      ----------------
Script     psurl                     {Get-Url, Send-WebContent, Write-Url, Get-WebContent}

Command:

get-command -module psurl

Output:

CommandType     Name                                                Definition
-----------     ----                                                ----------
Function        Get-Url                                             ...
Function        Get-WebContent                                      ...
Alias           gwc                                                 Get-WebContent
Function        Send-WebContent                                     ...
Alias           swc                                                 Send-WebContent
Function        Write-Url                                           ...

You need to do this only once.

Note that this error might occur:

Q: Error "File xxx cannot be loaded because the execution of scripts is disabled on this system. Please see "get-help about_signing" for more details."

A: By default, PowerShell restricts execution of all scripts. This is all about security. To "fix" this, run PowerShell as Administrator and call

Set-ExecutionPolicy RemoteSigned

From now on, in your new PowerShell sessions/scripts, do this:

import-module psurl
get-url "http://www.google.com"

To download and save to a file, do this:

get-url "http://www.google.com" | out-file -filepath "myfile.html"
knb
  • For the PowerShell 3+ solution, this works for saving to a file: `Invoke-WebRequest -URI "http://www.kernel.org" -OutFile file.txt` – Phil Deets Feb 11 '23 at 04:21

As I understand it, you are trying to use IE because it automatically sends your credentials (or maybe you didn't know of any other option).

The answers above don't work because you are trying to download a file from SharePoint while sending an unauthenticated request. The response is 401 Unauthorized.

This works:

PS>$wc=new-object system.net.webclient
PS>$wc.UseDefaultCredentials = $true
PS>$wc.downloadfile("your_url","your_file")

provided the current PowerShell user has rights to download the file (i.e., is the same user that is logged in via IE).

If not, try this:

PS>$wc=new-object system.net.webclient
PS>$wc.Credentials = Get-Credential
PS>$wc.downloadfile("your_url","your_file")
stej
    This absolutely is the answer to the Unauthorized issue...not sure why the other answers have more up votes – w4ik Dec 14 '11 at 16:00
  • Is there any way to pass the username and password without interactively requesting to user on runtime? – arvindwill Jan 12 '14 at 06:12
  • Yes, you can export the credentials to the disk, but then.. it's not too secure, you know. – stej Jan 15 '14 at 07:55
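
One common pattern for exporting the credential (with the security caveat stej mentions) is Export-Clixml, which encrypts the password via DPAPI so the file is readable only by the same user on the same machine. A sketch, with placeholder paths and URLs:

# One-time: prompt once and save the credential to disk
Get-Credential | Export-Clixml -Path "C:\temp\cred.xml"

# Later, in the script: load it back and use it non-interactively
$cred = Import-Clixml -Path "C:\temp\cred.xml"
$wc = New-Object System.Net.WebClient
$wc.Credentials = $cred.GetNetworkCredential()
$wc.DownloadFile("your_url", "your_file")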

If you just want to download web content, use

(New-Object System.Net.WebClient).DownloadFile( '<url-to-download>', 'save.html' )
Jay Bazuzi

I'm not aware of any way to save using that interface.

Does this render the page properly:

PS>$wc=new-object system.net.webclient
PS>$wc.downloadfile("your_url","your_file")
Marco Shaw

As already answered in https://stackoverflow.com/a/35202299/4636579, but there with a mandatory proxy and credentials. Without a proxy, it would be:

$url = "http://aaa.bbb.ccc.ddd/rss.xml"
$path = "C:\Users\hugo\xml\test.xml"

$WebClient = New-Object Net.WebClient
$WebClient.DownloadFile($url, $path)
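
For completeness, the proxy variant from the linked answer would look roughly like this; the proxy address is a placeholder, and DefaultCredentials assumes the proxy accepts your Windows login:

$url = "http://aaa.bbb.ccc.ddd/rss.xml"
$path = "C:\Users\hugo\xml\test.xml"

$WebClient = New-Object Net.WebClient
# Placeholder proxy address -- replace with your proxy host:port
$WebClient.Proxy = New-Object System.Net.WebProxy("http://proxy.example.com:8080")
$WebClient.Proxy.Credentials = [System.Net.CredentialCache]::DefaultCredentials
$WebClient.DownloadFile($url, $path)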
Coliban
$web = New-Object Net.WebClient
$web | Get-Member

$content = $web.DownloadString("http://www.bing.com")

If you're truly only concerned with the raw string content, the best route, as mentioned by a few others, is using the constructs within .NET to do this. However, I think in the previous answers a few opportunities are missed.

  • It's often best to use WebRequest over WebClient as it provides better control over the entire request cycle
  • Response buffering via System.IO.StreamReader, made possible by using WebRequest
  • Creating a testable, reusable tool, which is the very nature and purpose of PowerShell

function Get-UrlContent {
    <#
    .SYNOPSIS
        High performance url fetch

    .DESCRIPTION
        Given a url, will return raw content as string.

        Uses: 
        System.Net.HttpWebRequest
        System.IO.Stream
        System.IO.StreamReader

    .PARAMETER Url
        Defines the url to download

    .OUTPUTS
        System.String

    .EXAMPLE
        PS C:\> Get-UrlContent "https://www.google.com"
        "<!doctype html>..."
    #>

    [cmdletbinding()]
    [OutputType([String])]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [ValidateNotNullOrEmpty()]
        [string] $Url)

    Write-Debug "`n----- [Get-UrlContent]`n$url`n------`n`n"

    $req = [System.Net.WebRequest]::CreateHttp($url)

    try {
        $resp = $req.GetResponse()    
    }
    catch {
        Write-Debug "`n------ [Get-UrlContent]`nDownload failed: $url`n------`n"
    }
    finally {
        if ($resp) {
            $st = $resp.GetResponseStream()
            $rd = [System.IO.StreamReader]$st

            $rd.ReadToEnd()     
        }

        if ($rd) { $rd.Close() }
        if ($st) { $st.Close() }
        if ($resp) { $resp.Close() }   
    }
}
pim