
I am trying to automate the download of a report from http://www.wesm.ph. However, the page generates the file on demand, and the download URL looks something like this:

http://www.wesm.ph/download.php?download=TUJBT1JURF8yMDE3LTA3LTI2XzIwMTctMDctMjZfR19MVVpPTi5jc3Y=

Is it possible to automate this? Thank you.

Xehanort
  • The above `download` parameter is a Base64-encoded string. This decodes to `MBAORTD_2017-07-26_2017-07-26_G_LUZON.csv`. Assuming that you **know** exactly what the target files are going to be, then yes, it would be possible to build a script to encode the target file as Base64 and then affix it to the URL :) – Obsidian Age Aug 03 '17 at 04:52
  • Sorry, I am quite new with this but how do I do this? I suppose the parameter building will be the most challenging part of this. – Xehanort Aug 03 '17 at 05:42

1 Answer


This script will grab the latest file.

Set $folderPath to the folder where the CSV files are saved.

$folderPath = 'C:\Users\Michael\Downloads'

# If I can't download something from the last two weeks 
# then use this flag to track the error. 
$downloadSucceeded = $false

function giveBinaryEqualFile ([string] $myInput, [string] $fileName)
{
    $Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
    [System.IO.File]::WriteAllText($fileName, $myInput, $Utf8NoBomEncoding)
}

function generateAddress ([DateTime] $myDate)
{
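    $localTimeZone = [System.TimeZoneInfo]::Local
    # The Philippines is UTC+8 with no daylight saving time, which matches
    # the Windows time zone ID "China Standard Time".
    $PhilippineTimeZone = [System.TimeZoneInfo]::FindSystemTimeZoneById("China Standard Time")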

    $PhilippineNow = [System.TimeZoneInfo]::ConvertTime($myDate, $localTimeZone, $PhilippineTimeZone)

    # Address
    $address = "http://www.wesm.ph/download.php?download="

    # What is the file name? 
    $dateInName = Get-Date -Date $PhilippineNow -Format 'yyyy-MM-dd'
    $nameInURL = "MBAORTD_{0}_{0}_G_LUZON.csv" -f $dateInName
    $fileName  =     "RTD_{0}_{0}_G_LUZON.csv" -f $dateInName

    # Base64 Encode
    $byteArray = [System.Text.Encoding]::UTF8.GetBytes($nameInURL)
    $encodedFileName = [System.Convert]::ToBase64String($byteArray)

    # URL
    $url = $address + $encodedFileName

    # Object 
    $properties = @{
        'address'  = $url
        'fileName' = $fileName
    }

    New-Object PSObject -Property $properties
}


# Try to download the latest file. 
# Search the last two weeks. 
:latest for($i=0; $i -ge -14; $i--)
{
    $localNow = (Get-Date).AddDays($i)
    $name = generateAddress $localNow
    $myRequest = Invoke-WebRequest -Uri $name.address

    # Skip this URL if the file length is zero. 
    if ($myRequest.RawContentLength -eq 0)
    {  continue latest  }

    # Skip this URL if we get the 404 page. 
    foreach ($element in $myRequest.AllElements) 
    {   
        if ($element.class -eq 'error')
        {  continue latest  }
    }

    # We did not see an error message. 
    # We must have the file. 

    # Save the file. 
    $myPath = Join-Path $folderPath ($name.fileName)
    if (Test-Path -Path $myPath )
    {
        Write-Host "$($name.fileName) already exists. Exiting. "
        exit
    }
    else
    {  giveBinaryEqualFile ($myRequest.Content) $myPath  }

    # Record success. 
    $downloadSucceeded = $true

    # Leave the loop. 
    break latest
}

if ($downloadSucceeded)
{
    Write-Host "The download succeeded."
    Write-Host "File Name: $($name.fileName)"
}
else
{
    Write-Host "The download failed."
    Write-Host "No files available from the last two weeks. "
}

Downloading the file with a web browser and downloading it through an HtmlWebResponseObject produce different files. The content is the same, but the encoding differs, and PowerShell's formatters add a trailing newline. So I removed the BOM and the newline. You can reuse my giveBinaryEqualFile() function to fix similar formatting problems in other scripts.
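To see the BOM difference in isolation, here is a minimal sketch. The sample text and the temp files are made up for illustration and are not part of the script above. It writes the same string twice with `System.Text.UTF8Encoding`, once without and once with a BOM, and compares the bytes on disk.

```powershell
# Illustrative sample text (an assumption); any string would do.
$text = "a,b`n1,2"

# $false = do not emit a byte order mark, $true = emit one.
$noBom   = New-Object System.Text.UTF8Encoding $false
$withBom = New-Object System.Text.UTF8Encoding $true

$pathNoBom   = [System.IO.Path]::GetTempFileName()
$pathWithBom = [System.IO.Path]::GetTempFileName()

# WriteAllText writes the string exactly as given, with no added newline.
[System.IO.File]::WriteAllText($pathNoBom,   $text, $noBom)
[System.IO.File]::WriteAllText($pathWithBom, $text, $withBom)

$bytesNoBom   = [System.IO.File]::ReadAllBytes($pathNoBom)
$bytesWithBom = [System.IO.File]::ReadAllBytes($pathWithBom)

# The BOM variant is three bytes longer; those bytes are EF BB BF.
$bomHex = ($bytesWithBom[0..2] | ForEach-Object { $_.ToString('X2') }) -join ' '
Write-Host "Without BOM: $($bytesNoBom.Length) bytes"
Write-Host "With BOM:    $($bytesWithBom.Length) bytes, starting with $bomHex"

Remove-Item $pathNoBom, $pathWithBom
```

The three extra bytes at the start are what made the saved file differ from the browser download.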

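Make sure we're using the Philippine time zone: the script converts the local time to UTC+8 before it builds the file name, so the date in the name matches the date on the server.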
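As a rough illustration of why the conversion matters, the sketch below takes a made-up UTC timestamp late in the evening and shows that the calendar date is already one day ahead in the Philippines. The timestamp is an assumption chosen for the example, and the `Asia/Manila` fallback is my addition for systems that do not recognise Windows time zone IDs.

```powershell
# Windows uses the ID "China Standard Time" for this UTC+8 zone.
# Fall back to the IANA ID (an assumption for non-Windows systems).
try {
    $phZone = [System.TimeZoneInfo]::FindSystemTimeZoneById('China Standard Time')
}
catch {
    $phZone = [System.TimeZoneInfo]::FindSystemTimeZoneById('Asia/Manila')
}

# Hypothetical moment: 25 July 2017, 18:30 UTC.
$utc = New-Object System.DateTime -ArgumentList 2017, 7, 25, 18, 30, 0, ([System.DateTimeKind]::Utc)
$phTime = [System.TimeZoneInfo]::ConvertTimeFromUtc($utc, $phZone)

# Format with the invariant culture so the result does not depend on local settings.
$invariant = [System.Globalization.CultureInfo]::InvariantCulture
$utcDate = $utc.ToString('yyyy-MM-dd', $invariant)
$phDate  = $phTime.ToString('yyyy-MM-dd', $invariant)

Write-Host "UTC date:        $utcDate"   # 2017-07-25
Write-Host "Philippine date: $phDate"    # 2017-07-26
```

Without the conversion, a machine in an earlier time zone could ask for a file dated one day too early.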

Base64-encode the file name and append it to the URL, per Obsidian Age's comment.
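If you want to check the encoding step on its own, a small round trip like this should do. It decodes the `download` parameter from the question and then re-encodes the result; the variable names are just for this sketch.

```powershell
# The download parameter copied from the URL in the question.
$encoded = 'TUJBT1JURF8yMDE3LTA3LTI2XzIwMTctMDctMjZfR19MVVpPTi5jc3Y='

# Base64 -> bytes -> text
$bytes   = [System.Convert]::FromBase64String($encoded)
$decoded = [System.Text.Encoding]::UTF8.GetString($bytes)
Write-Host $decoded      # MBAORTD_2017-07-26_2017-07-26_G_LUZON.csv

# Text -> bytes -> Base64, the same two calls generateAddress uses.
$reencoded = [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($decoded))
Write-Host ($reencoded -ceq $encoded)   # True
```

`-ceq` is the case-sensitive comparison, which matters here because Base64 distinguishes upper and lower case.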

And use a label (`:latest`) so that `continue` and `break` inside the nested `foreach` act on the outer loop.
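For anyone who has not used labels before, here is a stripped-down sketch of the same idea. The label name `outer` and the row/column numbers are made up for the example.

```powershell
$visited = @()

:outer foreach ($row in 1..3)
{
    foreach ($col in 1..3)
    {
        # Abandon the rest of row 1 and move on to the next row.
        if ($row -eq 1 -and $col -eq 2)
        {  continue outer  }

        # Stop both loops as soon as row 3 is reached.
        if ($row -eq 3)
        {  break outer  }

        $visited += "$row,$col"
    }
}

Write-Host ($visited -join ' ')   # 1,1 2,1 2,2 2,3
```

A plain `continue` or `break` would only affect the inner `foreach`, which is why the download script needs the label to skip to the next date.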

  • Wow, this just works perfectly. I just need to adjust the date since the report is delayed by around a week. Thank you very much! – Xehanort Aug 04 '17 at 00:17
  • No adjustment needed. I noticed that this would get the latest files from the last two weeks. You're the real MVP! – Xehanort Aug 04 '17 at 00:24