
I'm able to download the file successfully using the following curl command.

curl -u user:pass -k "https://website.com/remote/path/remotefile.zip" -o /local/path/file.zip

Ideally, I'd like to automate this by getting the latest file programmatically. It's possible for more than one file to be uploaded per day. Each file's name is prefixed with a timestamp and there are multiple files within the directory.

Example of directory contents and filenames:

20191102230243952_Appended-Constant_Filename.zip
20191103040135476_Appended-Constant_Filename.zip
20191103100132360_Appended-Constant_Filename.zip

Update, following @FedonKadifeli's comment: a request to the directory containing the files returns an HTML page listing them.

curl -u user:pass -k "https://website.com/remote/path"

HTML Output

[...]
<table>
    <tr>
        <td align="left">&nbsp;&nbsp;
            <a href="/remote/path/20191102230243952_Appended-Constant_Filename.zip"><tt>20191102230243952_Appended-Constant_Filename.zip</tt></a>
        </td>
        <td align="right"><tt>66.6 kb</tt></td>
        <td align="right"><tt>Sun, 03 Nov 2019 06:02:44 GMT</tt></td>
    </tr>
    <tr bgcolor="#eeeeee">
        <td align="left">&nbsp;&nbsp;
            <a href="/remote/path/20191103040135476_Appended-Constant_Filename.zip"><tt>20191103040135476_Appended-Constant_Filename.zip</tt></a>
        </td>
        <td align="right"><tt>66.6 kb</tt></td>
        <td align="right"><tt>Sun, 03 Nov 2019 12:01:35 GMT</tt></td>
    </tr>
    <tr>
        <td align="left">&nbsp;&nbsp;
            <a href="/remote/path/20191103100132360_Appended-Constant_Filename.zip"><tt>20191103100132360_Appended-Constant_Filename.zip</tt></a>
        </td>
        <td align="right"><tt>66.5 kb</tt></td>
        <td align="right"><tt>Sun, 03 Nov 2019 18:01:32 GMT</tt></td>
    </tr>
</table>
[...]
Blaine
  • This seems only possible if you can list the contents of the `path` folder. Does the request for URL `https://website.com/remote/path/` return the correct file list? – FedKad Nov 03 '19 at 20:17
  • @FedonKadifeli It returns an HTML listing of all files within the directory. See update to question. – Blaine Nov 03 '19 at 20:49
  • Something doesn't match. The file was named /remote/path/remotefile.zip, but the listing shows /remote/path20191103040135476_Appended-Constant_Filename.zip. – dash-o Nov 03 '19 at 21:16
  • If the server is running Apache, you might have additional options for the listing: ?C=N (sort by name) and F=0 (simple listing, no HTML), which can simplify the parsing significantly. – dash-o Nov 03 '19 at 21:19
  • @dash-o that was a typo on my part in scrubbing the path details, updated. Thanks for the suggestion on adding the Apache options. I can hit the URL in a browser and tried adding the parameters like so https://website.com/remote/path?C=N;F=0, but the output is the same. – Blaine Nov 03 '19 at 22:06

1 Answer


A small sed script can extract the file names from the listing; sort and head then pick the latest one:

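P=https://website.com/remote/path
# Fetch the HTML directory listing (same credentials and -k flag as in the question)
curl -u user:pass -k "$P/" > listing.txt
# Extract the timestamp-prefixed file names and keep the newest one
LATEST=$(sed -ne '/href=/{s@.*href=".*/\([0-9]\+_[^"]\+\).*@\1@p}' < listing.txt | sort -nr | head -1)
# Download that file
curl -u user:pass -k "$P/$LATEST" -o /local/path/file.zip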
dash-o
  • Thanks! I'm getting this when I run it, `sed: 1: "/href=/{s@.*href=".*/\( ...": bad flag in substitute command: '}'` – Blaine Nov 04 '19 at 13:46
  • Can you specify OS, and shell ? It works on Mint 19/Bash – dash-o Nov 04 '19 at 13:52
  • You're right! I threw it on the linux server and it works as expected. I'm running macOS 10.15.0/Bash locally and that's where I run into the issue. Any idea why? – Blaine Nov 04 '19 at 16:08
  • @Blaine I do not have MacOS. Can you get 'sed' & shell(bash?)version ? – dash-o Nov 04 '19 at 18:28
  • For macOS, can you try this simplified sed: `sed -ne '/href=/s@.*href=".*/\([0-9]\+_[^"]\+\).*@\1@p'` – dash-o Nov 04 '19 at 18:32
  • Here's the version of bash, 3.2.57(1)-release. Not exactly sure how to get the version of sed on a Mac. Tried the simplified version and it pulls the contents of listing.txt into output file rather than the actual file contents. – Blaine Nov 04 '19 at 23:43
  • @Blaine `sed --version` – dash-o Nov 05 '19 at 01:03
  • Tried `sed --version` but it doesn't seem like it's possible on a Mac, https://stackoverflow.com/questions/37639496/how-can-i-check-the-version-of-sed-in-os-x – Blaine Nov 05 '19 at 13:59
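
The `bad flag in substitute command: '}'` error reported in the comments is consistent with known differences between BSD sed (the one shipped with macOS) and GNU sed: BSD sed expects a `;` or a newline before a closing `}`, and `\+` is a GNU extension to basic regular expressions, which would also explain why the simplified command matched nothing and the second curl fetched the listing itself. Below is a minimal sketch of a variant that should behave the same on both, assuming the listing format shown in the question and a sed that accepts `-E` (BSD sed and recent GNU sed do). `latest_from_listing` is a hypothetical helper name; the URL, credentials, and local path are the question's placeholders.

```shell
#!/bin/sh
# Hypothetical helper: read an HTML directory listing on stdin and print
# the newest timestamp-prefixed file name found in an href attribute.
latest_from_listing() {
    # -E selects extended regular expressions, so + needs no backslash.
    # No { } grouping is used, which avoids the BSD sed '}' parsing issue.
    sed -nE 's@.*href="[^"]*/([0-9]+_[^"/]+)".*@\1@p' |
        sort -r |      # timestamps are fixed-width, so a plain reverse sort works
        head -n 1
}

# Usage (placeholders taken from the question):
#   P=https://website.com/remote/path
#   LATEST=$(curl -s -u user:pass -k "$P/" | latest_from_listing)
#   [ -n "$LATEST" ] && curl -u user:pass -k "$P/$LATEST" -o "/local/path/$LATEST"
```

The `[ -n "$LATEST" ]` guard in the usage lines skips the download when nothing matched, instead of saving the directory listing under the wrong name.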