
I want to use cron to do a daily download of portfolio info with 2 added complications:

  1. It needs a password
  2. I want the format I can get, when on the site myself, by clicking on "Download to a Spreadsheet"

If I use:

wget -U Chromium --user='e-address' --password='pass' \
    https://www.google.com/finance/portfolio > "file_"`date +"%d-%m-%Y"`+.csv

I get the response:

========================================================================= 
--2013-10-20 12:16:13--  https://www.google.com/finance/portfolio 
Resolving www.google.com (www.google.com)... 74.125.195.105, 74.125.195.103, 74.125.195.99, ... 
Connecting to www.google.com (www.google.com)|74.125.195.105|:443... connected. 
HTTP request sent, awaiting response... 200 OK 
Length: unspecified [text/html] 
Saving to: ‘portfolio’ 

[ <=>                                   ] 16,718      --.-K/s   in 0.04s   

2013-10-20 12:16:13 (431 KB/s) - ‘portfolio’ saved [16718] 
==========================================================================

It saves to a file called "portfolio" rather than where I asked it to ("file_"`date +"%d-%m-%Y"`+.csv). When I look at "portfolio" in the browser it says I need to sign in to my account, i.e. no notice is taken of the user and password information I've included.

If I add to the web address the string I get by hovering over the "Download to a Spreadsheet" link:

wget -U Chromium --user='e-address' --password='pass' \
    https://www.google.com/finance/portfolio?... > "file_"`date +"%d-%m-%Y"`+.csv

I get:

[1] 5175 
[2] 5176 
[3] 5177 
[4] 5178 
--2013-10-20 12:44:56--  https://www.google.com/finance/portfolio?pid=1 
Resolving www.google.com (www.google.com)... [2]   Done                    output=csv 
[3]-  Done                    action=view 
[4]+  Done                    pview=pview 
hg21@hg21-sda2:~$ 74.125.195.106, 74.125.195.103, 74.125.195.104, ... 
Connecting to www.google.com (www.google.com)|74.125.195.106|:443... connected. 
HTTP request sent, awaiting response... 200 OK 
Length: unspecified [text/html] 
Saving to: ‘portfolio?pid=1’ 

[ <=>                                   ] 16,768      --.-K/s   in 0.05s   

2013-10-20 12:44:56 (357 KB/s) - ‘portfolio?pid=1.1’ saved [16768] 

and at this point it hangs. The file it writes at this point (‘portfolio?pid=1’) is the same as the ‘portfolio’ file from the previous wget run.

If I then put in my password it continues:

pass: command not found 
[1]+  Done                    wget -U Chromium --user="e-address" --password='pass' https://www.google.com/finance/portfolio?pid=1 
[1]+  Done                    wget -U Chromium --user="e-address" --password='pass' https://www.google.com/finance/portfolio?pid=1 

Any help much appreciated.

StvnW

1 Answer


There are a couple of issues here:

1) wget is not saving to the correct filename

Use the -O option instead of > shell redirection.

Change > "file_"`date +"%d-%m-%Y"`.csv to -O "file_"`date +"%d-%m-%Y"`.csv

Tip: If you use date +"%Y-%m-%d", your files will naturally sort chronologically.
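For instance, a throwaway sketch (the -d flag used here to pin the dates is GNU date, so this assumes Linux):

```shell
# Build two ISO-style date strings and compare them as plain text.
a=$(date -d '2013-01-05' +%Y-%m-%d)
b=$(date -d '2013-10-20' +%Y-%m-%d)

# Lexicographic string order matches chronological order for %Y-%m-%d.
[ "$a" \< "$b" ] && echo "ISO dates sort chronologically"
```

With the day-first %d-%m-%Y layout, 05-01-2014 would sort before 20-10-2013, which is why the year-first form is handy for daily dumps.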

This is essentially a duplicate of wget command to download a file and save as a different filename

See also man wget for options.
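Putting the -O fix together with the cron use case, a daily crontab entry might look like the sketch below. The schedule, output path, and credentials are placeholders, and the query parameters are taken from the output shown in the question (pid=1, output=csv, action=view, pview=pview). One cron-specific gotcha: inside a crontab, every % must be escaped as \%, because cron otherwise treats it as a newline.

```shell
# Hypothetical crontab entry: fetch the portfolio daily at 06:00.
# crontab commands must be a single line, and '%' must be written as '\%'.
0 6 * * * wget -q -U Chromium --user='e-address' --password='pass' -O "$HOME/file_$(date +\%d-\%m-\%Y).csv" "https://www.google.com/finance/portfolio?pid=1&output=csv&action=view&pview=pview"
```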

2) wget is spawning multiple processes and "hanging"

You have &s in your URL which are being interpreted by the shell instead of being included in the argument passed to wget. You need to wrap the URL in quotation marks.

https://finance.google.com/?...&...&...

becomes

"https://finance.google.com/?...&...&..."
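To see the effect without any network access, here is a minimal sketch (example.com stands in for the real URL). The unquoted form loses everything from the first & onward, which is exactly why your session printed [1] 5175, [2] 5176, ... and the "Done ... output=csv" lines: each &-separated piece was launched as its own background job.

```shell
# Quoted: the shell passes the whole URL to the command as one argument.
quoted=$(sh -c 'echo "https://example.com/portfolio?pid=1&output=csv"')

# Unquoted: each '&' acts as a background-job separator, so the command
# only sees the text before the first '&'; the leftover pieces run as
# separate (here harmless, variable-assignment) commands.
unquoted=$(sh -c 'echo https://example.com/portfolio?pid=1&output=csv; wait')
```

Single quotes around the URL also work, and additionally suppress any $-expansion inside it.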
  • There may be other issues with establishing the secure connection and getting it to accept the username and password, but you'll need to resolve these first two issues and then see where that gets you. – StvnW Oct 21 '13 at 20:01