Crontab executing python script executing wget makes log go crazy with thousands of lines

Question

I have a Python script that executes wget to download files. Cron is in charge of executing the script periodically and I redirect the output to a file so that I can see the log whenever I want:

* * * * * /mnt/scripts/cronjobs/downloader.sh >> /home/user/downloader.log 2>&1

If I run the script manually, wget shows its progress compactly on a single line, something like this:

filename.extension ===========================================================> 45MB/s 34s

That line is updated in place instead of producing thousands of lines, which is really nice.

But when I tail -f the log written by the cron job, I see something like this:

113650K .......... .......... .......... .......... ..........  3% 48.7M 96s
113700K .......... .......... .......... .......... ..........  3% 58.9M 96s
113750K .......... .......... .......... .......... ..........  3% 46.1M 96s
113800K .......... .......... .......... .......... ..........  3% 61.4M 96s
113850K .......... .......... .......... .......... ..........  3% 84.9M 96s
113900K .......... .......... .......... .......... ..........  3% 54.9M 96s
113950K .......... .......... .......... .......... ..........  3% 47.4M 96s
114000K .......... .......... .......... .......... ..........  3% 59.5M 96s
114050K .......... .......... .......... .......... ..........  3% 73.2M 96s
114100K .......... .......... .......... .......... ..........  3% 55.5M 96s
114150K .......... .......... .......... .......... ..........  3% 70.0M 96s
114200K .......... .......... .......... .......... ..........  3% 56.9M 96s
114250K .......... .......... .......... .......... ..........  3% 74.0M 96s
114300K .......... .......... .......... .......... ..........  3% 65.9M 96s
114350K .......... .......... .......... .......... ..........  3% 70.7M 96s
114400K .......... .......... .......... .......... ..........  3% 45.5M 96s
114450K .......... .......... .......... .......... ..........  3% 78.0M 96s

It keeps producing thousands of lines, so it's really difficult to keep track of the progress, and the log file grows huge.

Is there any way to avoid stacking up thousands of lines of this output? If so, how?

Thanks in advance!

Your question doesn't seem to be about Python at all. Why are you using `wget` instead of a regular Python HTTP fetch? — tripleee, May 16 '21 at 16:08
@tripleee I was already using `wget` before migrating to Python. I just decided to continue using `wget` with `subprocess` as I already knew that was working well. Also, downloading files and potentially resuming paused and/or stopped downloads on Python is way more annoying than doing so in a one line `wget`. — anonymous, May 16 '21 at 16:14

score 0 · Answer 1 · answered May 16 '21 at 16:05

wget produces that much progress output whenever it runs; it is just less obvious on a terminal, where the progress display keeps overwriting the same line.

The simple fix is to use wget -q.

tripleee

I don't really want to add `-q` because that would disable the output and I don't want that... When I execute the script directly `python3 script.py`, the output of `wget` does not generate thousands of lines. It updates the same line with some sort of progress. – anonymous May 16 '21 at 16:13
Yes, but it does that by overwriting the same line over and over. Try with `-nv` if `-q` is too quiet. Probably check the manual page before asking. – tripleee May 16 '21 at 16:14
This is what I'm using atm: `subprocess.call(['wget', '-nc', '-U', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', link_url, '-P', download_path, '-q', '--show-progress'])` – anonymous May 16 '21 at 16:23
Why do you have `--show-progress` if you don't want the progress output? – tripleee May 16 '21 at 16:31
I want an "all in one" solution. If I execute the script manually I want to see progress. If it's cron who executes the script and I want to take a look at the log, I want to be able to see it as well. I just don't want hundreds and hundreds of lines to display progress. – anonymous May 16 '21 at 16:33
That's not hard to do, but not really in scope for this question any longer. Probably see https://stackoverflow.com/questions/858623/how-to-recognize-whether-a-script-is-running-on-a-tty – tripleee May 16 '21 at 16:49
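
The "all in one" behaviour discussed in these comments could look roughly like the sketch below: check whether stderr (the stream wget writes its messages to) is a terminal, and pick the wget flags accordingly. The names `build_wget_command`, `download` and the `interactive` parameter are hypothetical; the wget options are the ones already mentioned in this thread.

```python
import subprocess
import sys

# User-Agent string copied from the asker's comment above.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36"
)


def build_wget_command(link_url, download_path, interactive=None):
    """Return the wget argument list for a terminal or for a log file.

    The function name and the `interactive` parameter are hypothetical.
    """
    if interactive is None:
        # wget reports to stderr, so that is the stream to test.
        # Under cron with ">> file 2>&1", stderr is a file and isatty() is False.
        interactive = sys.stderr.isatty()

    command = ["wget", "-nc", "-U", USER_AGENT, link_url, "-P", download_path]
    if interactive:
        # On a terminal: suppress the chatter, keep the single-line progress bar.
        command += ["-q", "--show-progress"]
    else:
        # In a log file: errors and basic information only, no progress output.
        command += ["-nv"]
    return command


def download(link_url, download_path):
    """Run wget and return its exit status (0 means success)."""
    return subprocess.call(build_wget_command(link_url, download_path))
```

Calling `download(link_url, download_path)` from the script would then show the progress bar when run by hand and keep the cron log short. If some progress in the log is still wanted, the wget manual also describes `--progress=dot:giga`, which prints far fewer dot lines than the default style; that could replace `-nv` in the non-interactive branch.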

score 0 · Answer 2 · answered May 18 '21 at 12:58

According to man wget, you can use the following options to log to a file rather than to stderr (the default behavior):

-o logfile Log all messages to logfile. The messages are normally reported to standard error.

-a logfile Append to logfile. This is the same as -o, only it appends to logfile instead of overwriting the old log file. If logfile does not exist, a new file is created.

Example usage: wget -a example.log https://www.example.com. Unlike with -q, you will still be able to investigate if something went wrong.
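
Since the question calls wget through subprocess, here is a minimal sketch of passing -a from Python. It combines -a with -nv (suggested in the comments on the first answer) so that the log keeps errors and basic information without the progress dots. The function names and the log path are hypothetical.

```python
import subprocess


def build_logged_wget_command(url, log_path):
    """Return a wget argument list that appends its messages to log_path."""
    # -nv: errors and basic information only, no progress output.
    # -a:  append messages to log_path instead of writing them to stderr.
    return ["wget", "-nv", "-a", log_path, url]


def fetch_with_log(url, log_path):
    """Run wget and return its exit status (0 means success)."""
    return subprocess.run(build_logged_wget_command(url, log_path)).returncode
```

For example, `fetch_with_log("https://www.example.com", "example.log")` would run the same command as the usage line above with -nv added.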
