5

I am working on creating a video timelapse. All the photos I took are .jpg images shot at 4:3 aspect ratio. 2592x1944 resolution. I want them all to be 16:9 at 1920x1080.

I have written a little script to do this, but the process is not very fast. It took about 17 minutes for me to crop and resize 750 images. I have a total of about 300,000 to deal with, and will probably be doing them in batches of about 50,000. That is 18 hours 45 minutes per batch, and over 4.5 days of computing total.

So does anyone know a way I can speed up this program?

here is the bash script I have written:

#!/bin/bash  

mkdir cropped

for f in *.JPG
do
    convert $f -resize 1920x1440 -set filename:name '%t' cropped/'%[filename:name].JPG' #Resize Photo, maintain aspect ratio
    convert cropped/$f -crop 1920x1080+0+$1 -set filename:name '%t' cropped/'%[filename:name].JPG' #Crop to 16:9 aspect ratio, takes in $1 argument for where to begin crop
done

echo Cropping Complete!

Putting some echo commands before and after each line within the loop reveals that resizing takes much more time than cropping, which I guess is not surprising. I have tried using mogrify -path cropped -resize 1920x1440! $f in place of convert $f -resize, but there does not seem to be much of a difference in speed.

So, any way I can speed up the runtime on this?

BONUS POINTS if you can show me an easy way to give a simple indication of progress as the program runs (something like "421 of 750 files, 56.13% complete").

EXTRA BONUS POINTS if you can add a command to output a .mp4 file from each frame that can be edited in a software program like SONY Vegas. I have managed to make video files (.avi) using mencoder from these photos, but the resulting video won't work in any video editors I have tried.

Brian C
  • You could try forking each iteration of the for loop, but I'm not sure how much speed you would actually gain from it. It may be worth trying. To go about that, I'd make a Bash function that performs the two lines of code in your for loop, then call the function in the for loop and put an ampersand (`&`) after it to fork it. – Luke Mat Mar 15 '15 at 09:26

3 Answers

7

A few things spring to mind...

Firstly, don't start ImageMagick twice per image, once to resize and once to crop, when both operations can be done in one go. So, instead of your two convert commands, I would use just one:

convert image.jpg -resize 1920x1440 -crop 1920x1080+0+$1 cropped/image.jpg

Secondly, I don't see what you are doing with the -set option. It looks like something with the filename, but you can just do that in the shell.
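
As a rough sketch of what handling the filename in the shell could look like: bash parameter expansion can strip the directory and the extension, which is roughly what the %t escape produces. The function name output_path and the default output directory cropped are illustrative assumptions, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: map an input path to <outdir>/<basename>.JPG
# using only shell parameter expansion (no ImageMagick -set needed).
output_path() {
    local src=$1
    local outdir=${2:-cropped}   # assumed default output directory
    local name=${src##*/}        # drop any leading directories
    local base=${name%.*}        # drop the last extension
    printf '%s/%s.JPG\n' "$outdir" "$base"
}

output_path "IMG_0001.JPG"
output_path "shots/day 1/IMG_0002.jpg" out
```

Inside the loop this could be used as: convert "$f" ... "$(output_path "$f")".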

Thirdly, I would suggest you use GNU Parallel (I regularly process upwards of 65,000 images per day with it). It is easy to install and will ensure all those lovely CPU cores you paid for are kept busy. The easiest way to use it is, instead of running commands, just echo them and pipe them into parallel

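#!/bin/bash
mkdir -p cropped   # -p: no error if the directory already exists

# Print one convert command per image; parallel runs them concurrently.
# Use *.JPG instead if your files have an uppercase extension.
for f in *.jpg
do
   echo convert \"$f\" -resize 1920x1440 -crop 1920x1080+0+$1 cropped/\"$f\"
done  | parallel

echo Cropping Complete!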

Finally, if you want a progress meter, or indication of how much is done and what is left to do, use the --eta option (eta=Estimated Time of Arrival) to parallel and it tells you how many jobs and how much time is remaining.

When you get confident with parallel, you may end up running your entire process like this:

parallel --eta convert {} -resize 1920x1440 -crop 1920x1080+0+32 cropped/{} ::: *.jpg

I created 750 images the same size as yours and ran them this way, and it took my medium-spec iMac 55 seconds to resize and crop the lot - YMMV. Please add a comment and say how you got on - how long the processing time is with parallel.
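
If GNU Parallel is not available, a similar fan-out can be approximated with xargs -P, which is a different tool from the one used above. The following is only a sketch under assumptions: the function name crop_all, the CONVERT override variable, and the default of 4 simultaneous jobs are all made up for illustration.

```shell
#!/bin/bash
# Sketch: resize and crop every *.jpg in the current directory, several at
# a time, using xargs -P instead of GNU Parallel.
# CONVERT and crop_all are assumed names, not part of any tool.
CONVERT=${CONVERT:-convert}     # override to try this without ImageMagick

crop_all() {
    local offset=${1:-0}        # vertical crop offset, like $1 in the script
    local jobs=${2:-4}          # simultaneous processes (assumed default)
    local files=( *.jpg )
    # An unmatched glob stays literal, so check the first entry exists.
    [[ -e ${files[0]} ]] || { echo "no .jpg files found" >&2; return 1; }
    mkdir -p cropped
    # NUL-delimited names survive spaces; each worker handles one file.
    # In the inner sh: $0 = converter, $1 = offset, $2 = file name.
    printf '%s\0' "${files[@]}" |
        xargs -0 -n 1 -P "$jobs" sh -c '
            "$0" "$2" -resize 1920x1440 -crop "1920x1080+0+$1" "cropped/$2"
        ' "$CONVERT" "$offset"
}
```

Calling crop_all 32 8 would use a crop offset of 32 and up to 8 processes. Unlike parallel --eta, this gives no progress or time estimate.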

Mark Setchell
  • Thanks. This did speed it up. But not by much... See the various code implementations I tried here: http://pastebin.com/ZfqAN8Rk The fastest method (parallel in a single command) took 360 seconds to do 750 files. That is only an improvement of 33% from the original code above, which takes 544 seconds (my original 17 min estimate was incorrect). That's also 6.5x slower than your computer. I have a Quad-Core Intel i5 and 8GB of RAM. I might not have as fast a computer as you, but I think I should be able to manage better than this! – Brian C Mar 15 '15 at 22:10
  • Well 33% is a pretty good improvement! The i5 doesn't do HyperThreading like the i7 so my iMac essentially has 8 parallel threads of execution. I doubt there is much more improvement to be had on your machine - though you can use multiple machines with GNU Parallel if you have them. – Mark Setchell Mar 15 '15 at 22:23
  • Perhaps it is the best I can do. At least now it will take less than 7 hours to do 50k photos, so I can just leave it running overnight and have it done by the time I wake up. Still, I was hoping for a bit bigger improvement, since yours was 10x faster and the other guy said I should be able to get it 4x faster. I'll look around a bit into speeding it up more, but thanks! – Brian C Mar 15 '15 at 22:40
2

Firstly, to speed things up, don't echo to the screen; echo to a file, and if you want to know the status, read the file (easily done with the tail command). Seriously, this alone will be faster. However, it doesn't seem to be the real bottleneck of your program.

The main thing I can recommend is to run it in parallel. Is there any reason why you can't crop and resize pic #1000 before pic #4? If not, modify the script to take a parameter that specifies which files it should work on, then run it several times at once with different parameters. This should cut the time by a factor close to the number of CPU cores you have (minus some hard-drive I/O time).

Regarding your first bonus question, you can do a variant of this code:

TOTAL=`ls -1|wc -l` #get the total number of files (you can change this to the files parameter I mentioned above)
SOFAR=0 #How many files you've done so far
for f in *.JPG
do
    ((SOFAR++)) 
    echo "done so far $SOFAR out of $TOTAL"
done
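
To get the exact format asked for in the question ("421 of 750 files, 56.13% complete"), one possible variant is sketched below. It counts files with a glob and works around bash's integer-only arithmetic by computing hundredths of a percent. The function name progress_line is an illustrative assumption, and the convert call is left as a placeholder comment.

```shell
#!/bin/bash
# Sketch: print "N of TOTAL files, P% complete" while looping.
progress_line() {
    local done_count=$1 total=$2
    # bash only does integer math, so work in hundredths of a percent
    local hundredths=$(( done_count * 10000 / total ))
    printf '%d of %d files, %d.%02d%% complete\n' \
        "$done_count" "$total" $(( hundredths / 100 )) $(( hundredths % 100 ))
}

shopt -s nullglob          # an unmatched glob expands to nothing
files=( *.JPG )            # count via a glob instead of ls
total=${#files[@]}
sofar=0
for f in "${files[@]}"; do
    # ... the convert command for "$f" would go here ...
    sofar=$(( sofar + 1 ))
    progress_line "$sofar" "$total"
done
```

The percentage is truncated rather than rounded, which is usually fine for a progress display.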
in need of help
  • Don't parse the output of `ls` (especially here it might give wrong results) and don't use uppercase variable names. – gniourf_gniourf Mar 15 '15 at 09:04
  • I gave a schematic script that needs to be adapted to fit into the original script. Regarding the uppercase variable names, that's the convention for Bash script variables at my place of work. – in need of help Mar 15 '15 at 09:05
  • Thanks. Taking out the echo commands (when it took 17 minutes, I was calling it 4 times within each iteration of the loop) sped it up quite a bit. But using parallel did not actually speed it up very much at all. Here are the 4 different implementations I tried. http://pastebin.com/ZfqAN8Rk As you can see - it still takes 360 seconds to do 750 photos, which is only ~33% faster than the code above (w/o echo) and only 12% faster than doing a for loop with 1 convert command, and not using parallel. I am certainly not getting the 4x speed up I should expect using my 4 cores. – Brian C Mar 15 '15 at 22:17
  • Firstly, 33% sounds very good to me. Furthermore, there is more I/O in such an operation than I thought, so the hard drive becomes a main limiting factor for parallelism. I don't really have a way around that. – in need of help Mar 16 '15 at 08:54
0

Use the

-define jpeg:size=1920x1440

option along with -resize. If you have an older version of ImageMagick (sorry, I don't know exactly when the syntax changed), use the

-size 1920x1440

option along with -resize.
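
A sketch of how the hint might fit into the full command follows. The -define is a read-time setting, so it has to appear before the input filename. The function name crop_cmd and the DRY_RUN switch are illustrative assumptions, and the offset of 180 is just the value that centers a 1080-line crop in a 1440-line image.

```shell
#!/bin/bash
# Sketch: build the convert command with the JPEG size hint placed before
# the input file. crop_cmd and DRY_RUN are made-up names for illustration.
crop_cmd() {
    local src=$1 dst=$2 offset=${3:-0}
    local cmd=( convert -define jpeg:size=1920x1440 "$src"
                -resize 1920x1440 -crop "1920x1080+0+${offset}" "$dst" )
    if [[ -n ${DRY_RUN:-} ]]; then
        printf '%s\n' "${cmd[*]}"   # only show the command
    else
        "${cmd[@]}"                 # actually run ImageMagick
    fi
}

# (1440 - 1080) / 2 = 180 centers the crop vertically
DRY_RUN=1 crop_cmd IMG_0001.JPG cropped/IMG_0001.JPG 180
```

How much time the hint saves depends on the JPEG library and on the ratio between source and target size, so it is worth timing on a small batch first.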

Glenn Randers-Pehrson
  • This is a duplicate answer. I had already answered your duplicate question a few minutes ago! – Glenn Randers-Pehrson Mar 17 '15 at 22:01
  • Sorry about that! I could not find a good way to update my question based on the new information I had from the answers here (since the question is not so much "solved" as "could always be better"). I could not think of a better way to look for more info than to post again. Still, thanks for the reply! – Brian C Mar 18 '15 at 06:49