
I have more than a million images in a directory. These images have been taken over the years, and I want to create a timelapse with mencoder on a per-day basis.

The filenames contain the date in the following format: image_2015-07-19_14_48_47.951.jpg, and the files' timestamps are also correct.

I'd like to run a script that classifies them by moving them into a directory structure as follows: yyyy/mm/dd/image_yyyy-mm-dd_hh_mm_ss.951.jpg

Also, more files are going to be added every minute, and this script is to be run daily to classify them into the directory structure, encode them to x264 with mencoder, and then zip the screenshots.

How could I achieve this using Python for example?

Cy.
  • "Files are going to be updated every minute". Will the files in the original directory be updated (in which case keep them in the original directory but just *copy* them to the new directory structure), or will the files be updated after being moved to their corresponding directories (in which case update only after the script is run once)? – pushkin Aug 02 '15 at 20:06
  • Sorry for not explaining properly. Files are going to be "added" every minute. – Cy. Aug 02 '15 at 20:16
  • What have you tried so far? All you need to do is obtain the filenames in the directory, iterate over each one, use string slicing to get the year, month, and day, check if that sequence of directories exists (create it if not), then move the file to the appropriate directory and go to the next one. – MattDMo Aug 02 '15 at 20:22
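
The slicing step described in the comment above could look roughly like this. It is only a sketch, assuming every filename starts with image_ followed by a yyyy-mm-dd date, as in the example from the question; the name day_subdir is hypothetical and does not come from either answer.

```python
import os
from datetime import datetime

PREFIX = "image_"  # assumed fixed prefix, taken from the example filename


def day_subdir(filename):
    """Return the relative directory yyyy/mm/dd for a filename, or None."""
    if not filename.startswith(PREFIX):
        return None
    # Slice out the ten characters after the prefix, e.g. '2015-07-19'
    date_part = filename[len(PREFIX):len(PREFIX) + 10]
    try:
        # Reject anything that is not a real calendar date
        datetime.strptime(date_part, "%Y-%m-%d")
    except ValueError:
        return None
    year, month, day = date_part.split("-")
    # os.path.join uses the separator of the current platform
    return os.path.join(year, month, day)


print(day_subdir("image_2015-07-19_14_48_47.951.jpg"))
print(day_subdir("notes.txt"))
```

Validating the date with strptime means a stray file such as image_backup_old.jpg is skipped rather than filed under a nonsense directory.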

2 Answers


Here's a solution. It takes the date from between the first two underscores of each filename, creates the matching year/month/day directory under target_dir if it doesn't exist yet, and moves the file there under its original name.

import os
import shutil

source_dir = r'C:\Users\test\Desktop\test'
target_dir = r'C:\Users\test\Desktop\test2'

def get_ymd(name, delim='_', delim2='-'):
    # Return [yyyy, mm, dd] from between the first two delimiters, or None
    start_idx = name.find(delim)
    if start_idx < 0:
        return None

    stop_idx = name.find(delim, start_idx + 1)
    if stop_idx < 0:
        return None

    return name[start_idx + 1:stop_idx].split(delim2)

def classify_images():
    if not os.path.isdir(source_dir):
        return

    for entry in os.listdir(source_dir):
        old_path = os.path.join(source_dir, entry)
        ymd = get_ymd(entry)
        if not os.path.isfile(old_path) or ymd is None or len(ymd) != 3:
            print("Couldn't classify %s" % entry)
            continue

        # build target_dir/yyyy/mm/dd and create it if it is missing
        new_dir = os.path.join(target_dir, ymd[0], ymd[1], ymd[2])
        if not os.path.isdir(new_dir):
            os.makedirs(new_dir)

        # move file, keeping its original name
        shutil.move(old_path, os.path.join(new_dir, entry))

if __name__ == '__main__':
    classify_images()
pushkin

You could do this fairly neatly in plain shell, without needing Python. This is Korn shell:

ls -1tr | while read f
do
  if [[ -f $f && $f == image_*-*-*_* ]]
  then
    echo $f | ( IFS=_ read prefix_unwanted ymd rest_unwanted ; echo $ymd ) | IFS=- read y m d

    [[ -n $y && -n $m && -n $d ]] && mkdir -p $y/$m/$d && mv $f $y/$m/$d
  fi
done

So the use of IFS (field separator) splits the filename twice - once to get the year-mon-day out as one and then again to split that part.

mkdir -p only creates the directory if it doesn't already exist, so that's fairly quick.

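In bash the second read will not work, because each part of a pipeline runs in a subshell and the variables it sets are lost afterwards, so use the variable substitution ${ymd//-/\/} instead: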

 echo $f | ( IFS=_ read prefix_unwanted ymd rest_unwanted ; mkdir -p ${ymd//-/\/} && mv $f ${ymd//-/\/} )

Bourne shell wouldn't manage the above - it can't do extended tests in [[ ]] with wildcard comparison.

The only thing that might be a problem is if whatever writes these JPEGs opens and closes them and then opens them again. mv moves the inode, so picking up a file that's still being written is only OK if you can be sure the writer does everything in one go and finishes, because the writer won't know that the file has been moved.
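
One way to reduce that risk, sketched here in Python because the question asks for it, is to leave alone any file that was modified very recently and pick it up on the next daily run. This is only a heuristic based on the modification time, not a guarantee that the writer has finished. The 120-second threshold and the names is_settled and settled_files are assumptions made for illustration.

```python
import os
import time


def is_settled(path, min_age_seconds=120, now=None):
    """Return True if the file was last modified at least min_age_seconds ago."""
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(path)) >= min_age_seconds


def settled_files(directory, min_age_seconds=120):
    """Return the names of regular files in directory that have not changed recently."""
    result = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip subdirectories and anything touched within the threshold
        if os.path.isfile(path) and is_settled(path, min_age_seconds):
            result.append(name)
    return result
```

A classification script could then iterate over settled_files(source_dir) instead of the raw directory listing.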

I'm assuming this is a Unix platform; it might not be so suitable on Windows.

Abe Crabtree
  • This looks like a pretty clean and neat solution indeed, but when I tried it I got: script.sh: 3: [[: missing ]] – Cy. Aug 03 '15 at 19:39
  • That's kornshell above. It should also be fine on bash. You're using the old Bourne shell - don't use that, use either korn or bash. If you're on linux you should use bash (what system are you using?) The $f == image_*-*-*_* bit won't work on sh either. Shell is often neater, simpler and quicker for making/moving/deleting files and directories. – Abe Crabtree Aug 04 '15 at 07:22
  • I am using bash. I tried to exec it as "sh script.sh" or as ./script.sh without luck though. – Cy. Aug 04 '15 at 21:36
  • Don't use sh; it won't do the [[ ]] form, which is needed for the wildcard comparison. bash will do that OK, but I think read is a bit different on bash from kornshell. I've just tried this on bash, and I don't quite understand bash's "read" limitation (plenty on the web, e.g. http://stackoverflow.com/questions/6883363/read-input-in-bash-inside-a-while-loop). Anyway, it'll work with a little change; I've altered the answer. – Abe Crabtree Aug 05 '15 at 07:27