
I've been trying to write a small script to sort image files on my Linux server. I tried multiple solutions found all over StackExchange, but none of them meets my requirements.

Explanation:

Each photo_folder is filled with images (various extensions). Usually the images are directly in this folder. But sometimes, as in the example below, they are hidden in one or more photo_subfolder directories, and the file names are often the same in each of them, such as 1.jpg, 2.jpg...

Basically, I would like to move all image files from each photo_subfolder up to its photo_folder, with all duplicated filenames renamed before the files are merged together.

Example:

|parent_folder
|    |photo_folder
|    |    |photo_subfolder1
|    |    |    1.jpg
|    |    |    2.jpg
|    |    |    3.jpg
|    |    |photo_subfolder2
|    |    |    1.jpg
|    |    |    2.jpg
|    |    |    3.jpg
|    |    |photo_subfolder3
|    |    |    1.jpg
|    |    |    2.jpg
|    |    |    3.jpg

Expectation:

|parent_folder
|    |photo_folder
|    |    1_a.jpg
|    |    2_a.jpg
|    |    3_a.jpg
|    |    1_b.jpg
|    |    2_b.jpg
|    |    3_b.jpg
|    |    1_c.jpg
|    |    2_c.jpg
|    |    3_c.jpg

Note that the file names are just an example. They could be anything.

Thank you!

ben
  • You can use `rename` for that, example here https://stackoverflow.com/a/62720198/2836621 – Mark Setchell Nov 28 '22 at 08:07
  • Further example https://stackoverflow.com/a/54817709/2836621 – Mark Setchell Nov 28 '22 at 08:11
  • Hi Mark, thanks for your reply. I am already using rename in some of my bash scripts actually. But It's not working in that case. If you read my post a second time, it's more complicated than just batch renaming files. – ben Nov 28 '22 at 08:25
  • You are basically collapsing the directory name `photoset1` or `photoset2` or whatever into `a`, `b` or whatever. So if you appended the directory name to a list (without duplicates), you could use the index into the list instead of `a` or `b`. – Mark Setchell Nov 28 '22 at 08:43
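
The idea in the last comment could be sketched in bash roughly as follows. The helper names `index_to_suffix` and `add_suffix` are made up for illustration, and falling back to a plain number once the alphabet runs out is an assumption, not something stated in the thread.

```bash
#!/bin/bash

# Hypothetical helper: map a 0-based index to a letter (0 -> a, 1 -> b, ...).
# Past 'z' it falls back to the number itself (an assumption).
index_to_suffix() {
  local letters=abcdefghijklmnopqrstuvwxyz
  local i=$1
  if (( i < ${#letters} )); then
    printf '%s\n' "${letters:i:1}"
  else
    printf '%s\n' "$i"
  fi
}

# Hypothetical helper: put the suffix before the extension (1.jpg + a -> 1_a.jpg).
add_suffix() {
  local name=$1 suffix=$2
  if [[ $name == *.* ]]; then
    printf '%s_%s.%s\n' "${name%.*}" "$suffix" "${name##*.}"
  else
    printf '%s_%s\n' "$name" "$suffix"   # no extension: just append
  fi
}
```

For example, `add_suffix 1.jpg "$(index_to_suffix 2)"` prints `1_c.jpg`, so the index of a subfolder in a list of subfolders can stand in for the `a`, `b`, `c` of the expected result.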

1 Answer


You can replace the / of the subdirectories with another character, e.g. _ , and then cp/mv the original file to the parent directory.

I tried to recreate an example of your directory tree here. It is very simple, but I hope it can be adapted to your case. Note that I am using bash.

#!/bin/bash

bd=parent
mkdir -p "${bd}"

for i in $(seq 3); do
  mkdir -p "${bd}/photoset_${i}/subset_${i}"

  for j in $(seq 5); do

    touch "${bd}/photoset_${i}/${j}.jpg"
    touch "${bd}/photoset_${i}/${j}.png"
    touch "${bd}/photoset_${i}/subset_${i}/${j}.jpg"
    touch "${bd}/photoset_${i}/subset_${i}/${j}.gif"

  done
done

Here is the script that will cp the files from the subdirectories to the parent directory. Basically:

  1. find all the files recursively in the subdirectories and loop over them
  2. use sed to replace / with _ and store the result in a variable new_filepath (removing the initial parent_ is optional and shown further down)
  3. copy (or move) the file at the old filepath into parent with filename new_filepath

for xtension in jpg png gif; do
  # read NUL-delimited paths so names with spaces or newlines survive
  while IFS= read -r -d '' filepath; do

    # flatten the path, e.g. parent/photoset_1/1.jpg -> parent_photoset_1_1.jpg
    new_filepath=$(echo "${filepath}" | sed 's@/@_@g')
    cp "${filepath}" "${bd}/${new_filepath}"

  done < <(find "${bd}" -type f -name "*.${xtension}" -print0)
done

ls "${bd}"

If you also want to remove the leading parent_ from new_filepath, you can replace the new_filepath assignment above with:

new_filepath=$(echo "${filepath}" | sed 's@/@_@g' | sed "s/^${bd}_//")
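
The same renaming can also be done without sed, using only bash parameter expansion. The sketch below is one way to write it; the function name `flatten_path` is hypothetical, and it assumes the file path starts with the base directory followed by a slash.

```bash
#!/bin/bash

# flatten_path FILEPATH BASEDIR  (hypothetical helper, not part of the script above)
# Strips the leading "BASEDIR/" and replaces every remaining / with _.
flatten_path() {
  local filepath=$1 bd=$2
  local rel=${filepath#"${bd}/"}   # parent/photoset_1/1.jpg -> photoset_1/1.jpg
  printf '%s\n' "${rel//\//_}"     # photoset_1/1.jpg -> photoset_1_1.jpg
}
```

Because the prefix is stripped as a literal string before the slashes are replaced, this variant should also work when `bd` is an absolute path such as `/home/mysite/public_html/ftp-album-sorting`.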

I assumed that you define all the possible extensions in the script. Otherwise, to find all the extensions in the directory tree, you can use the following snippet from a previous answer:

find . -type f -name '*.*' | sed 's|.*\.||' | sort -u
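
If the collection mixes upper and lower case (for instance `.JPG` and `.jpg`), a small wrapper around that snippet can fold the case before deduplicating. This is only a sketch; `list_extensions` is a made-up name, and it takes the directory to scan as its single argument.

```bash
#!/bin/bash

# list_extensions DIR  (hypothetical helper)
# Prints each distinct file extension under DIR once, in lower case.
list_extensions() {
  find "$1" -type f -name '*.*' \
    | sed 's|.*\.||' \
    | tr '[:upper:]' '[:lower:]' \
    | sort -u
}
```

The output can then be reviewed by hand to decide which extensions belong in the `for xtension in ...` list.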

LeonaRdo
  • `for f in $(find …)` is an anti-pattern in bash. It is just as bad as `ls` in this context. See [Why you shouldn't parse the output of ls(1)](https://mywiki.wooledge.org/ParsingLs). Also, `new_filepath=$(echo ${filepath} | sed s@/@_@g)` is very inefficient. Simply `new_filepath=${filepath//'/'/_}` should do the trick. – M. Nejat Aydin Nov 28 '22 at 12:31
  • @M.NejatAydin Thank you for pointing out the problems with the `for f in $(find ...)`. I edited the solution accordingly. Instead I left the `sed s@/@_@g)` because the alternative version `new_filepath=${filepath////_}` is very difficult to read. Readability is also very important. – LeonaRdo Nov 28 '22 at 16:01
  • Hi LeonaRdo, thanks for your reply. That's complicated because I don't know how many subsets are in each folder. It's always a different pattern. (I have to sort 2000 photo folders with potential subsets inside). Each subfolder set (if any) has to be renamed differently before all files to be merged to the parent folder. – ben Nov 29 '22 at 06:38
  • hello @ben. If I understand your situation correctly, it shouldn't be a problem. (1) The directory tree in my example is simple; however, `find` will travel the whole directory tree and look for files with the given extensions. (2) Since the new filename will reflect the initial location of the file, all the new filenames will be different, and you will know their origin. Try to run the second script (after assigning parent `bd=parent`) prepending an `echo` to the `cp` command to see how it will look, so just replace the `cp` line with `echo cp "${filepath}" "${bd}/${new_filepath}"` – LeonaRdo Nov 29 '22 at 09:21
  • Thanks LeonaRdo. The script works but all pictures in subfolders are copied to the folder below their main photoset folders. Also, filenames are extremely long. I think it's gonna be an issue. – ben Dec 01 '22 at 20:17
  • hello @ben. If you run the whole example in one script you will see that all the files are copied in the `parent` dir, so I advise to double check your version of the script. For the long file names, that is just one possibility that I came up with to automatically rename the files in a unique way. Maybe you can try the solutions proposed above by @Mark Setchell. – LeonaRdo Dec 01 '22 at 20:40
  • I didn't edit anything. Just added `bd=/home/mysite/public_html/ftp-album-sorting` But what do you mean with the parent dir? The photoset folder itself? touch `"${bd}/photoset_${i}/subset_${i}/${j}.jpg"` Because `{bd}` is right before photoset folder in your tree. `photoset` is actually where I want my photo files to be merged, from `subset`. – ben Dec 01 '22 at 21:18
  • @LeonaRdo I modified my original post to make it more clear. My mistake. – ben Dec 01 '22 at 22:53
  • @ben the directory I indicated as `bd=parent` (base directory) is the one which (1) contains all the subdirectories where the pics are, and (2) where you want all the renamed pics to end up into. So in your case it is probably `bd=photo_folder`. The `parent_folder` should contain both the `photo_folder` and the script to be run – LeonaRdo Dec 02 '22 at 04:20
  • @LeonaRdo I would like to set the path as `bd=photo_folder` but all folders have different names. I tried `bd=/home/mysite/public_html/ftp-album-sorting/*` but it's not working – ben Dec 02 '22 at 11:01
  • let's say that your directory containing all the subsets with the pics is called `allpics`, and is a subdirectory of `mydir` (so `mydir/allpics`). Then in terminal you go to `mydir`, set `bd=allpics` and then run the `for xtension...` command above – LeonaRdo Dec 02 '22 at 11:33
  • In my first post, my example shows only 1 photoset with subsets. Actually, I've got more than 10.000 photosets containing their own subsets. If I do what you just mentioned, I need to change `bd=mydir/photoset` 10.000 times to process each photoset one by one. `mydir/allpics` is relative. – ben Dec 02 '22 at 19:35
  • And the script works well actually. But it has to be run for each photoset individually. – ben Dec 02 '22 at 19:45
  • Hi @ben. Your collection of images looks quite massive and it's difficult to understand how it is organized and distributed. I would advise to look for professional consulting at this point. – LeonaRdo Dec 04 '22 at 18:22
  • Hi @LeonaRdo, If you look at my example, I've got more than one `photo_folder` (10.000). The script works only to process every `photo_folder` by specifying it: `bd=photo_folder`. I am just looking to get the script to process in a loop all the photo_folder. – ben Dec 06 '22 at 12:36
  • If each photo_folder is identified by a unique name, you can move everything inside a common directory and run the command on top of that. If there are repetitions in the names of the photo_folders, you need to solve that first – LeonaRdo Dec 06 '22 at 16:23
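
One way to avoid setting `bd` by hand for every photo_folder is to wrap the answer's loop in a function and call it once per directory. The following is a rough sketch under several assumptions: every photo_folder sits directly under a single root directory, only jpg, png and gif files matter, files are moved rather than copied, and the installed `find` and `mv` support `-mindepth`, `-iname` and `-n` (GNU and BSD versions do). The function names `merge_subfolders` and `process_all` are invented for this example.

```bash
#!/bin/bash

# merge_subfolders PHOTO_FOLDER  (hypothetical helper)
# Moves every image found in a subfolder up into PHOTO_FOLDER,
# naming it after its former relative path with / replaced by _.
merge_subfolders() {
  local bd=${1%/} filepath rel target
  while IFS= read -r -d '' filepath; do
    rel=${filepath#"${bd}/"}            # e.g. photo_subfolder1/1.jpg
    target="${bd}/${rel//\//_}"         # e.g. photo_subfolder1_1.jpg
    mv -n -- "${filepath}" "${target}"  # -n: never overwrite an existing file
  done < <(find "${bd}" -mindepth 2 -type f \
             \( -iname '*.jpg' -o -iname '*.png' -o -iname '*.gif' \) -print0)
}

# process_all ROOT  (hypothetical helper)
# Runs merge_subfolders on each directory directly under ROOT.
process_all() {
  local root=${1%/} dir
  for dir in "${root}"/*/; do
    [[ -d ${dir} ]] || continue         # skip when the glob matches nothing
    merge_subfolders "${dir}"
  done
}
```

A call such as `process_all /home/mysite/public_html/ftp-album-sorting` would then handle every photo_folder in one pass. Files already sitting directly in a photo_folder are left alone because of `-mindepth 2`, and the emptied subfolders are not removed. Replacing `mv -n --` with `echo mv` first gives a dry run, in the same spirit as the `echo cp` suggestion earlier in the thread.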