32

I have several folders, each containing between 15,000 and 40,000 photos. I want each of these split into subfolders, each holding 2,000 files.

What is a quick way to do this that creates each folder it needs on the fly and moves all the files?

Currently I can only find how to move the first x items in a folder into a pre-existing directory. To use this on a folder with 20,000 items... I would need to create 10 folders manually and run the command 10 times.

ls -1 | sort -n | head -2000 | xargs -i mv "{}" /folder/

I tried putting it in a for loop, but I am having trouble getting it to create the folders properly with mkdir. Even once I get around that, I need it to create a folder only at the start of each new group (every 2,000th file); instead it wants to make a new folder for each file.

So... how can I easily move a large number of files into folders of an arbitrary number of files in each one?

Any help would be very... well... helpful!

Brian C

9 Answers

46

Try something like this:

for i in `seq 1 20`; do mkdir -p "folder$i"; find . -maxdepth 1 -type f | head -n 2000 | xargs -i mv "{}" "folder$i"; done

Full script version:

#!/bin/bash

dir_size=2000
dir_name="folder"

# number of destination directories needed for the files in the current directory
n=$((`find . -maxdepth 1 -type f | wc -l`/$dir_size+1))

for i in `seq 1 $n`
do
    mkdir -p "$dir_name$i"
    # move the next batch of (up to) $dir_size files into the new directory
    find . -maxdepth 1 -type f | head -n $dir_size | xargs -i mv "{}" "$dir_name$i"
done

For dummies:

  1. create a new file: vim split_files.sh
  2. update the dir_size and dir_name values to suit your needs
    • note that the dir_name will have a number appended
  3. navigate into the desired folder: cd my_folder
  4. run the script: sh ../split_files.sh
tmp
  • Thanks for the answer. I also needed to read and write the data from and to subfolders called irr. So mine look like this: `for i in \`seq 1 10\`; do mkdir -p "folder$i/irr"; find . -maxdepth 2 -type f | head -n 4000 | xargs -i mv "{}" "folder$i/irr"; done` – cgl Nov 14 '15 at 12:28
  • For macOS there is a tweak needed for xargs: `find . -maxdepth 1 -type f | head -n $dir_size | xargs -J {} mv {} "$dir_name$i"` – CharlesC Apr 21 '19 at 16:10
  • Getting error: xargs: illegal option -- i – Gaurav Agrawal Apr 14 '23 at 05:38
23

This solution worked for me on MacOS:

i=0; for f in *; do d=dir_$(printf %03d $((i/100+1))); mkdir -p $d; mv "$f" $d; let i++; done

It creates subfolders of 100 files each; adjust the 100 in $((i/100+1)) to use a different batch size (e.g. 2000).

Giovanni Benussi
8

This solution can handle names with whitespace and wildcards and can be easily extended to support less straightforward tree structures. It will look for files in all direct subdirectories of the working directory and sort them into new subdirectories of those. New directories will be named 0, 1, etc.:

#!/bin/bash

maxfilesperdir=20

# loop through all top level directories:
while IFS= read -r -d $'\0' topleveldir
do
        # enter top level subdirectory:
        cd "$topleveldir"

        declare -i filecount=0 # number of moved files per dir
        declare -i dircount=0  # number of subdirs created per top level dir

        # loop through all files in that directory and below
        while IFS= read -r -d $'\0' filename
        do
                # whenever file counter is 0, make a new dir:
                if [ "$filecount" -eq 0 ]
                then
                        mkdir "$dircount"
                fi

                # move the file into the current dir:
                mv "$filename" "${dircount}/"
                filecount+=1

                # whenever our file counter reaches its maximum, reset it, and
                # increase dir counter:
                if [ "$filecount" -ge "$maxfilesperdir" ]
                then
                        dircount+=1
                        filecount=0
                fi
        done < <(find -type f -print0)

        # go back to top level:
        cd ..
done < <(find -mindepth 1 -maxdepth 1 -type d -print0)

The find -print0/read combination with process substitution has been stolen from another question.

It should be noted that simple globbing can handle all kinds of strange directory and file names as well. It is, however, not easily extensible to multiple levels of directories.
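For illustration, here is a minimal sketch of what such a glob-based variant could look like for a single directory level, reusing the maxfilesperdir/filecount/dircount idea from the script above (the batch size of 20 is just the example value carried over):

#!/bin/bash

maxfilesperdir=20
declare -i filecount=0 # number of files moved into the current subdir
declare -i dircount=0  # number of subdirs created so far

# the glob copes with spaces and other odd characters, because "$f" is
# always expanded as a single, properly quoted word
for f in *; do
        [ -f "$f" ] || continue  # skip anything that is not a regular file

        # whenever the file counter is 0, start a new numbered subdir:
        if [ "$filecount" -eq 0 ]; then
                mkdir -p "$dircount"
        fi

        mv -- "$f" "$dircount/"
        filecount+=1

        # once the subdir is full, reset the counter and move on:
        if [ "$filecount" -ge "$maxfilesperdir" ]; then
                dircount+=1
                filecount=0
        fi
done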

Michael Jaros
6

The code below assumes that the filenames do not contain linefeeds, spaces, tabs, single quotes, double quotes, or backslashes, and that filenames do not start with a dash. It also assumes that IFS has not been changed, because it uses while read instead of while IFS= read, and because variables are not quoted. Add setopt shwordsplit in Zsh.

i=1;while read l;do mkdir $i;mv $l $((i++));done< <(ls|xargs -n2000)

The code below assumes that filenames do not contain linefeeds and that they do not start with a dash. -n2000 takes 2000 arguments at a time and {#} is the sequence number of the job. Replace {#} with '{=$_=sprintf("%04d",$job->seq())=}' to pad numbers to four digits.

ls|parallel -n2000 mkdir {#}\;mv {} {#}

The command below assumes that filenames do not contain linefeeds. It uses the implementation of rename by Aristotle Pagaltzis which is the rename formula in Homebrew, where -p is needed to create directories, where --stdin is needed to get paths from STDIN, and where $N is the number of the file. In other implementations you can use $. or ++$::i instead of $N.

ls|rename --stdin -p 's,^,1+int(($N-1)/2000)."/",e'
nisetama
  • Using `ls` comes with a lot of problems. I adapted your example a little bit: `find . -type f -print0 | parallel -0 -n2000 mkdir "dir_{#}"\;mv {} "dir_{#}"`. Works equally fine for me. – Pascal Apr 28 '18 at 20:52
  • @Pascal No it doesn't, unless you have filenames with linefeeds, or you have a weirdo version of `ls` (like busybox `ls`) which replaces characters in filenames with question marks even when the output is not to a terminal. – nisetama Mar 28 '19 at 03:34
4

I would go with something like this:

#!/bin/bash
# outnum generates the name of the output directory
outnum=1
# n is the number of files we have moved
n=0

# Go through all JPG files in the current directory
for f in *.jpg; do
   # Create new output directory if first of new batch of 2000
   if [ $n -eq 0 ]; then
      outdir=folder$outnum
      mkdir $outdir
      ((outnum++))
   fi
   # Move the file to the new subdirectory
   mv "$f" "$outdir"

   # Count how many we have moved to there
   ((n++))

   # Start a new output directory if we have sent 2000
   [ $n -eq 2000 ] && n=0
done
Mark Setchell
  • Shouldn't `$n -eq 2000` be `$n -eq 1999` since start is zero-based? Regardless, I like this solution, thanks. – ksclarke May 01 '19 at 18:51
2

The answer above is very useful, but there is a very important point for the Mac (10.13.6) terminal. Because the xargs "-i" argument is not available there, I changed the command from the above to the one below.

ls -1 | sort -n | head -2000 | xargs -I '{}' mv {} /folder/

Then I use the shell script below (adapted from tmp's answer):

#!/bin/bash

dir_size=500
dir_name="folder"
n=$((`find . -maxdepth 1 -type f | wc -l`/$dir_size+1))
for i in `seq 1 $n`;
do
    mkdir -p "$dir_name$i";
    find . -maxdepth 1 -type f | head -n $dir_size | xargs -I '{}' mv {} "$dir_name$i"
done
冯剑龙
2

This is a tweak of Mark Setchell's answer.

Usage:

bash splitfiles.bash $PWD/directoryoffiles splitsize

It doesn't require the script to be located in the same directory as the files to be split, it operates on all files rather than just the .jpg ones, and it allows you to specify the split size as an argument.

#!/bin/bash
# outnum generates the name of the output directory
outnum=1
# n is the number of files we have moved
n=0

if [ "$#" -ne 2 ]; then
    echo Wrong number of args
    echo Usage: bash splitfiles.bash $PWD/directoryoffiles splitsize
    exit 1
fi

# Go through all files in the specified directory
for f in "$1"/*; do
   # Create new output directory if first of new batch
   if [ $n -eq 0 ]; then
      outdir="$1/$outnum"
      mkdir "$outdir"
      ((outnum++))
   fi
   # Move the file to the new subdirectory
   mv "$f" "$outdir"

   # Count how many we have moved to there
   ((n++))

   # Start a new output directory if current new dir is full
   [ $n -eq $2 ] && n=0
done
ezekiel
  • Create all directories upfront, you know how many you need. Saves you from having to do that check in the loop... Maybe insignificant compared to the filesystem move operations though. Upvoted. – MrR Jun 21 '22 at 21:46
  • This is the one that worked flawlessly for me. Thanks! – Denis.Kipchakbaev Nov 23 '22 at 10:52
1

Can be run directly in the terminal:

i=0; 
for f in *; 
do 
    d=picture_$(printf %03d $((i/2000+1))); 
    mkdir -p $d; 
    mv "$f" $d; 
    let i++; 
done

This script will move all files within the current directory into picture_001, picture_002, and so on. Each newly created folder will contain 2,000 files.

  • 2000 is the chunk size (number of files per folder)
  • %03d is the zero-padded numeric suffix (001, 002, 003); adjust the width as needed
  • picture_ is the folder name prefix
  • the script moves every file in the current directory into one of these newly created subdirectories
ZenithS
0

You'll certainly have to write a script for that. Hints of things to include in your script:

First, count the number of files in your source directory:

NBFILES=$(find . -type f -name '*.jpg' | wc -l)

Divide this count by 2000 and add 1 to determine the number of directories to create:

NBDIR=$(( $NBFILES / 2000 + 1 ))

Finally, loop through your files and move them across the subdirectories. You'll have to use two nested loops: one to pick and create the destination directory, the other to move 2000 files into that subdirectory, then create the next subdirectory, move the next 2000 files into it, and so on.
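A rough sketch of those two nested loops, built only from the hints above (the dir_ prefix is a placeholder of my own choosing, filenames are assumed not to contain newlines, and -maxdepth 1 is added so files already moved into the new subdirectories are not picked up again):

#!/bin/bash

# count the JPG files in the source directory
NBFILES=$(find . -maxdepth 1 -type f -name '*.jpg' | wc -l)
# number of destination directories to create
NBDIR=$(( NBFILES / 2000 + 1 ))

# outer loop: pick and create the destination directory
for (( d = 1; d <= NBDIR; d++ )); do
    mkdir -p "dir_$d"
    # inner loop: move the next batch of up to 2000 files into it
    find . -maxdepth 1 -type f -name '*.jpg' | head -n 2000 | \
        while IFS= read -r f; do
            mv "$f" "dir_$d/"
        done
done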

jderefinko