I'm processing a large collection of born-digital materials for an archive but I'm being slowed down by the fact that I'm having to manually create directories and find and move files from multiple directories into newly created directories.

Problem: I have three directories containing three different types of content derived from different sources:

-disk_images
-evidence_photos
-document_scans

The disk images were created from CDs that come with cases and writing on the cases that need to be accessible and preserved for posterity so pictures have been taken of them and loaded into the evidence photos folder with a prefix and inventory number. Some CDs came with indexes on paper and have been scanned and OCR'd and loaded into the document scan folder with a prefix and an inventory number. Not all disk images have corresponding photos or scans so the inventory numbers in those folders are not linear.

I've been trying to think of ways to write a script that would look through each of these directories and move files with the same suffix (not extension) into a newly created directory for each inventory number, but this is way beyond my expertise. Any help would be much appreciated, and I will be more than happy to clarify if need be.

Examples of file names:
-disk_images/ahacd_001.iso
-evidence_photos/ahacd_case_001.jpg
-document_scans/ahacd_notes_001.pdf

Potential new directory name: ahacd_001

All files with inventory number 001 would then need to end up in ahacd_001 (001 being the inventory number).

warde

2 Answers

Here is a skeleton of a program that iterates through your 3 starting folders and splits your file names:

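for folder in ./*/   # iterate over the directories (the ./ prefix keeps names starting with - safe)
do
  echo "processing folder $folder"
  for file in "$folder"*   # iterate over the entries in the directory
  do
    [ -f "$file" ] || continue   # skip anything that is not a regular file
    name=${file##*/}             # file name without the directory part
    echo "$name"
    # split the file name with awk and get the first part ('ahacd') and the last ('002')
    target=$(echo "$name" | awk -F '.' '{print $1}' | awk -F '_' '{print $1 "_" $NF}')
    echo "$target"

    # when you are satisfied that your file splitting works, uncomment these two lines:
    # mkdir -p -- "$target"       # create your folder
    # mv -- "$file" "$target"/    # move the file
  done
done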

A few pointers on splitting the filenames: Get last field using awk substr
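As an alternative to the awk pipeline, the same split can be sketched with the shell's own parameter expansion, which avoids starting extra processes for every file. This is only a sketch: the helper name inventory_dir is made up for illustration, and it assumes every file follows the prefix_..._number.ext pattern from the question.

```shell
# Hypothetical helper: derive the target directory name from a file path.
# The ( ... ) body runs in a subshell, so the variables stay local to the call.
inventory_dir() (
  name=${1##*/}        # strip any leading directories
  name=${name%%.*}     # strip everything from the first dot onwards
  prefix=${name%%_*}   # text before the first underscore, e.g. 'ahacd'
  number=${name##*_}   # text after the last underscore, e.g. '001'
  printf '%s_%s\n' "$prefix" "$number"
)

inventory_dir disk_images/ahacd_001.iso
inventory_dir evidence_photos/ahacd_case_001.jpg
inventory_dir document_scans/ahacd_notes_001.pdf
```

All three calls should print ahacd_001, which is the directory name the question asks for.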

Gorille
  • [Shellcheck](https://www.shellcheck.net/) identifies several problems with the code. – pjh Feb 21 '19 at 19:51

First I would like to say that file or directory names starting with - are a bad idea, even if they are allowed.
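The reason is that most commands treat an argument beginning with - as an option rather than a name. A small sketch of the two usual workarounds, run in a throwaway directory (the demo variable is just for illustration):

```shell
demo=$(mktemp -d)   # scratch directory, so nothing real is touched
(
  cd "$demo" || exit 1
  mkdir -- -disk_images                # "--" ends option parsing, so the name is taken literally
  touch ./-disk_images/ahacd_001.iso   # a "./" prefix works too: the argument no longer starts with "-"
  ls -- -disk_images
)
```

Without the -- or the ./ prefix, mkdir and ls would try to interpret -disk_images as a bundle of option letters.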

Test case:

mkdir -p /tmp/test/{-disk_images,-evidence_photos,-document_scans}
cd /tmp/test
touch -- "-disk_images/ahacd_001.iso"       #create your three test files
touch -- "-evidence_photos/ahacd_case_001.jpg"
touch -- "-document_scans/ahacd_notes_001.pdf"
find -type f|perl -nlE \
'm{.*/(.*?)_(.*_)?(\d+)\.}&&say qq(mkdir -p target/$1_$3; mv "$_" target/$1_$3)'

...will not move the files; it just shows you which commands it thinks should be run.

If those commands are what you want, run them by appending | bash to the same find | perl command:

find -type f|perl -nlE \
'm{.*/(.*?)_(.*_)?(\d+)\.}&&say qq(mkdir -p target/$1_$3; mv "$_" target/$1_$3)' \
| bash

find -ls   #to see the result

All three files are now in the target/ahacd_001/ subfolder.
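If Perl is not available, the same preview-first, run-second workflow can be sketched in plain shell. The function name sort_by_inventory and its run argument are hypothetical, not part of any tool. The sketch assumes file names contain no newlines, and it skips any file whose name does not end in _digits before the first dot.

```shell
# Hypothetical helper: sort_by_inventory SRC TARGET [run]
# Prints the planned commands; only executes them when the third argument is "run".
sort_by_inventory() (
  src=$1 target=$2 mode=${3:-preview}
  # Ignore anything already inside TARGET so a second run does not re-move files.
  find "$src" -type f ! -path "$target/*" | while IFS= read -r path; do
    name=${path##*/}      # drop the directory part
    name=${name%%.*}      # drop everything from the first dot
    number=${name##*_}    # text after the last underscore
    case $name in *_*) ;; *) continue ;; esac            # needs at least one underscore
    case $number in ''|*[!0-9]*) continue ;; esac        # inventory number must be all digits
    dir=$target/${name%%_*}_$number
    if [ "$mode" = run ]; then
      mkdir -p -- "$dir" && mv -- "$path" "$dir"/
    else
      printf 'mkdir -p "%s"; mv "%s" "%s"/\n' "$dir" "$path" "$dir"
    fi
  done
)

# Preview only:      sort_by_inventory /tmp/test /tmp/test/target
# Actually move:     sort_by_inventory /tmp/test /tmp/test/target run
```

Keeping the preview as the default mirrors the approach above: nothing is moved until the printed commands have been checked.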

Kjetil S.