
The current file organization looks like this:

Species_name1.asc
Species_name1.csv
Species_name1_Averages.csv
...
...
Species_name2.asc
Species_name2.csv
Species_name2_Averages.csv

I need to figure out a script that can create the new directories with the names (Species_name1, Species_name2... etc) and that can move the files from the base directory into the appropriate new directories.

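import os
import glob
import shutil

base_directory = os.getcwd()  # stands in for [CURRENT_WORKING_DIRECTORY]

with open("folder_names.txt", "r") as new_folders:
    for line in new_folders:
        name = line.strip()  # drop the trailing newline
        if name:
            # os.makedirs (not os.mkdirs) creates the folder; exist_ok avoids errors on reruns
            os.makedirs(os.path.join(base_directory, name), exist_ok=True)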

Above is an example of what I can think of doing when creating new directories within the base directory.

I understand that I will have to use tools from the os, shutil, and/or glob modules if I do this in Python. However, the exact script is escaping me and my files remain unorganized. If there is any advice you can provide to help me complete this small task, I will be most grateful.

Also there are many file types and suffixes within this directory but the (species_name?) portion is always consistent.

Below is the expected hierarchy:

Species_name1
-- Species_name1.asc
-- Species_name1.csv
-- Species_name1_Averages.csv
Species_name2
-- Species_name2.asc
-- Species_name2.csv
-- Species_name2_Averages.csv

Thank you in advance!

Mr. ReLu
  • Is a Perl solution acceptable ? – Gilles Quénot Jul 01 '20 at 21:37
  • Can you edit your post and add expected structure of dirs/files ? – Gilles Quénot Jul 01 '20 at 21:39
  • It is, however that requires me learning perl. Are there any perl friendly modules for python? thank you for your response! – Mr. ReLu Jul 01 '20 at 21:40
  • Shell will be sufficient. Please clarify your post – Gilles Quénot Jul 01 '20 at 21:41
  • Understood, I will edit it immediately showing the expected hierarchy. – Mr. ReLu Jul 01 '20 at 21:42
  • I believe you can do this with dictionaries, with the `key` as the folder name and the `value` as the list of files. You'd need to find a way to build this dictionary from the patterns you want to split on; a regex can easily do that with `findall()` and the following character. Then create a set out of the matches, where each value becomes a key, and use `.startswith()` in a list comprehension to collect all the file names for each key, giving the dictionary I suggested at the beginning. Finally, create the directory named by the key and use shutil to move the values (files) into it. – Celius Stingher Jul 01 '20 at 21:42
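
One possible reading of the dictionary suggestion above, as a minimal sketch rather than a tested solution: group file names by a regex-matched prefix, then create one folder per key and move the files in. The pattern `Species_name\d+` is an assumption taken from the example listing in the question, and the helper names `group_by_prefix` and `organize` are hypothetical.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumption: every species prefix looks like "Species_name<digits>",
# as in the question's example listing. Adjust the pattern for real names.
PREFIX_RE = re.compile(r"Species_name\d+")


def group_by_prefix(filenames):
    """Map each matched prefix (the future folder name) to its file names."""
    groups = defaultdict(list)
    for name in filenames:
        match = PREFIX_RE.match(name)  # only matches at the start of the name
        if match:
            groups[match.group(0)].append(name)
    return dict(groups)


def organize(base_dir):
    """Create one folder per prefix in base_dir and move matching files into it."""
    base = Path(base_dir)
    names = [p.name for p in base.iterdir() if p.is_file()]
    for prefix, files in group_by_prefix(names).items():
        target = base / prefix
        target.mkdir(exist_ok=True)  # no error if the folder already exists
        for name in files:
            shutil.move(str(base / name), str(target / name))
```

Calling `organize(".")` from the base directory would apply this to the current folder. Files whose names do not start with the assumed pattern are left where they are.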

2 Answers


Like this, using simple shell tools:

find . -type f -name '*Species_name*' -exec bash -c '
    dir=$(grep -oP "Species_name\d+" <<< "$1")
    echo mkdir -p "$dir"
    echo mv "$1" "$dir"
' -- {} \;

Drop the echo commands once the output looks right to you.

Gilles Quénot
  • 1
    Thank you very much I will attempt this on shell, and get back to you when finished. Thank you again! – Mr. ReLu Jul 01 '20 at 21:47

Assuming all your asc files are named like in your example:

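from os import makedirs
from os.path import join
from shutil import move
from glob import glob

fs = []
for file in glob("*.asc"):
    f = file.split('.')[0]  # file name without its extension
    fs.append(f)
    makedirs(f, exist_ok=True)  # no error if the folder already exists

for f in fs:
    for file in glob("*.*"):
        # note: a name that is a prefix of another (name1 vs. name10) also matches here
        if file.startswith(f):
            move(file, join(f, file))  # portable path instead of a hard-coded '\\'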


UPDATE:

Assuming all your Species_name.asc files are labeled like in your example:

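from os import makedirs
from os.path import join
from shutil import move
from glob import glob

# folder names come from the .asc files, minus the extension
fs = [file.split('.')[0] for file in glob("Species_name*.asc")]

for f in fs:
    makedirs(f, exist_ok=True)  # no error if the folder already exists
    for file in glob("*.*"):
        if file.startswith(f):
            move(file, join(f, file))  # portable path instead of a hard-coded '\\'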
Red
  • No, the only reliable pattern is `Species_name` – Gilles Quénot Jul 01 '20 at 21:48
  • @GillesQuenot So we're supposed to only focus on `Species_name`? – Red Jul 01 '20 at 21:49
  • Yes, I'll fix it. – Red Jul 01 '20 at 21:51
  • 1
    I’m sorry for the confusion, but “species_name” is not the actual prefix to all the files. However, it is over 20,000 different species’ names (genus_species) and the .asc files are the only files that have this exclusively so the first answer was the best solution. Thank you again for both of your inputs! – Mr. ReLu Jul 01 '20 at 22:28