
We have a file store set up for analysis files, and we've realized that we need to create a subdirectory structure within it. The dj.config settings are below, and the schema definition is:

@schema
class AnalysisNwbfile(dj.Manual):
    definition = """
    # Table for holding the NWB files that contain results of analysis, such as spike sorting.
    analysis_file_name: varchar(255)               # name of the file
    ---
    -> Nwbfile                                     # name of the parent NWB file. Used for naming and metadata copy
    analysis_file_abs_path: filepath@analysis      # the full path to the file
    analysis_file_description = "": varchar(2000)  # an optional description of this analysis
    analysis_parameters = NULL: blob               # additional relevant parameters. Currently used only for analyses
                                                   # that span multiple NWB files
    INDEX (analysis_file_abs_path)
    """

dj.config:

dj.config['stores'] = {
    'raw': {
        'protocol': 'file',
        'location': str(raw_dir),
        'stage': str(raw_dir)
    },
    'analysis': {
        'protocol': 'file',
        'location': str(analysis_dir),
        'stage': str(analysis_dir)
    }
}

We currently have ~355k files in analysis_dir, and we'd like to move them to subdirectories to prevent filesystem problems. Is there any way to do that?

1 Answer


The most straightforward solution is to change the configuration of the analysis store:

  1. Move all the files from analysis_dir to analysis_dir/subdir.
  2. Redefine the analysis store in dj.config:
dj.config['stores'] = {
    'raw': {
        'protocol': 'file',
        'location': str(raw_dir),
        'stage': str(raw_dir)
    },
    'analysis': {
        'protocol': 'file',
        'location': str(analysis_dir) + '/subdir',
        'stage': str(analysis_dir) + '/subdir'
    }
}
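
Step 1 above can be sketched with the standard library alone. This is a minimal sketch, assuming all the files sit directly under analysis_dir and nothing is writing to the store while it runs; the function name move_into_subdir and its arguments are hypothetical, not part of DataJoint.

```python
from pathlib import Path
import shutil


def move_into_subdir(analysis_dir, subdir_name="subdir"):
    """Move every regular file directly under analysis_dir into analysis_dir/subdir_name.

    Returns the number of files moved. Existing subdirectories are left untouched.
    (Hypothetical helper, not a DataJoint function.)
    """
    root = Path(analysis_dir)
    target = root / subdir_name
    target.mkdir(exist_ok=True)
    moved = 0
    # Take a snapshot of the listing so we do not iterate a directory while changing it.
    for path in list(root.iterdir()):
        if path.is_file():
            shutil.move(str(path), str(target / path.name))
            moved += 1
    return moved
```

Running it on a copy of a few files first is a cheap way to check that the relocated store still resolves the paths as expected.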

But this won't work if other attributes (in other schemas or tables) are also declared as filepath@analysis and rely on the previous configuration of the analysis store (i.e. str(analysis_dir)). If that is the case, we can discuss another solution.

Or do you want to move the ~355k files into several different subdirectories under analysis_dir, e.g. some in analysis_dir/sub1, others in analysis_dir/sub2, and so on? If so, this complicates things quite a bit and may require a row-by-row update (using DataJoint's .update1() method).
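
A rough sketch of the row-by-row approach follows. It assumes that analysis_file_name matches the file's name on disk and that the subdirectory can be derived from the file name; the rule in subdir_for (text before the first underscore) and the helper relocate are hypothetical placeholders. The database write is passed in as a callable so that the file handling can be tried without a database connection.

```python
from pathlib import Path
import shutil


def subdir_for(file_name):
    """Hypothetical bucketing rule: the text before the first underscore."""
    return file_name.split("_", 1)[0]


def relocate(analysis_dir, file_name, update_row=None):
    """Move one file into its subdirectory and return the new absolute path.

    update_row is an optional callable that receives a dict with the primary
    key and the new path; passing AnalysisNwbfile().update1 is the intended
    use, but that pairing is an assumption and is not exercised here.
    """
    root = Path(analysis_dir)
    new_path = root / subdir_for(file_name) / file_name
    new_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(root / file_name), str(new_path))
    if update_row is not None:
        update_row({
            "analysis_file_name": file_name,
            "analysis_file_abs_path": str(new_path),
        })
    return new_path
```

With the table from the question, the loop would be along the lines of calling relocate(analysis_dir, name, AnalysisNwbfile().update1) for each name returned by AnalysisNwbfile().fetch('analysis_file_name'). I have not verified how update1 treats the old entry in the store's external tracking table, so it is worth testing on a handful of rows in a scratch schema before touching all ~355k.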

  • That's very helpful; thanks. And indeed we'd like to move things to a separate set of directories based on filenames (analysis_dir/sub1, analysis_dir/sub2). We don't have outside dependencies, so I think the update1() should work there. We'll give it a try... – Loren Frank Jul 29 '22 at 18:50