Hi, I know how to do this when I have the full names of the files (usually I use rsync or a simple cp command); however, I only have partial names.

Here is the simple script that I sometimes use for files in a list:

# Run pipeline
echo "Loading"
cd "$dirpath" || exit 1
# Read one sample ID per line and copy every fastq file whose name starts with it
while IFS= read -r sample; do
    cp -a source_directory/"$sample"*.fastq "/output directory/"
done < sample_id.txt

This seems to work fine when I have the complete file name, but I'd like to be able to run it when I have a list containing only partial file names.

For example, I have a directory that contains thousands of paired-end fastq files:

  • ABF-0123-FGHKL_1.fastq
  • ABF-0123-FGHKL_2.fastq
  • ABG-0567-G456_1.fastq
  • ABG-0567-G456_2.fastq

My sample id list contains:

  • ABF-0123
  • ABG-0567

The expected results:

  • ABF-0123-FGHKL_1.fastq
  • ABF-0123-FGHKL_2.fastq
  • ABG-0567-G456_1.fastq
  • ABG-0567-G456_2.fastq

Thank you!
user3105519
  • Welcome to Stack Overflow. SO is a question and answer page for professional and enthusiastic programmers. Add your own code to your question. You are expected to show at least the amount of research you have put into solving this question yourself. – Cyrus May 22 '20 at 11:31
  • https://stackoverflow.com/questions/61843060/find-thousands-of-files-efficiently-with-exact-match-from-a-directory-containing/61843372#61843372 Might be what you wanted. – Jetchisel May 22 '20 at 11:52
  • @Cyrus I added my code. – user3105519 May 22 '20 at 14:24
  • Using `find` you can find all files matching a pattern. Alternatively, you can output the list of files with `ls` and search the list via `grep`/`awk`/... As for multiple patterns, you can either combine them into one regular expression `(pattern1|pattern2)` or repeat the search for every pattern. Once you have the list of matching files, you can do whatever you need with them, e.g. copy them. What is the problem with that approach? – Tsyvarev May 22 '20 at 22:23

1 Answer

You may use:

for sample in $(cat sample.txt); do
    # -name restricts find to files whose names start with the sample ID
    find "${source_directory}" -name "${sample}*.fastq" -exec cp {} "${output_directory}" \;
done
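
The same loop can be written more defensively. The following is only a minimal sketch, not a tested pipeline: `copy_by_prefix` is a hypothetical helper name, the ID list is assumed to hold one prefix per line, and `-maxdepth` is assumed to be available (it is an extension supported by GNU and BSD `find`, not part of POSIX). It reads the list line by line instead of word-splitting `$(cat ...)`, and it quotes the pattern so that `find`, not the shell, expands the `*`.

```shell
#!/usr/bin/env bash
# Sketch: copy every FASTQ file whose name starts with an ID from a list.
# copy_by_prefix is a hypothetical helper name, not part of any tool.
copy_by_prefix() {
    local id_list=$1 src_dir=$2 dest_dir=$3 sample
    mkdir -p -- "$dest_dir"
    # Read one ID per line; also accept a last line without a trailing newline.
    while IFS= read -r sample || [ -n "$sample" ]; do
        # Skip blank lines so an empty prefix does not match every file.
        [ -n "$sample" ] || continue
        # Quoting the pattern stops the shell from expanding the * itself.
        # -maxdepth 1 (assumed GNU/BSD find) keeps the search out of subdirectories.
        find "$src_dir" -maxdepth 1 -type f -name "${sample}*.fastq" \
            -exec cp -a -- {} "$dest_dir"/ \;
    done < "$id_list"
}

# Example call with the names used in the question:
# copy_by_prefix sample_id.txt source_directory output_directory
```

Note that a prefix such as ABF-0123 would also match a file named ABF-01234-X_1.fastq. If the IDs are always followed by a hyphen, as in the examples in the question, the pattern `"${sample}-*.fastq"` would avoid that.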
mtnezm