
I want to merge pairs of files whose names share the same prefix. Each output file should be named after the part of the name that is common to its input files. I am not sure how to do this, but I assume cat could be used somehow.

107_MAE_E7_S11_L001_R1_001.fastq.gz
107_MAE_E7_S11_L002_R1_001.fastq.gz
108_IME_A8_S23_L001_R1_001.fastq.gz
108_IME_A8_S23_L002_R1_001.fastq.gz

Desired output:

107_MAE_E7_S11.fastq.gz
108_IME_A8_S23.fastq.gz
user2300940
  • What do you mean by "share"? They all start with `1`, so you could merge them all. Also, you should start by providing the code you have tried. :) – Amessihel Feb 09 '16 at 10:16
  • I mean pairs of files: file 1 and 2, file 3 and 4, etc. – user2300940 Feb 09 '16 at 10:22
  • Thanks for your edit. If I understand correctly, you want to merge file 1 and 2 under the longest prefix they share, if they share one? – Amessihel Feb 09 '16 at 10:25
  • Have a look [here](http://stackoverflow.com/questions/6973088/longest-common-prefix-of-two-strings-in-bash). Then, please start by providing a piece of your own code. – Amessihel Feb 09 '16 at 10:31
  • Please edit your question and add the code you've already written. The StackOverflow community is only too happy to help you improve your code, but we are not short-order programmers working for free. – ghoti Feb 09 '16 at 14:26

1 Answer


From your example (it is not clear whether it is representative of all your filenames), you can just cut out the middle of each file name and use uniq. Which character positions you key off depends on whether all files really look like the ones above.

Example (here myfilename is a text file listing the input file names, one per line):

# cut -c1-14,27- myfilename | uniq
107_MAE_E7_S11.fastq.gz
108_IME_A8_S23.fastq.gz
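
The cut/uniq step above only produces the output names. To actually merge the files, something like the following bash sketch might work. It assumes every input is named `<sample>_L<3 digits>_R1_001.fastq.gz`, as in the question, and that each sample has an `L001` file; `merge_lanes` is a hypothetical helper name, not an existing tool. Concatenating gzip files with cat yields a valid multi-member gzip file, so no decompression is needed.

```shell
#!/usr/bin/env bash
# merge_lanes (hypothetical helper): concatenate all lane files of each
# sample in a directory into <sample>.fastq.gz.
# Assumption: inputs are named <sample>_L<3 digits>_R1_001.fastq.gz.
merge_lanes() {
    local dir=${1:-.}
    local f prefix
    # Use the L001 file of each sample to discover the sample prefix.
    for f in "$dir"/*_L001_R1_001.fastq.gz; do
        [ -e "$f" ] || continue   # glob matched nothing
        prefix=${f%_L001_R1_001.fastq.gz}
        # The glob expands in sorted order, so lanes are appended L001, L002, ...
        cat "$prefix"_L[0-9][0-9][0-9]_R1_001.fastq.gz > "$prefix.fastq.gz"
    done
}
```

Calling `merge_lanes /path/to/fastq_dir` (or just `merge_lanes` for the current directory) would write `107_MAE_E7_S11.fastq.gz` and `108_IME_A8_S23.fastq.gz` next to the inputs. If your real filenames deviate from the pattern, the suffix and glob need adjusting.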
Brian