
I have a folder of FASTQ files (genomic sequences) and an Excel file with barcodes (sequences of 20 nucleotides each). I want to search for every barcode in every FASTQ file and get the exact matches. To test, I ran "zgrep -u barcode file1 file2 file3" individually for a few barcodes, and it works. Now I want a script that does this for me, since I have around 200 different barcodes to look for in 10 files. I am not sure how to incorporate zgrep into such a script.

  • You did this in Powershell? Is Zgrep like a front end for just grep.exe? Seems like you're just calling on the exe itself, since it doesn't look like powershell syntax for `Get-Content`. You can stick to that syntax and throw it in a function and do one or multiple at a time with a `foreach` loop. I'd recommend going full powershell and using the excel module which contains `Import-Excel` which is better to work with, and more native to Powershell. – Abraham Zinala Jun 15 '21 at 22:30
  • Please read about [how to ask a good question](http://stackoverflow.com/help/how-to-ask). – mklement0 Jun 15 '21 at 22:53
  • Hi! I am doing this over SSH using MobaXterm. I was basically calling the zgrep command. I have to stick to SSH because my files are on a remote server. Hope you can help. – INDERPREET SINGH Jun 16 '21 at 03:06

1 Answer


Hello and welcome to Stack Overflow. I'm sorry that some people here overlooked your non-IT background and answered in a way that may seem cryptic to you.

About your problem:

First, if possible, install the ImportExcel module on your machine using this PowerShell command:

Install-Module -Name ImportExcel -Scope CurrentUser -Force

After that, we can run this small script to execute zgrep once for each row in the Excel document:

# Change this to the path to your file
$FilePath = "C:\Test123.xlsx"

$excelContent = Import-Excel -Path $FilePath
foreach($row in $excelContent)
{
    # Change columnName to the name of the column that holds the barcodes
    zgrep -u $row.columnName file1 file2 file3
}
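Since you mentioned that you work over SSH on a remote server, the same loop can also be sketched as a plain shell script. This is only a sketch under a few assumptions: the barcode column has been saved from Excel as a plain text file with one barcode per line (called barcodes.txt here), the FASTQ files are gzip-compressed, and the function name count_barcodes and all file names are placeholders I made up. It prints one tab-separated line per barcode and file with the number of matching lines.

```shell
#!/bin/sh
# Sketch: count exact barcode matches in gzipped FASTQ files.
# Assumption: the first argument is a text file with one barcode per line,
# and the remaining arguments are the .fastq.gz files to search.

count_barcodes() {
    barcode_file=$1
    shift
    # tr removes carriage returns that Windows/Excel exports often leave behind
    tr -d '\r' < "$barcode_file" | while IFS= read -r barcode || [ -n "$barcode" ]; do
        # Skip empty lines
        [ -z "$barcode" ] && continue
        for file in "$@"; do
            # -F treats the barcode as a fixed string, -c counts matching lines;
            # </dev/null keeps zgrep from touching the loop's input
            count=$(zgrep -c -F "$barcode" "$file" </dev/null)
            printf '%s\t%s\t%s\n' "$barcode" "$file" "$count"
        done
    done
}

# Example call (hypothetical file names):
# count_barcodes barcodes.txt file1.fastq.gz file2.fastq.gz > barcode_counts.tsv
```

If you would rather see the matching reads than the counts, dropping -c should print the matching lines instead.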

This should be all you need for your problem.

Dave