
I have written a bash script that prompts for three file names, processes the first two files, and stores the final result in the third.

The script is:

#!/bin/bash

#clear

echo -n " Bam File "
read BamFile

echo -n " Region File "
read BedFile

echo -n " Output File "
read OutFile

awk '{print $1 "\t" $2 "\t" $3 "\t" $3-$2}' < $BedFile >Temp

coverageBed -abam $BamFile -b $BedFile -counts > bases

awk '{print $4 }' <bases >tempbases

paste -d "\t" Temp tempbases >TtTemp

samtools view -c -F 260 $BamFile > totalNumReads

cat totalNumReads | awk '{print $1}'>tags

tag=`cat tags`
echo " Number of tags present in file = $tag"

awk '{print $1 "\t" $2 "\t" $3 "\t" $4 "\t" $5 "\t" $5/($4/1000* "'$tag'"/1000000) } '<TtTemp > $OutFile

This script works well.

However, I would like to make the following adjustment to the script.

Instead of entering the file names one by one, I would like to provide them up front on the command line,

something like this:

process.bash -bam BamFile.bam -region RegFile -Out OutFile

where process.bash is my script and the three files are provided right at the start.

Could anyone please help me do this?

Thank you

Angelo
  • If you really want the option syntax, it's a little more complicated, but you can take an easier road: invoke the script like this `process.bash BamFile.bam RegFile OutFile` and inside the script refer to the filenames using _positional parameters_, i.e. `$1`, `$2` and `$3` – gboffi Jul 13 '17 at 10:54
  • If you proceed like this, remember to document the arguments that the script expects! – gboffi Jul 13 '17 at 10:55
  • here's an example with manual processing as well as getopts: http://mywiki.wooledge.org/BashFAQ/035 ... on SO: https://stackoverflow.com/questions/16483119/example-of-how-to-use-getopts-in-bash and another tutorial: http://wiki.bash-hackers.org/howto/getopts_tutorial – Sundeep Jul 13 '17 at 13:12
  • If you do choose to read input interactively, you would be better using `read -p "$prompt" var` instead of `echo -n "$prompt"; read var`. – Toby Speight Jul 13 '17 at 13:34
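
The positional-parameter approach from the first comment can be sketched as follows. This is only an illustration: `read_positional` and `usage` are hypothetical helper names, and the variable names are taken from the script in the question.

```bash
#!/bin/bash
# Hypothetical sketch: take the three file names as positional
# parameters instead of prompting for them.

usage() {
    echo "Usage: process.bash BamFile RegionFile OutFile"
}

# Copy the three positional parameters into named variables.
# Prints the usage line and returns non-zero if the count is wrong.
read_positional() {
    if [ $# -ne 3 ]; then
        usage >&2
        return 1
    fi
    BamFile=$1
    BedFile=$2
    OutFile=$3
}

# In the real script this would be: read_positional "$@" || exit 1
read_positional BamFile.bam RegFile OutFile
echo "bam=$BamFile region=$BedFile out=$OutFile"
```

The rest of the original script can then use `$BamFile`, `$BedFile` and `$OutFile` unchanged.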

2 Answers


You can parse the arguments like this:

#!/bin/bash

# Define the help function before the loop: bash must have read a
# function definition before the function can be called.
helpmenu() {
    echo -e "Your help text\n"
    exit 0
}

bam=null
reg=null
out=null

# Loop while any argument is left (-gt 0), so a lone --help is handled too.
while [[ $# -gt 0 ]]; do
    arg="$1"
    case $arg in
        --bam)
            bam=$2
            shift
        ;;
        --reg)
            reg=$2
            shift
        ;;
        --out)
            out=$2
            shift
        ;;
        --help)
            helpmenu
        ;;
        *)
            # Unknown argument: ignore it (the shift below skips it)
        ;;
    esac
    shift
done

# Continue your script with the variables bam, reg and out
# ...

Then you can use your script like

$ process.bash --bam BamFile.bam --reg RegFile --out OutFile

That's pretty much it.

You can add things like help menus by using functions. For instance, the --help branch calls the function helpmenu, which prints the help text and then exits the script.

EDIT:

Since there has been a long discussion in the comments under this post, let me clarify a few things:

In my opinion, handling the arguments by hand, as I did in my post, is much more robust. The reason is that it works on a wide range of un*x systems (e.g. non-POSIX systems).

As for the claim that only long options are supported: short options can be supported as well. I just didn't add them to the code because the OP didn't ask for them. For instance, if you want to accept -b file as well as --bam file, you just have to change the case statement accordingly:

-b|--bam)
    bam=$2
    shift
;;
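
The same by-hand loop can also be extended to accept the `--bam=file` form mentioned in the comments. The following is a minimal sketch, not the code from this answer: `parse_args` is a hypothetical function name, and the short options `-r` and `-o` are assumed by analogy with `-b`.

```bash
#!/bin/bash
# Hypothetical sketch: accept "-b FILE", "--bam FILE" and "--bam=FILE".

parse_args() {
    bam="" reg="" out=""
    while [ $# -gt 0 ]; do
        case $1 in
            --*=*)
                # "--opt=value": split at the first "="
                opt=${1%%=*}
                val=${1#*=}
                ;;
            *)
                # "--opt value" or "-o value": the value is the next argument
                if [ $# -lt 2 ]; then
                    echo "Missing value for $1" >&2
                    return 1
                fi
                opt=$1
                val=$2
                shift
                ;;
        esac
        case $opt in
            -b|--bam) bam=$val ;;
            -r|--reg) reg=$val ;;
            -o|--out) out=$val ;;
            *) echo "Unknown option: $opt" >&2; return 1 ;;
        esac
        shift
    done
}

# In the real script this would be: parse_args "$@" || exit 1
parse_args --bam=BamFile.bam -r RegFile --out OutFile
echo "bam=$bam reg=$reg out=$out"
```

Splitting each argument into an option and a value first keeps the second case statement identical for both spellings.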

I don't see anything wrong with this answer, as it provides the functionality that was asked for. I use this method myself in all of my scripts and have never had problems doing so.

NullDev
  • Thank you, could you also add a help option that shows which options this script needs? Something like: $ process.sh --help and it shows that three files are required. – Angelo Jul 13 '17 at 11:05
  • Hi, unfortunately the help section is not working. If I run bash process.bash --help, it returns "Failed to open BAM file null". I will of course accept the answer, you have already provided 98% of the solution. – Angelo Jul 13 '17 at 11:36
  • Change `helpmenu ;;` to `helpmenu; exit ;;` and don't exit from the function as you might want to do something after it in different situations but more importantly - don't parse arguments quite this way, look up how to use `getopts`. @Angelo - If you hadn't rushed to accept the first answer you got I'm sure someone would have provided a getopts answer for you or pointed you to one. – Ed Morton Jul 13 '17 at 11:54
  • @EdMorton What exactly is bad about doing it this way? Handling the arguments per hand like above is much more robust in my opinion, since it is supported by a wide range of un*x systems. – NullDev Jul 13 '17 at 12:02
  • Although the solution is ok, wouldn't it be better to use the `getopts` command? – Cristian Ramon-Cortes Jul 13 '17 at 12:50
  • @EdMorton: Ok, game on. I am open to more alternative solutions. Thank you :) – Angelo Jul 13 '17 at 13:04
  • @NullDev only long options are supported and it can't separate groups of characters into their options and provide standard UNIX argument handling. This job is why getopts exists. – Ed Morton Jul 13 '17 at 17:58
  • @EdMorton "only long options are supported" that is not true. You can just write `-b|--bam)` to handle `-b` as well as `--bam` – NullDev Jul 13 '17 at 19:08
  • @NullDev My comments are obviously not about some code you didn't write, they're about the code you did write in your answer. There's lots of different code you could add to the script in your answer to get it closer to the functionality that `getopts` provides for you (e.g. handling `-bfile` with no space or `--bam=file`) but there's no point avoiding simply using `getopts`. – Ed Morton Jul 13 '17 at 19:17
  • @NullDev @EdMorton Although I normally use the `getopts` solution I do find this solution useful. Tbh both solutions have pros and drawbacks (as stated in the edit and in the solution I posted) and I think each of us can choose the solution that best fits their requirements. – Cristian Ramon-Cortes Jul 14 '17 at 07:22

Although looping over the arguments is also a good solution, I'd like to provide a solution using the getopts command.

I use the shell's built-in getopts, not an external extension. The built-in has several limitations (e.g. each option is a single character).

Below is the closest equivalent I have come up with.

#!/bin/bash

  ##############################
  # HELPER METHODS
  ##############################

  # Parses the script arguments
  getArgs() {
    # Parse Options
    while getopts :hvb:r:o:-: flag; do
      # Treat the argument
      case "$flag" in
        h)
          # Display help
          usage
          ;;
        v)
          # Display version
          show_version
          ;;
        b)
          bamFile=${OPTARG}
          ;;
        r)
          regFile=${OPTARG}
          ;;
        o)
          outFile=${OPTARG}
          ;;
        -)
          # Check more complex arguments of the form --OPT, --OPT=VALUE
          case "$OPTARG" in
            help)
              # Display help
              usage
              ;;
            version)
              show_version
              ;;
            bam=*)
              # Get BAM filename (strip the leading "bam=")
              bamFile=${OPTARG#bam=}
              ;;
            reg=*)
              # Get region filename (strip the leading "reg=")
              regFile=${OPTARG#reg=}
              ;;
            out=*)
              # Get output filename (strip the leading "out=")
              outFile=${OPTARG#out=}
              ;;
            *)
              # Flag didn't match any pattern. Report an error
              display_error "${OPTARG}"
              ;;
          esac
          ;;
        *)
          # Flag didn't match any pattern. Report an error
          display_error "${OPTARG}"
          ;;
      esac
    done
  }

  usage() {
    echo "Usage: "
    exit 0
  }

  show_version() {
    echo "Version: "
    exit 0
  }

  display_error() {
    local argument=$1
    echo "[ERROR] Bad argument $argument"
    exit 1
  }

  ##############################
  # MAIN PROCESS
  ##############################
  getArgs "$@"

  echo "[DEBUG] BAM $bamFile"
  echo "[DEBUG] REG $regFile"
  echo "[DEBUG] OUT $outFile"

  # The region (BED) file is stored in regFile by getArgs
  awk '{ print $1 "\t" $2 "\t" $3 "\t" $3-$2 }' < "$regFile" > Temp
  coverageBed -abam "$bamFile" -b "$regFile" -counts > bases

  awk '{print $4 }' < bases > tempbases
  paste -d "\t" Temp tempbases > TtTemp
  samtools view -c -F 260 $bamFile > totalNumReads

  cat totalNumReads | awk '{ print $1 }' > tags
  tag=$(cat tags)
  echo " Number of tags present in file = $tag"
  awk '{ print $1 "\t" $2 "\t" $3 "\t" $4 "\t" $5 "\t" $5/($4/1000* "'$tag'"/1000000) }' < TtTemp > $outFile

Some sample outputs:

$./process.sh -v
Version: 
$./process.sh --version
Version: 
$./process.sh -h
Usage: 
$./process.sh --help
Usage: 
$./process.sh -b bamfile -r regfile -o outfile
[DEBUG] BAM bamfile
[DEBUG] REG regfile
[DEBUG] OUT outfile
$./process.sh --bam=bamfile -rregfile --out=outfile
[DEBUG] BAM bamfile
[DEBUG] REG regfile
[DEBUG] OUT outfile

As I said, there are some limitations. For example:

$./process.sh --bam=bamfile -rregfile -out=outfile
[DEBUG] BAM bamfile
[DEBUG] REG regfile
[DEBUG] OUT ut=outfile

This is accepted as valid input even though the user meant something else: getopts reads -out=outfile as the option -o with the argument ut=outfile. In my view, you should therefore check the bamFile, regFile and outFile values after parsing and before starting the processing.
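
Such a check could look roughly like this. It is only a sketch: `validate_args` is a hypothetical helper that is not part of the script above, and it assumes that the BAM and region files must already exist and be readable, while the output file only needs a non-empty name.

```bash
#!/bin/bash
# Hypothetical helper: check the parsed values before processing starts.
# Arguments: BAM file, region file, output file.
validate_args() {
    if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
        echo "[ERROR] BAM, region and output files are all required" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "[ERROR] Cannot read BAM file: $1" >&2
        return 1
    fi
    if [ ! -r "$2" ]; then
        echo "[ERROR] Cannot read region file: $2" >&2
        return 1
    fi
    return 0
}

# In the script, right after getArgs "$@":
#   validate_args "$bamFile" "$regFile" "$outFile" || exit 1
```

Note that this does not catch a mistyped output option such as the one shown above, because any non-empty output name is accepted.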

Cristian Ramon-Cortes
  • Thank you :) It is a very neatly done work. I can learn a lot from this code and will apply it as well on many of my one liners in Unix that have lot of manual interventions. – Angelo Jul 14 '17 at 07:55