I am designing a bash routine that consists of several functions which process the input data step by step.

The first function sorts the lines of each processed CSV file by the numbers in either column 1 or column 2:

# function 1: sort data using either method A or method B
sort_data () {
   for csv in "${rescore}"/${str_name}/*.csv; do
    csv_name=$(basename "$csv" .csv)
    # A) sorting based on the first column :
    #sort -k1.4,1n ${csv} > "${rescore}"/${str_name}/${csv_name}_std.csv
    # B) sorting based on the second column (dG):
    LC_ALL=C sort -k2,2g ${csv} > "${rescore}"/${str_name}/${csv_name}_std.csv
    rm $csv
   done
}

In this case sorting method A is disabled (commented out) manually, so the script uses method B. I need to add a condition to this function that switches between A and B depending on an option given when my bash script is executed in the shell.

The same logic applies to the second function (in the same bash script), which merges several input CSV files and then removes repeated columns in the merged file:

# function 2: fuse several files and remove repeated columns via AWK if necessary
table_fuse () {   
    paste -d'\t' "${rescore}"/${str_name}/*.csv >> "${rescore}"/${str_name}/${dataset}_${str_name}.csv | column -t -s$'\t'
   ## remove repeated columns: only to activate with sorting method A
   #awk '{first=$1;gsub(/[Ll]ig([0-9]+)?(\([-azA-Z]+\))?/,"");print first,$0}' "${rescore}"/${dataset}_${str_name}.csv > "${rescore}"/${dataset}_${str_name}_fin.csv 
}

In this case the AWK part of function 2 is disabled, and I need to enable or disable it depending on the same option that is defined for the first function when my bash script is executed.

Consequently, when I execute my bash script with the option --sort=1, sorting method A should be used (uncommented) in the first function and the AWK command in the second function should be enabled (uncommented); with --sort=2, sorting method B should be selected (in function 1) and the AWK command (in function 2) should be disabled (commented out).

  • Are you asking how to parse bash script arguments or how to use a [parametrized function](https://stackoverflow.com/questions/6212219/passing-parameters-to-a-bash-function)? – Zilog80 Apr 22 '21 at 11:21
  • I suppose it is related to parsing the bash script arguments (someone has suggested CASE) rather than to parametrization of these functions... The only thing I need to understand is how to associate an option (provided to the bash script) with a variable (used inside the script), which should be placed in an IF or CASE statement near sort (function 1) or AWK (function 2) ... –  Apr 22 '21 at 13:47

2 Answers

You can use a case statement, or a combination of a case statement and getopts, to pass an option on the command line for function 1. Once the option variable is assigned, you can simply check its value to decide whether to run the awk line in function 2.
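
As a minimal sketch of the case-statement approach (not the asker's actual script), the function below maps --sort=1 or --sort=2 onto a variable. The names parse_sort_option and sort_method, and the default value of 1, are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps --sort=1 / --sort=2 onto the variable sort_method.
sort_method=1   # assumed default when no --sort option is given

parse_sort_option() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --sort=1|--sort=2)
        # Strip the "--sort=" prefix and keep the digit
        sort_method="${arg#--sort=}"
        ;;
      --sort=*)
        echo "Bad sort method: ${arg#--sort=}" >&2
        return 1
        ;;
    esac
  done
}

# In a real script this would be: parse_sort_option "$@" || exit 1
parse_sort_option --sort=2
echo "sort method: ${sort_method}"
```

Inside the other functions, a test such as `[ "${sort_method}" = "1" ]` can then select between the two branches.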

  • Thank you! However, I have not found how to postpone an option for CASE directly from the terminal (during execution of script.sh), e.g. according to https://linuxize.com/post/bash-case-statement/. Could you show me a brief example, e.g. for function 1? Thanks in advance –  Apr 22 '21 at 13:46
  • Hi, what are you exactly trying to "accomplish" when you say "postpone an option for CASE"? Just asking to clarify your needs and trying to answer if I'm able to do it – MonsieurMemons Apr 22 '21 at 13:53
  • Well, the idea is to associate an option (provided during execution of my bash script, e.g. bash --sort=1) with a variable that is used inside the script in some condition (IF or CASE statement) in order to trigger the selection between the possibilities for sort (in function 1) or the activation/deactivation of AWK (in function 2) ... IF sort==1: activate A; else: activate B; –  Apr 22 '21 at 13:58
  • I'm still not sure about the "postpone" thing though, the general idea of how you want the function and the script to work once the parameter is passed is fairly clear – MonsieurMemons Apr 22 '21 at 14:02
  • I just read your reply on the other comment, and in case you are wondering how to associate the option with the variable, this is how it looks with getopts (the colon after each letter means the option takes a value): `while getopts 'u:d:t:' OPT; do case $OPT in u) var1=$OPTARG ;; d) var2=$OPTARG ;; t) var3=$OPTARG ;; esac; done; shift $((OPTIND - 1))` – MonsieurMemons Apr 22 '21 at 14:07
  • and all of these var1-3 should be provided during execution of the script in shell ? –  Apr 23 '21 at 12:57
  • That depends on how you write the rest of the script, not on getopts; this was an example to show how to associate a variable with a getopts option. – MonsieurMemons Apr 23 '21 at 15:06
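
The getopts idea from the comments above can be sketched as follows, under stated assumptions: getopts only handles single-letter options, so this uses a hypothetical short option `-s` instead of `--sort=`, and the names parse_short_options and sort_method are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical short option: -s 1 or -s 2 selects the sorting method.
sort_method=1   # assumed default

parse_short_options() {
  # A local OPTIND reset to 1 lets the function be called more than once
  local opt OPTIND=1
  # Leading ':' enables silent error handling; 's:' means -s takes a value
  while getopts ':s:' opt; do
    case "$opt" in
      s)  sort_method="$OPTARG" ;;
      :)  echo "Option -$OPTARG requires a value" >&2; return 1 ;;
      \?) echo "Unknown option -$OPTARG" >&2; return 1 ;;
    esac
  done
  # Validate the value that was supplied
  case "$sort_method" in
    1|2) return 0 ;;
    *)   echo "Bad sort method: $sort_method" >&2; return 1 ;;
  esac
}

# In a real script this would be: parse_short_options "$@" || exit 1
parse_short_options -s 2
echo "sort method: ${sort_method}"
```

The script would then be run as `./script.sh -s 1` or `./script.sh -s 2`.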

Arguments in a shell script can be parsed in several ways using the positional parameters $1, $2, .. $n and $# together with the shift command.

In your case, you can use a function parse_args() called at the main entry point of your script with parse_args "$@", like this:

parse_args() {
  # Without args, sorting method is 1 by default
  MY_SORT_METHOD=1
  while [ $# -gt 0 ] ; do
    MY_ARG="$1"
    if [ "${MY_ARG:0:6}" = "--sort" ]; then
      # Sort method argument: the two characters after "--sort", e.g. "=1"
      MY_ARG_VALUE="${MY_ARG:6:2}"
      # Check the sort method supplied
      if [ "${MY_ARG_VALUE}" != "=1" ] && [ "${MY_ARG_VALUE}" != "=2" ]; then
         echo "Bad sort method supplied : ${MY_ARG_VALUE}" >&2
         return 1
      fi
      MY_SORT_METHOD="${MY_ARG_VALUE:1:1}"
    fi
    shift # Shift to next parameters if any
  done
  return 0
}

You can check the exit status of parse_args to ensure correct parameters were supplied (if it is not 0, the script should generally stop processing).

EDIT: How to use it in your script. EDIT2: Added a display of the argument parsing results:

# Parsing script arguments
parse_args() {
  # Without args, sorting method is 1 by default
  MY_SORT_METHOD=1
  while [ $# -gt 0 ] ; do
    MY_ARG="$1"
    if [ "${MY_ARG:0:6}" = "--sort" ]; then
      # Sort method argument: the two characters after "--sort", e.g. "=1"
      MY_ARG_VALUE="${MY_ARG:6:2}"
      # Check the sort method supplied
      if [ "${MY_ARG_VALUE}" != "=1" ] && [ "${MY_ARG_VALUE}" != "=2" ]; then
         echo "Bad sort method supplied : ${MY_ARG_VALUE}" >&2
         return 1
      fi
      MY_SORT_METHOD="${MY_ARG_VALUE:1:1}"
    fi
    shift # Shift to next parameters if any
  done
  return 0
}
# function displaying the current execution configuration for the script
display_config() {
  echo "Provided command line arguments : $*"
  echo "Current sort method : ${MY_SORT_METHOD}"
}
# function 1: sort data using either method A or method B
sort_data () {
   for csv in "${rescore}/${str_name}"/*.csv; do
    csv_name=$(basename "$csv" .csv)
    # A) sorting based on the first column :
    [ "${MY_SORT_METHOD}" = "1" ] && sort -k1.4,1n "${csv}" > "${rescore}/${str_name}/${csv_name}_std.csv"
    # B) sorting based on the second column (dG):
    [ "${MY_SORT_METHOD}" = "2" ] && LC_ALL=C sort -k2,2g "${csv}" > "${rescore}/${str_name}/${csv_name}_std.csv"
    rm "$csv"
   done
}
# function 2: fuse several files and remove repeated columns via AWK if necessary
table_fuse () {
    paste -d'\t' "${rescore}/${str_name}"/*.csv >> "${rescore}/${str_name}/${dataset}_${str_name}.csv" | column -t -s$'\t'
   ## remove repeated columns: only active with sorting method A
   ## (an if block, so the function does not return non-zero when method 2 is selected)
   if [ "${MY_SORT_METHOD}" = "1" ]; then
     awk '{first=$1;gsub(/[Ll]ig([0-9]+)?(\([-azA-Z]+\))?/,"");print first,$0}' "${rescore}/${dataset}_${str_name}.csv" > "${rescore}/${dataset}_${str_name}_fin.csv"
   fi
}
#
# Script main entry point
#
parse_args "$@"
if [ $? != 0 ]; then
  exit 1; # Incorrect arguments
fi
# Display script current configuration after arguments parsing
display_config "$@"

...
<call to your methods sort_data and table_fuse>
...
Zilog80
  • Many thanks!! One question: is it possible to assign these variables during the execution of the bash script, or did I not understand the philosophy properly? Could you specify how this function could be applied directly to my case (e.g. for switching between the A and B sorting options)? –  Apr 23 '21 at 12:56
  • @HotJAMS It is possible to use $1..$N, $#, $* in the body of your script. Note that inside functions they refer to the function's parameters, not the script's parameters. I've edited the answer to give you an example of usage in your script. – Zilog80 Apr 23 '21 at 13:07
  • OK, thank you so much! Now the idea has become technically clear (the default sorting method=1 is also great!). The only question: how can I manually set one of the arguments defined in parse_args in order to switch between $1 and $2 while executing my script.sh in the terminal? Do I need to pass some arguments directly in the terminal, like script.sh -sort 1? Would it be possible for parse_args() to echo which sorting method is being used upon execution of the script, in order to better control the flow? –  Apr 27 '21 at 10:35
  • @HotJAMS I've added a configuration display to the answer. You can switch between sort method 1 and 2 during script execution by simply setting the value of `MY_SORT_METHOD`. For example, you can set it at some point of your script to method 2 with `MY_SORT_METHOD=2`, or invert it from 1 to 2 or 2 to 1 with `[ "${MY_SORT_METHOD}" = "1" ] && MY_SORT_METHOD=2 || MY_SORT_METHOD=1;`. – Zilog80 Apr 27 '21 at 11:27
  • Right, thank you very much again! In fact I just modified MY_SORT_METHOD=1 inside the parse_args() function to switch between both sorting strategies. I only could not understand echo "Provided command line arguments : $*". What are the command line arguments and how can they be used (supposing that I don't provide any of them during script execution)? May I associate MY_SORT_METHOD with any of the command line arguments to automate it a bit more? (Actually, with these examples I am just starting to understand the versatility of bash, how I can define variables outside of the script, etc...) Cheers –  Apr 27 '21 at 14:12
  • I think I found it: in order to select between the two sorting methods it was only necessary to add --sort=1 or --sort=2 on the command line (./script.sh --sort=1); that was what I did not understand properly. So thank you very much again!!! –  Apr 27 '21 at 14:22
  • @HotJAMS You're welcome. Regarding shell script arguments, you'll find here many answers, and you should give a shot to the bash man page on the subject ^^. – Zilog80 Apr 27 '21 at 14:25
  • OK!! Just one question: suppose I need to introduce a second argument (e.g. --score={1..4}) associated with another variable (let's call it ${MY_SCORE_METHOD}), which would select one of 4 possibilities in some third function. Is it necessary to define these new args within parse_args(), which already has the --sort argument? Note that the two arguments (--sort and --score) are not correlated, and the new one ("${MY_SCORE_METHOD}"=1-4) should activate one of the 4 AWK commands (like it was done using sort in sort_data ()) but in another function... –  Apr 28 '21 at 12:15
  • @HotJAMS Still in scope of the question, so OK : The `parse_args` is a method for parsing arguments, so yes you should add here a section `if [ "${MY_ARG:0:7}" = "--score" ]; then ... fi` to get a `MY_SCORE_METHOD` value that you should initialize too with a default score. The `parse_args` method will only parse present arguments, so it will parse --sort only, --score only or both --sort and --score. – Zilog80 Apr 28 '21 at 12:30
  • Right, thank you! Actually this is what I need... In fact my question was mainly about introducing this second condition inside the same function, since I thought it should rather be like if [ "${MY_ARG2:0:6}" = "--score" ]; then ... fi, so essentially I did not understand well what "${MY_ARG:0:6}" was. Anyway, does it mean that the number of possibilities is determined in MY_ARG_VALUE="${MY_ARG:6:2}", like it was for the two options in --sort? –  Apr 28 '21 at 12:36
  • @HotJAMS `${MY_ARG:0:6}` means "extract 6 characters as a string starting from position 0 in ${MY_ARG}"; more simply, it extracts the first 6 characters of ${MY_ARG}. Check the bash man page, "Parameter Expansion" section. – Zilog80 Apr 28 '21 at 12:57
  • OK, I understand. I suppose I have found a bug related to correlations between the two assigned arguments. I am going to create a new topic to figure it out with a code example. –  Apr 28 '21 at 13:07
  • Here it is and many thanks in advance: https://stackoverflow.com/questions/67301002/bash-two-independent-triggers-assosiated-with-two-independent-arguments –  Apr 28 '21 at 13:17
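
For the --score question discussed in the comments, one possible sketch of parse_args with two independent options is shown below. It is not the code from the linked follow-up question. `MY_SCORE_METHOD`, its default of 1, and the range 1 to 4 are taken from the comments as assumptions. The substring offsets here include the `=` sign, so `--sort=` is 7 characters and `--score=` is 8.

```shell
#!/usr/bin/env bash
# Sketch: parse_args extended with a second, independent option.
# MY_SCORE_METHOD and both default values are assumptions.
parse_args() {
  MY_SORT_METHOD=1
  MY_SCORE_METHOD=1
  local arg value
  for arg in "$@"; do
    if [ "${arg:0:7}" = "--sort=" ]; then       # first 7 characters
      value="${arg:7}"                           # everything after "--sort="
      case "$value" in
        1|2) MY_SORT_METHOD="$value" ;;
        *)   echo "Bad sort method supplied: $value" >&2; return 1 ;;
      esac
    elif [ "${arg:0:8}" = "--score=" ]; then    # first 8 characters
      value="${arg:8}"                           # everything after "--score="
      case "$value" in
        1|2|3|4) MY_SCORE_METHOD="$value" ;;
        *)       echo "Bad score method supplied: $value" >&2; return 1 ;;
      esac
    fi
  done
  return 0
}

# In a real script this would be: parse_args "$@" || exit 1
parse_args --score=3 --sort=2
echo "sort=${MY_SORT_METHOD} score=${MY_SCORE_METHOD}"
```

Each option is matched by its own branch, so the order on the command line does not matter and either option can be omitted.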