I am trying to write a bash script that runs the following java -jar command in batches, based on the size of a list.

I have a list of strings covering 54 states of the USA.

list = [USA+CA,USA+TX,USA+TN...]

I need to run the following command as 4 parallel instances, each taking 4 values from the list as input.

java -jar test-execute.jar --localities=USA+TX,USA+CA,USA+TN,USA+AB

Wait until any instance finishes, then start a new instance of the jar with the next 4 states.

array=( USA+TX,USA+CA,USA+TN,USA+AB )
for i in "${array[@]}"
do
    java -jar test-execute.jar --localities= ???
done

I do not understand how to feed values from the array into the jar invocation dynamically.

So:

I have a list of size 54.

I need to run 4 java instances in parallel, each taking 4 unique states from the list of 54. Once these instances complete, start the next 4 instances with the next 4 unique states per instance.

Update:

I have a list of 54 states and a 16-core machine. Each java jar instance uses 4 cores, so I can run 4 java instances at a time to use all 16 cores.

16 core machine 
   java instance-1 4 states
   java instance-2 4 states
   java instance-3 4 states
   java instance-4 4 states
 
Wait until any of these instances completes, then start a new instance with the next 4 states, until all 54 states have been processed.

Please help.

deewreck
  • You can use the `wait` command with `&`; see details here: https://www.cyberciti.biz/faq/how-to-run-command-or-code-in-parallel-in-bash-shell-under-linux-or-unix/ Also refer to this question: https://stackoverflow.com/questions/3004811/how-do-you-run-multiple-programs-in-parallel-from-a-bash-script – Vini Dec 19 '22 at 07:59

1 Answer

First of all, a bash array must be assigned as a space-separated list:

array=("USA+AL" "USA+AK" "USA+AZ" "USA+AR" "USA+AS" "USA+CA" ...)

Then would you please try something like:

array=("USA+AL" "USA+AK" "USA+AZ" "USA+AR" "USA+AS" "USA+CA" ...)
for (( i = 0; i < ${#array[@]}; i+=4 )); do
    echo java -jar test-execute.jar --localities="$(IFS=,; echo "${array[*]:i:4}")"
done
  • The for loop uses C-like syntax to increment the index by 4.
  • The array slice ${array[*]:i:4} takes up to four elements starting at index i, so the array is split into chunks of four. The last chunk, which has only two elements, is handled as well.
  • $(IFS=,; echo "${array[*]:i:4}") joins the chunk with commas so it can be passed to java as a single argument.

If the output looks good, drop echo in front of java.
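To make the slicing concrete, here is a minimal sketch with a shortened six-element array. The array contents and the `chunks` variable are only for illustration (the real list would hold all 54 entries), and the script prints the joined chunks without running java.

```shell
#!/usr/bin/env bash
# Illustrative six-element array; the real list would hold all 54 entries.
array=("USA+AL" "USA+AK" "USA+AZ" "USA+AR" "USA+AS" "USA+CA")

chunks=()   # hypothetical name: collects one comma-joined string per chunk
for (( i = 0; i < ${#array[@]}; i += 4 )); do
    # The IFS change happens inside the command substitution (a subshell),
    # so it does not leak into the rest of the script.
    chunks+=("$(IFS=,; echo "${array[*]:i:4}")")
done

printf '%s\n' "${chunks[@]}"
# Prints:
# USA+AL,USA+AK,USA+AZ,USA+AR
# USA+AS,USA+CA
```

The second line shows how a short final chunk simply contains whatever elements are left.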

[Edit]
As for the parallelism, we can use the -P option of xargs. Would you please try:

array=("USA+AL" "USA+AK" "USA+AZ" "USA+AR" "USA+AS" "USA+CA" ...)
for (( i = 0; i < ${#array[@]}; i+=4 )); do
    printf "%s\n" "$(IFS=,; echo "${array[*]:i:4}")"
done | xargs -P4 -I{} java -jar test-execute.jar --localities="{}"
  • Each input line groups four states into one argument.
  • The -P4 option runs at most four processes at a time.
  • So up to 16 states are processed at once.
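If xargs is not an option, the same limit can be sketched in pure bash with background jobs and `wait -n`, which needs bash 4.3 or later. This is an alternative to the xargs pipeline, not a description of how it works. The names `run_chunk`, `run_throttled`, `max_jobs` and `chunk_size` are hypothetical, and `run_chunk` only echoes the command so the sketch runs without the jar.

```shell
#!/usr/bin/env bash
# Sketch: keep at most max_jobs chunks running, using background jobs.
# Requires bash 4.3+ for `wait -n`.

array=("USA+AL" "USA+AK" "USA+AZ" "USA+AR" "USA+AS" "USA+CA")  # shortened list
max_jobs=4      # hypothetical name: instances allowed to run at once
chunk_size=4    # hypothetical name: states passed to each instance

# Hypothetical wrapper: it only prints the command so the sketch runs
# without the jar. Replace the echo with the real java call.
run_chunk() {
    echo java -jar test-execute.jar --localities="$1"
}

run_throttled() {
    local i running=0
    for (( i = 0; i < ${#array[@]}; i += chunk_size )); do
        if (( running >= max_jobs )); then
            wait -n                      # returns when any one job exits
            running=$(( running - 1 ))
        fi
        run_chunk "$(IFS=,; echo "${array[*]:i:chunk_size}")" &
        running=$(( running + 1 ))
    done
    wait                                 # let the last jobs finish
}

run_throttled
```

With the full 54-element array, the loop starts four instances, then starts one more each time `wait -n` reports that a job has exited. The order of the output lines is not guaranteed because the jobs run concurrently.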
tshiono
  • Thanks for this. But one more requirement in my case is that only 4 instances may run at any point in time. Your solution does split the list into instances with 4 states each, but with a list of size 54 it will create 14 instances of the jar. I want to restrict that to 4 as well: 4 instances at a time. – deewreck Dec 19 '22 at 08:44
  • Thank you for the feedback. Then we can split the list into four chunks: 14 states, 14 states, 14 states and remaining 12 states. Am I correct? – tshiono Dec 19 '22 at 08:52
  • No. First run: instance-1 4 states, instance-2 4 states, instance-3 4 states, instance-4 4 states. Second run: the same again with the next states, and so on. To be clear, each java jar can take at most 4 inputs, and only 4 instances can run at a time. I am doing this because I have a 16-core machine that I want to utilize fully: each instance uses 4 cores, so 4 running instances use 16 cores. Let me know if this makes it clearer; I can update the question with the exact requirement. – deewreck Dec 19 '22 at 09:03
  • Thanks for the explanation. I hope I'm getting it. Then I'd like to know how I can parallelize the four instances in terms of syntax. Can I just put four java statements in order? – tshiono Dec 19 '22 at 09:11
  • Since there is no hard rule for ordering, we can run java -jar test-execute.jar --localities= 4 times in a loop to start 4 instances. The bigger question is how to make sure an instance has completed its execution so that a new instance can be started. I can only run at most 4 instances at any point in time. – deewreck Dec 19 '22 at 09:15
  • If that can't be done, then I suppose splitting the list into 14, 14, 14, 12 is a viable option, since my list is fixed and constant for now. I can just fire 4 java -jar commands from bash with 14, 14, 14 and 12 inputs respectively. – deewreck Dec 19 '22 at 09:18
  • Thank you for the detailed explanations. I've updated my answer with the parallelism using `xargs`. Hope it will work. BR. – tshiono Dec 19 '22 at 11:43
  • Thanks, let me give this a try. Just to check: since -P4 will start 4 processes at a time, will it keep watching those 4 processes before starting the next set of processes for the next 16 states? Thanks – deewreck Dec 19 '22 at 12:05
  • That's correct. `xargs -P4` ensures that at most 4 processes run at a time. – tshiono Dec 19 '22 at 12:16
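A small experiment can show how `-P` schedules work. In this sketch `sleep` stands in for the jar, the durations and the `order` variable are arbitrary choices, and `-P2` is used instead of `-P4` to keep it short. xargs starts the next command as soon as any running one exits, rather than waiting for a whole batch to finish.

```shell
#!/usr/bin/env bash
# Each input line is "<job id> <seconds to sleep>".
# With -P2, jobs 1 and 2 start together. Job 2 exits at once, which frees
# a slot for job 3 while job 1 is still sleeping.
order=$(printf '%s\n' "1 2" "2 0" "3 0" |
    xargs -P2 -I{} sh -c 'set -- {}; sleep "$2"; echo "job $1 done"')
echo "$order"
# Prints:
# job 2 done
# job 3 done
# job 1 done
```

Job 3 finishes before job 1, so a free slot is refilled immediately. For the jar, this means a new instance starts whenever one of the four finishes.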