1

I have the following uni assignment that's been puzzling me. I have to implement a genetic algorithm that allocates processes into processors. More specifically the problem is the following:

"You have a program that is computed in parallel processor system. The program is made up of a N number of processes that need to be allocated on a n number of processors (where n is way smaller than N). The communication of processes during this whole process can be quite time consuming, so the best practice would be to assign processes that need intercommunication with one another to same processor.

In order to reduce the communication time between processes you could allocate of these processes to the same processor but this would negate the parallel processing idea that every processor needs to contribute to the whole process.

Consider the following: Let's say that Cij is the total amount of communication between process i and process j. Assume that every process needs the same amount of computing power so that the limitations of the processing process can be handled by assigning the same amount of processes to a processor. Use a genetic algorithm to assign N processes to n processors."

The above is roughly translated the description of the problem. Now I have the following question that puzzle me.

1) What would be the best viable solution in order to for the genetic algorithm to run. I have the theory behind them and I have deduced that you need a best possible solution in order to check each generation of the produced population.

2) How can I properly design the whole problem in order to be handled by a program.

I am planning to implement this in Java but any other recommendations for other programming languages would be welcome.

akortex
  • 5,067
  • 2
  • 25
  • 57

1 Answers1

0

The Dude abides. Or El Duderino if you're not into the whole brevity thing. What you're asking is really a two part question, but the Genetic Algorithm part can be easily illustrated in concept. I find that giving a basic start can be helpful, but this problem as a "whole" is too complicated to address here.

Genetic Algorithms (GA) can be used as an optimizer, as you note. In order to apply a GA to a process execution plan, you need to be able to score an execution plan, then clone and mutate the best plans. A GA works by running several plans, cloning the best, and then mutating some of them slightly to see if the offspring (cloned) plans are improved or worsened.

In this example, I created a array of 100 random Integers. Each Integer is a "process" to be run and the value of the Integer is the "cost" of running that individual process.

List<Integer> processes = new ArrayList<Integer>();

The processes are then added to an ExecutionPlan, which is a List<List<Integer>>. This List of List of Integers will be used to represent 4 processors doing 25 rounds of processing:

class ExecutionPlan implements Comparable<ExecutionPlan> {
    List<List<Integer>> plan;
    int cost;

The total cost of an execution plan will be computed by taking the highest process cost per round (the greatest Integer) and summing the costs of all the rounds. Thus, the goal of the optimizer is to distribute the initial 100 integers (processes) into 25 rounds of "processing" on 4 "processors" such that total cost is as low as possible.

// make 10 execution plans of 25 execution rounds running on 4 processors;        
List<ExecutionPlan> executionPlans = createAndIntializePlans(processes);
// Loop on generationCount
for ( int generation = 0; generation < GENERATIONCOUNT; ++generation) {
    computeCostOfPlans(executionPlans);
    // sort plans by cost
    Collections.sort(executionPlans);
    // print execution plan costs
    System.out.println(generation + " = " + executionPlans);
    // clone 5 better plans over 5 worse plans 
    // i.e., kill off the least fit and reproduce the best fit.
    cloneBetterPlansOverWorsePlans(executionPlans);
    // mutate 5 cloned plans
    mutateClones(executionPlans);
}

When the program is run, the cost is initially randomly determined, but with each generation it improves. If you run it for 1000 generations and plot the results, a typical run will look like this:

Best Cost Each Generation

The purpose of the GA is to Optimize or attempt to determine the best possible solution. The reason it can be applied to you problem is that your ExecutionPlan can be scored, cloned and mutated. The path to success, therefore, is to separate the problems in your mind. First, figure out how you can make an execution plan that can be scored as to what the cost will be to actually run it on an assumed set of hardware. Add rountines to clone and mutate an ExecutionPlan. Once you have that plug it into this GA example. Good luck and stay cool dude.

public class Optimize {
    private static int GENERATIONCOUNT = 1000;
    private static int PROCESSCOUNT = 100;
    private static int MUTATIONCOUNT = PROCESSCOUNT/10;
    public static void main(String...strings) {
        new Optimize().run();
    }
    // define an execution plan as 25 runs on 4 processors
    class ExecutionPlan implements Comparable<ExecutionPlan> {
        List<List<Integer>> plan;
        int cost;
        public ExecutionPlan(List<List<Integer>> plan) {
            this.plan = plan;
        }
        @Override
        public int compareTo(ExecutionPlan o) {
            return cost-o.cost;
        }
        @Override
        public String toString() {
            return Integer.toString(cost);
        }
    }
    private void run() {
        // make 100 processes to be completed
        List<Integer> processes = new ArrayList<Integer>();
        // assign them a random cost between 1 and 100;
        for ( int index = 0; index < PROCESSCOUNT; ++index) {
            processes.add( new Integer((int)(Math.random() * 99.0)+1));
        }
        // make 10 execution plans of 25 execution rounds running on 4 processors;        
        List<ExecutionPlan> executionPlans = createAndIntializePlans(processes);
        // Loop on generationCount
        for ( int generation = 0; generation < GENERATIONCOUNT; ++generation) {
            computeCostOfPlans(executionPlans);
            // sort plans by cost
            Collections.sort(executionPlans);
            // print execution plan costs
            System.out.println(generation + " = " + executionPlans);
            // clone 5 better plans over 5 worse plans 
            cloneBetterPlansOverWorsePlans(executionPlans);
            // mutate 5 cloned plans
            mutateClones(executionPlans);
        }
    }
    private void mutateClones(List<ExecutionPlan> executionPlans) {
        for ( int index = 0; index < executionPlans.size()/2; ++index) {
            ExecutionPlan execution = executionPlans.get(index);
            // mutate 10 different location swaps, maybe the plan improves
            for ( int mutationCount = 0; mutationCount < MUTATIONCOUNT ; ++mutationCount) {
                int location1 = (int)(Math.random() * 100.0);
                int location2 = (int)(Math.random() * 100.0);
                // swap two locations
                Integer processCostTemp = execution.plan.get(location1/4).get(location1%4);
                execution.plan.get(location1/4).set(location1%4, execution.plan.get(location2/4).get(location2%4));
                execution.plan.get(location2/4).set(location2%4, processCostTemp); 
            }
        }

    }
    private void cloneBetterPlansOverWorsePlans(List<ExecutionPlan> executionPlans) {
        for ( int index = 0; index < executionPlans.size()/2; ++index) {
            ExecutionPlan execution = executionPlans.get(index);
            List<List<Integer>> clonePlan = new ArrayList<List<Integer>>();
            for ( int roundNumber = 0; roundNumber < 25; ++roundNumber) {
                clonePlan.add( new ArrayList<Integer>(execution.plan.get(roundNumber)) );
            }
            executionPlans.set( index + executionPlans.size()/2, new ExecutionPlan(clonePlan) );
        }        
    }
    private void computeCostOfPlans(List<ExecutionPlan> executionPlans) {
        for ( ExecutionPlan execution: executionPlans) {
            execution.cost = 0;
            for ( int roundNumber = 0; roundNumber < 25; ++roundNumber) {
                // cost of a round is greatest "communication time".
                List<Integer> round = execution.plan.get(roundNumber);
                int roundCost = round.get(0)>round.get(1)?round.get(0):round.get(1);
                roundCost = execution.cost>round.get(2)?roundCost:round.get(2);
                roundCost = execution.cost>round.get(3)?roundCost:round.get(3);
                // add all the round costs' to determine total plan cost
                execution.cost += roundCost;
            }
        }        
    }
    private List<ExecutionPlan> createAndIntializePlans(List<Integer> processes) {
        List<ExecutionPlan> executionPlans = new ArrayList<ExecutionPlan>();
        for ( int planNumber = 0; planNumber < 10; ++planNumber) {
            // randomize the processes for this plan
            Collections.shuffle(processes);
            // and make the plan 
            List<List<Integer>> currentPlan = new ArrayList<List<Integer>>();
            for ( int roundNumber = 0; roundNumber < 25; ++roundNumber) {
                List<Integer> round = new ArrayList<Integer>(); 
                round.add(processes.get(4*roundNumber+0));
                round.add(processes.get(4*roundNumber+1));
                round.add(processes.get(4*roundNumber+2));
                round.add(processes.get(4*roundNumber+3));
                currentPlan.add(round);
            }
            executionPlans.add(new ExecutionPlan(currentPlan));
        }
        return executionPlans;
    }
}
K.Nicholas
  • 10,956
  • 4
  • 46
  • 66
  • First of all I must say that the Dude is really thankful for your reply. – akortex Apr 29 '16 at 18:50
  • Honestly I did not have high hopes of getting such a detailed reply that quickly and on this topic. I have a few quick questions: 1) In your example the computational power required by each process is unequal to one another, whereas they should have been the same. Would this work for that case also? 2) You omitted the whole process intercommunication concept. How could this be modeled? 3) You do not seem to use the whole genome concept of GAs in order to have a check for member fitness. – akortex Apr 29 '16 at 18:58
  • Your question asks about `the best viable solution`. This essentially asks about the sophistication of the execution plan. For a discussion related to `best viable solution` see three part answer in [AI How to model genetic programming for Battleships](http://stackoverflow.com/questions/35820854/ai-how-to-model-genetic-programming-for-battleships) – K.Nicholas Apr 30 '16 at 12:56
  • Hi. 1) The processes must be unequal in some fashion or there wouldn't be any problem to solve. You can think of the numbers as communication costs or anything you like. 2) It's complex to model and something you have to figure out. You would need, at the minimum, structures for the hardware capabilities and processes requirements. Then you need structure (class) for an execution plan. 3) Sure I do, each `ExecutionPlan` has a List> plan. That would be the genome. Each of the integers a chromosome. These are the things that are `reproduced` and `mutated`. Let's go bowling Dude. – K.Nicholas Apr 30 '16 at 12:58
  • Well the Dude abides once more. When I started modelling this, I took a rather complicated approach which had the process, the processor etc as objects consisting of many attributes (PID, communication cost etc etc) which only made matters more difficult to deal with. I did not think of your approach which is way simpler. As for the genome again your approach is way simpler. I was trying to use a 16digit number assigned to each process. In any way, if you don't mind I will use what you said above as a basis for my implementation. Maybe I will require a couple of questions more. – akortex May 02 '16 at 08:41
  • Of course. Start simple and add little things as they make sense. You can't get a strike unless you knock down a few pins Dude! You could move the Integer into a class called `Process`, even name it `cost` or something of the sort. After that it might make sense to replace the inner array (`List`) with something to hold 4 (or however many Processes), name it ExecutionRound or something. It could even return the highest `cost`, simplifying the scoring for the GA. So on and so forth and before you know it it's time for a good Sarsaparilla. – K.Nicholas May 02 '16 at 17:55