
I implemented a genetic algorithm to solve an enhanced Traveling Salesman Problem (the weight of the edges changes with the time of the day). Currently I'm evaluating the different parameters of my simulation and I stumbled upon a correlation I can't explain to myself:

[Plot: mutation rate vs. runtime]

A higher mutation rate leads to a lower runtime. I would have assumed the contrary, since a higher mutation rate produces more operations. (A 25% mutation rate is 12% faster than 5%.)

The best results are achieved with a mutation rate of 8% (5% is better than 10%, and 25% performs worst, except for 0%). A lower fitness value is better.

[Plot: result vs. mutation rate]

The iteration count is set by the generation parameter, which is 10,000 in all test cases.

Each test case is executed 10 times.

My implementation (in Python) of the mutation looks like this:

def mutate(self, p):
    for i in self.inhabitants:
        # random() is random.random(); each inhabitant mutates with probability p
        r = random()
        if r <= p:
            i.mutate()

where p is the mutation rate.
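For reference, the expected number of `i.mutate()` calls can be checked in isolation with a stripped-down sketch of the same loop (a hypothetical stand-alone version; the inhabitant objects are replaced by a bare counter):

```python
from random import random, seed

def count_mutations(p, population_size=10, generations=10_000):
    """Count how many times the mutation would fire for rate p."""
    calls = 0
    for _ in range(generations):
        for _ in range(population_size):
            if random() <= p:
                calls += 1
    return calls

seed(42)
# With 10 inhabitants over 10,000 generations, roughly p * 100,000 calls
# are expected, so a 25% rate fires about five times as often as a 5% rate.
```

This matches the counter experiment from the comments: the number of mutations scales linearly with p, so the speedup must come from somewhere else.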

The mutation itself looks like this:

def mutate(self):
    # swap two randomly chosen locations in the tour;
    # randint is random.randint (bounds inclusive)
    r1 = randint(0, self.locations.size() - 1)
    r2 = randint(0, self.locations.size() - 1)
    self.locations.swap(r1, r2)

Why does a higher mutation rate lead to faster execution time?

Edit: I actually ran the same tests on my Raspberry Pi (which is 9 times slower), and it produces the same outcome:

[Plot: time vs. mutation rate on the Pi]

Bernd Strehl
  • Less run time means less operations being performed (generally speaking). Higher "p" will lead to "i.mutate()" being called more often. Does "i.mutate()" change the "self.inhabitants" variable? Could you show the code for that function? (or a minimum working example if possible) – armatita Mar 23 '16 at 13:20
  • Also could you try to create a local copy of self.inhabitants in the function mutate, and loop the copy instead? – armatita Mar 23 '16 at 13:21
  • @armatita I added the code for the mutation. I don't understand what you mean with loop the copy. – Bernd Strehl Mar 23 '16 at 13:26
  • The mutation function does not do anything to self.inhabitants apparently. By local I mean `local = self.inhabitants` and then `for i in local: ...`. Also, when and where do you change the value of self.inhabitants? – armatita Mar 23 '16 at 13:35
  • Why do you think it doesn't do anything? It swaps two locations. If nothing would happen, we would see no difference in the result/fitness. – Bernd Strehl Mar 23 '16 at 13:41
  • I meant that nothing happens to "self.inhabitants" variable (more or less individuals). Or if it does it's not obvious from the "mutate" function. My suspicion is that somehow during your mutate methods you are reducing the size of your population (self.inhabitants). That's why I advised you to create a local copy for the loop. So if self.inhabitants is changed you'll still have the local copy working fully. – armatita Mar 23 '16 at 13:45
  • How are you doing the timing? This might be an artifact due to something like how the interpreter is doing its allocations. An interesting experiment would be to reverse the order in which you are testing the mutation rates -- time the higher rates before you time the lower rates. – John Coleman Mar 23 '16 at 13:48
  • @armatita If I reduced the population size (which is set to 10) I would have noticed very fast, because the algorithm crashes (it cannot find two parents, etc.). @JohnColeman I use `start_time = time.time()` before I run the simulation and stop it with `t = (time.time() - start_time)` – that's a great idea, I'm going to try this as soon as my current tests stop. – Bernd Strehl Mar 23 '16 at 14:05
  • I haven't seen your entire algorithm but I wouldn't be too sure. In any case, another alternative is to make a global counter for the loop. The counter starts at 0 when you start your script and is never reinitialized. Update the counter inside the loop. See how many times the counter was used for each test. – armatita Mar 23 '16 at 14:16
  • @armatita I tested your proposal in another way: In each iteration I checked if the population size is still the same, and it actually stays the same, so there are no inhabitants that are removed. – Bernd Strehl Mar 23 '16 at 14:30
  • @armatita I also set up a counter. Regardless of the mutation rate it's 10,000 at the end (with 10,000 generations). – Bernd Strehl Mar 23 '16 at 14:35
  • Where did you update the counter? It should depend on the number of inhabitants as well as the number of generations. The time is being spent in something and calling i.mutate() can't be faster than doing nothing. – armatita Mar 23 '16 at 14:37
  • @armatita I now set up a counter that counts the mutations. And as expected it's ~5 times higher with a mutation rate 5 times higher. Maybe it's something else. The next thing I'll try is to reverse the test order. – Bernd Strehl Mar 23 '16 at 14:41
  • Perhaps the evaluation function takes longer to evaluate in some cases than others and that this is what you are seeing. Perhaps you can profile runs with different parameters and see what part of the code is taking up the extra cpu cycles for the smaller mutation rates. See https://docs.python.org/2/library/profile.html – John Coleman Mar 23 '16 at 16:41

2 Answers


It is impossible to know without seeing your full code, but the following seems plausible:

When the mutation rate is lower, then after a few generations the population becomes much more homogeneous than it does when the mutation rate is higher. Under the assumption that you are using some variation of roulette-wheel sampling, a more homogeneous population means that each "spin" of the roulette wheel takes longer on average than it does with a more varied population (where relatively few members dominate the distribution of fitnesses and hence tend to be selected after scanning over fewer members of the population).
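If selection is implemented as a linear cumulative-sum scan over the population (an assumption; the question does not show the selection code), a minimal sketch looks like this — the scan returns as soon as the running total passes the spin point, so a member holding a large slice of the wheel tends to be found after fewer steps:

```python
from random import uniform

def roulette_select(fitnesses):
    """Pick an index with probability proportional to its fitness
    via a linear scan of the cumulative fitness."""
    total = sum(fitnesses)
    spin = uniform(0, total)
    cumulative = 0.0
    for i, f in enumerate(fitnesses):
        cumulative += f
        if cumulative >= spin:
            return i  # scanned i + 1 members before stopping
    return len(fitnesses) - 1  # guard against floating-point round-off
```

Note this sketch assumes higher fitness is better; since the question minimizes fitness, an actual implementation would have to invert the values first.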

To be more certain, you could use a profiling tool such as cProfile to see exactly where those CPU cycles are going.
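A sketch of such a profiling run (run_simulation is a hypothetical stand-in for the actual GA entry point):

```python
import cProfile
import io
import pstats

def run_simulation():
    """Stand-in for the GA run being profiled (hypothetical)."""
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
run_simulation()
profiler.disable()

buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(10)
report = buf.getvalue()  # per-function timings, sorted by cumulative time
```

Comparing reports for, say, p = 0.05 and p = 0.25 should show which function accounts for the extra time at the lower rate.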

John Coleman

Each cycle i of mutations has a probability p_i of providing an acceptable solution, and the time it takes to evaluate a cycle is T_i. Both p_i and T_i increase with the mutation rate. The expected running time of your algorithm is thus a sum Σ p_i·T_i over the expected number of cycles required to find the answer. A higher mutation rate increases the size of each term but decreases the number of terms to sum. There is an optimal mutation rate that minimizes this sum.

chepner
  • It could be that I don't understand the answer fully, but the expected number of cycles doesn't decrease, since it's determined by the generation parameter, which states how many times crossover and mutation are performed. – Bernd Strehl Mar 23 '16 at 13:30
  • @chepner how do you know mutation decreases the number of terms to sum? I'm not seeing anything pointing to that. – armatita Mar 23 '16 at 13:33
  • Ah, I missed that the generation count was fixed; I was assuming there was a threshold for determining when a solution was good enough. – chepner Mar 23 '16 at 13:43
  • I think he's changing self.inhabitants somewhere in his mutate functions, making the initial loop shorter. But I'm not sure if that's the purpose of the algorithm. If it is, that is the explanation; otherwise this could be a bug in the procedure. – armatita Mar 23 '16 at 13:47