1

This is mostly a consulting question.

I've developed a genetic algorithm to solve the TSP, long story short, I have written two different codes the and i used this as my dataset. the programs' found solution are shown below.

prog#1 solution

prog#2 solution

As it is obvious the suggested solution from the first program(Prog# 1) is much more promising than the second one(Prog# 2) to be optimized; and the solution from Prog# 2 seems to be a random solution most likely.

BUT the cost of Prog# 1 as i have calculated is 97314.36 and the cost of Prog# 2 is 74635.31 which is almost 20K smaller than the cost of solution from Prog# 1, and as costs are suggesting the solution found by Prog# 2 SHOULD BE much more optimized than the first solution.

Question

1) Why on earth the path plot of solution found by Prog# 2 does not support (visually) the calculated cost value?

2) Considering the blow scripts, is there anything i have missed?


For completeness i post the script i used to plot and calculating the cost value.

function tsp_plot_output() 
    close all;
    disp('loading data....');
    x=load('out.res'); 
    dist = 0;
    for i=1:2:size(x, 1)-1
        dist = dist + norm(x(i,:) - x(i+1,:), 2);
    end
    dist = dist + norm(x(1,:) - x(end,:), 2);
    fprintf('COST IS: %.2f\n', dist);
    disp('ploting data....');
    xx = x(:,1); xy = x(1:size(xx, 1),2);
    zxx = x(:,1); zxy = x(1:size(zxx),2);
    plot(xx, xy), title('Found TSP solution');
    hold
    plot(zxx, zxy, 'r.');
end

The code i used in Prog# 1 to pour out the solution is

std::ofstream os(outputfile);
BOOST_FOREACH(size_t city, *best->_genes) {
    auto loc = _data->get(city);
    os<<loc.longitude<<" "<<loc.latitude<<endl;
}
os.close(); 

And same code in Prog# 2

ofstream out(jconfig["output-file"].as_string());
for(int i = 0; i < p->lchrom; i++) {
    city c = data.at(best_found->chrom[i]);
    out<<c.x<<" "<<c.y<<endl;
}
out.close();
dariush
  • 3,191
  • 3
  • 24
  • 43
  • What do you mean by you have calculated. Which are you sure of: the visualization, or the cost, or neither. – luk32 Apr 09 '15 at 11:40
  • Cost simply is just sum of point-to-point euclidean distance, but there should be some reasonable relation between visualization and cost. which I don't see that in `Prog# 2 `'s visualization. – dariush Apr 09 '15 at 11:50
  • Also I don't think this is a very good question because the answer to 1) is: because there is a bug in either visualization or plotter. And the codes you show are 2x c++ snippets for printing data which do not tell anything beyond that they output two columns of data (there can be tons of problems, but I doubt there are any if output look resonable and program doesn't crash), and one unspecified script for plotting. You tagged it `c++` and c++ codes you show absolutely do not help in solving your issue. So it is very hard to tell if you missed something. – luk32 Apr 09 '15 at 11:50
  • I use matlab/octave to plot, so i don't think there is a bug in their plot :) and beyond the snippet `C++` code i have send is just genetic algorithm which has nothing to do with the outputting the result – dariush Apr 09 '15 at 11:54
  • I figured what is the meaning of cost. It is also obvious there should be a relation. But you said *you* calculated the value. I highly doubt you calculated ~100 distances. You have bug either in distance calculation or visualization/plotting. What makes you think there is a relation for Prog#1, maybe it should be 50k? I doubt there is a bug in their plot. From what plotting and distance calculation is done in `tsp_plot_output`. My bet is you have bug there. If it's octave, maybe you should ask question with `octave` tag whether your script does what you think it does. I'm not good enough =) – luk32 Apr 09 '15 at 12:00
  • You could also try smaller cases which you can check by hand, of course. Cheers and good luck =). – luk32 Apr 09 '15 at 12:03
  • Of course i calculated ~1000 distances, for 10000 solution for 2000 times :D that is the point of genetic algorithm. The same cost function i used to calculate the distance in `Prog# 1` i used in `Prog# 2` and is being used in the `tsp_plot_output` script, and the cost results are match with each other. – dariush Apr 09 '15 at 12:06

1 Answers1

1

Your distance calculation im MATLAB is wrong. You have:

dist = 0;
for i=1:2:size(x, 1)-1
    dist = dist + norm(x(i,:) - x(i+1,:), 2);
end
dist = dist + norm(x(1,:) - x(end,:), 2);

With for i=1:2:size(x,1)-1 you start at i=1, and then add 2 in each step until you reach size(x,1)-1. Therefore, you add the distances from 1-2, then from 3-4, and so on. Of course it should be from 1-2, then 2-3 and so on. This is achieved by

dist = 0;
for k=1:size(x,1)-1
    dist = dist + norm(x(k+1,:) - x(k,:),2);
end
dist = dist + norm(x(end,:) - x(1,:),2);

For an example x = [0,0; 1,1; 1,0] the old routine returned 2.4142, while the corrected one returns the correct sqrt(2) + 1 + 1 = 3.4142.

PS: I changed the running variable to k, as in MATLAB i stands for the imaginary unit (see this question for details). I also changed the order of the x's inside the norm. Of course yours wasn't wrong, but that way it is clear that you take the vector from the current point k to the next point k+1 and not the other direction.

Community
  • 1
  • 1
hbaderts
  • 14,136
  • 4
  • 41
  • 48
  • +1 you are right!! wonder how i came to this obvious mistake! I am running my program again with my cost function fixed, it would take an hour or two to complete, I'll get back to you about the result. – dariush Apr 09 '15 at 17:28
  • Yup, that was the problem, the new result confirms that, cheers:) – dariush Apr 09 '15 at 20:01