Heuristic for A*-Algorithm with irregular distances between nodes

Question

I am currently working on an implementation of the A* Algorithm with irregular distances between two nodes. The graph containing the nodes is a directed and weighted graph. Every node is connected to at least one other node, there may also be symmetrical connections with different distances. A node is nothing more than a label and doesn't contain any special information

What I need is a heuristic to determine the shortest path from any node A to another node B as accurate as possible. I tried to use a heuristic that returns the distance to the nearest neighbor of a node, but of course that wasn't as effective as no heuristic at all (= Dijkstra).

My implementation of the A* Algorithm consists mainly of 2 classes, the class for the algorithm itself (AStar) and one for the nodes (Node). The code is heavily based on the Wikipedia pseudocode.

Source code of `AStar.java`

public class AStar {
    private AStar() {}

    private static Node[] reconstructPath(Map<Node, Node> paths, Node current) {
        List<Node> path = new ArrayList<Node>();
        path.add(0, current);
        while (paths.containsKey(current)) {
            current = paths.get(current);
            path.add(0, current);
        }
        return path.toArray(new Node[0]);
    }

    public static Node[] calculate(Node start, Node target, IHeuristic heuristic) {
        List<Node> closed = new ArrayList<Node>();
        PriorityQueue<Node> open = new PriorityQueue<Node>();
        Map<Node, Double> g_score = new HashMap<Node, Double>();
        Map<Node, Double> f_score = new HashMap<Node, Double>();
        Map<Node, Node> paths = new HashMap<Node, Node>();

        g_score.put(start, 0d);
        f_score.put(start, g_score.get(start) + heuristic.estimateDistance(start, target));
        open.set(start, f_score.get(start));

        while (!open.isEmpty()) {
            Node current = null;

            // find the node with lowest f_score value
            double min_f_score = Double.POSITIVE_INFINITY;
            for (Entry<Node, Double> entry : f_score.entrySet()) {
                if (!closed.contains(entry.getKey()) && entry.getValue() < min_f_score) {
                    min_f_score = entry.getValue();
                    current = entry.getKey();
                }
            }

            if (current.equals(target)) return reconstructPath(paths, target);

            open.remove(current);
            closed.add(current);

            for (Node neighbor : current.getAdjacentNodes()) {
                if (closed.contains(neighbor)) {
                    continue;
                }
                double tentative_g_score = g_score.get(current) + current.getDistance(neighbor);

                if (!open.contains(neighbor) || tentative_g_score < g_score.get(neighbor)) {
                    paths.put(neighbor, current);
                    g_score.put(neighbor, tentative_g_score);
                    f_score.put(neighbor, g_score.get(neighbor) + heuristic.estimateDistance(neighbor, target));
                    if (!open.contains(neighbor)) {
                        open.set(neighbor, f_score.get(neighbor));
                    }
                }
            }
        }
        throw new RuntimeException("no path between " + start + " and " + target);
    }
}

Source code of `Node.java`

public class Node {
    private Map<Node, Double> distances = new HashMap<Node, Double>();

    public final String       name;

    public Node(String name) {
        this.name = name;
    }

    public Set<Node> getAdjacentNodes() {
        return Collections.unmodifiableSet(distances.keySet());
    }

    public double getDistance(Node node) {
        return distances.get(node);
    }

    public void setDistance(Node node, double distance) {
        distances.put(node, distance);
    }

    @Override
    public String toString() {
        return (name == null ? "Node@" + Integer.toHexString(hashCode()) : name);
    }
}

Source code of `PriorityQueue.java`

public class PriorityQueue<T> {
    transient ArrayList<PriorityEntry<T>> elements     = null;

    private static final int              DEFAULT_SIZE = 10;

    public PriorityQueue() {
        elements = new ArrayList<PriorityEntry<T>>(DEFAULT_SIZE);
    }

    public PriorityQueue(int initialCapacity) {
        elements = new ArrayList<PriorityEntry<T>>(initialCapacity);
    }

    public boolean push(T element, double priority) {
        PriorityEntry<T> entry = new PriorityEntry<T>(element, priority);
        if (elements.contains(entry)) return false;
        elements.add(entry);
        elements.sort(null);
        return true;
    }

    public void set(T element, double priority) {
        PriorityEntry<T> entry = new PriorityEntry<T>(element, priority);
        int index = elements.indexOf(entry);
        if (index >= 0) {
            elements.get(index).setPriority(priority);
        } else {
            elements.add(entry);
        }
        elements.sort(null);
    }

    public T peek() {
        return size() <= 0 ? null : elements.get(0).getValue();
    }

    public T pop() {
        return size() <= 0 ? null : elements.remove(0).getValue();
    }

    public boolean remove(T element) {
        return elements.remove(new PriorityEntry<T>(element, 0));
    }

    public int size() {
        return elements.size();
    }

    public boolean isEmpty() {
        return elements.isEmpty();
    }

    public boolean contains(T element) {
        return elements.contains(new PriorityEntry<T>(element, 0));
    }

    private class PriorityEntry<E> implements Comparable<PriorityEntry<? extends T>> {
        private final E value;
        private double  priority = Double.MIN_VALUE;

        public PriorityEntry(E value, double priority) {
            this.value = value;
            this.priority = priority;
        }

        public E getValue() {
            return value;
        }

        public double getPriority() {
            return priority;
        }

        public void setPriority(double priority) {
            this.priority = priority;
        }

        @Override
        @SuppressWarnings("unchecked")
        public boolean equals(Object o) {
            if (!(o instanceof PriorityEntry)) return false;
            PriorityEntry<?> entry = (PriorityEntry<?>) o;
            return value.equals(entry);
        }

        @Override
        public int compareTo(PriorityEntry<? extends T> entry) {
            return (int) (getPriority() - entry.getPriority());
        }
    }
}

@Codor "irregular distances" as in the distance between two nodes is not always the same. — mezzodrinker, Dec 02 '14 at 12:10
A* is sensitive to the used heuristic. Are you using an [admissible](http://en.wikipedia.org/wiki/Admissible_heuristic) one? — kiheru, Dec 02 '14 at 12:16
@flashdrive2049 If what you mean is "The distance from A to B may not be the same as the distance from B to A", then that should pose no problem to A*. The most trivial example of this is a graph that is not bidirectional - the "absent" edges can be modeled as edges with infinite weight. You may want to explain you heuristic function better to get help. — Ordous, Dec 02 '14 at 12:24
@Ordous No, that's not it. The distance from A to B might be different to the distance from C to D. — mezzodrinker, Dec 02 '14 at 12:30
@flashdrive2049 Oh, that's even easier then. In my experience that's the standard type of graph, with the unweighted one being a special case. — Ordous, Dec 02 '14 at 12:33
Anyway, your code seems to not compile - for example PriorityQueue does not have a set method which you use on open — Ordous, Dec 02 '14 at 12:50
@Ordous Oh, sorry. I forgot to include the code of PriorityQueue, which I implemented myself. I'll add it to the question right away. — mezzodrinker, Dec 02 '14 at 12:55
@flashdrive2049 Interesting that you chose to implement a custom `PriorityQueue`, yet the only methods you seem to invoke on it is `set`, `contains` `remove` and `isEmpty`, all of which can be done with a simple `Set`. Still, after a bit of pondering the code *seems* reasonable. As kiheru already said - the main suspect here is the heuristic. — Ordous, Dec 02 '14 at 13:12
@kiheru and Ordous: Well, the problem is: I don't have a heuristic. I tried using the Manhattan distance (which doesn't work in this case, I don't have meaningful coordinates as in the actual distance may be greater or less than the distance of points) and the nearest neighbor heuristic (which is pretty inaccurate, as stated in the question). I (1) don't know any other heuristics and (2) don't know how to implement other heuristics. — mezzodrinker, Dec 02 '14 at 13:24
@flashdrive2049 So your intended heuristic is that the algorithm should visit the closest neighbor first? I.e. - the heuristic always returns 0? — Ordous, Dec 02 '14 at 13:40
@Ordous No, I intend to find the shortest way from A to B, where ever they might be. — mezzodrinker, Dec 02 '14 at 13:43
Not being to have a reasonable heuristic is a problem: an overestimating heuristic means that the algorithm can't be guaranteed to find the *shortest* path. If there's no way to come up with such an heuristic, you'll need to use another algorithm — kiheru, Dec 02 '14 at 14:00
@flashdrive2049 So what is *the actual form* of your heuristic? What does `heuristic.estimateDistance(A, B)` return? Googling "nearest neighbor heuristic" doesn't give any meaningful results, at least not on the first page. If your heuristic is wrong (note - not *bad*, but wrong - i.e. not admissible), then you will *not* end up with the optimal solution, but rather *some* solution. The trivial heuristic is no heuristic that makes A* behave like Dijkstra - is the one that returns 0 for any input. What does *yours* return? — Ordous, Dec 02 '14 at 15:19
@Ordous _My_ heuristic is an interface. ```IHeuristic.estimateDistance(Node, Node)``` returns a ```double``` value that represents the estimated distance from a node (A) to the target node (B). And, as I said, I didn't have a real heuristic. The Manhattan heuristic didn't work out and returning the distance to the nearest neighbor of A didn't work out either. I'm out of ideas concerning other heuristics. — mezzodrinker, Dec 02 '14 at 16:03
@kiheru Well, yeah. You're right. If I don't find a suitable heuristic, I will have to resort to Dijkstra. No way around that. — mezzodrinker, Dec 02 '14 at 16:04
@flashdrive2049 So your question boils down to: How does one create an admissible heuristic for an arbitrary graph? — Ordous, Dec 02 '14 at 16:52
@Ordous Kind of. I've never been good at asking questions well. Thanks. — mezzodrinker, Dec 02 '14 at 17:06
@flashdrive2049 Then most of the text in the question is not relevant :) Googling the above will give you 2 questions with answers: One on Programmers.SE with an answer by me: [Shortest Path Between Two Nodes in a +10 Million Nodes Graph](http://programmers.stackexchange.com/questions/261163/shortest-path-between-two-nodes-in-a-10-million-nodes-graph) and one on SO: [What algorithms compute directions from point A to point B on a map?](http://stackoverflow.com/questions/430142/what-algorithms-compute-directions-from-point-a-to-point-b-on-a-map). Both of these propose versions of heuristics. — Ordous, Dec 02 '14 at 17:10
@flashdrive2049 While I do enjoy getting my internet points, I think it's best to reformulate this question and mark it as duplicate. If any of the answers linked helped you - feel free to upvote them (since one is mine I'd get my fair share of rep anyway) — Ordous, Dec 02 '14 at 17:19
http://stackoverflow.com/questions/39256309/calculate-the-shortest-route/39256428#39256428 — Alex Filatov, Aug 31 '16 at 21:38

score 1 · Answer 1 · answered Dec 15 '14 at 09:37

Before trying to define a heuristic function for your problem, consider that in many cases, using a poor (or incorrect) estimation of the cost to the goal is as self-defeating as not using an heuristic at all.

In the case of a graph with weighted arcs, you need to consider if there is some information in the nodes which can lead to obtain the heuristic values (for example, if your nodes are cities, a good estimation can be the lenght of the straight line between them; or if your nodes are arrays, a similarity measurement between them). If your nodes are only labels and there is no information that you can use to obtain your heuristic values, maybe the best solution is not using a heuristic at all. This is not the worst scenario for most problems of this type. It is better to use a Dijkstra search (which is the same of A* but using heuristic=0), letting the algorithm expand the nodes based on the cost from the start, than using a bad heuristic which is not consistent, because in this situation you might be expanding unncecesary nodes that seem to be promising based on a bad estimation of the cost to the goal.

I don't know how big are your graphs, but for most problems there is not a significant difference in computation time between using a heuristic and don't using it at all. Specially in the case of a bad heuristic.

I can see that you have your own implementation of A*. I recommend you to take a look of an heuristic search library like Hipster. This library allows you to define your graph and test different search algorithms to know the best one that fits your problem. Some code examples describe exactly your case: seach in weighted directed graphs. It might be useful for your problem.

I hope my answer helps. Regards,

score 1 · Answer 2 · answered Dec 15 '14 at 11:01

Without going into other possible problems I would like to point out the main problem - you lack the plane. In case of shortest distance problem between cities, you have

node - city
weight - numerical value describing cost to get from city a to city b
plane - describes environment ex: city position (in your square grid)

From plane you can extrapolate meaningful heuristic. For example you can make assumption from city position like city with lowest arithmetical distance should be looked first.

If you do not have a plane you do not have any means to predict a meaningful heuristic. A* will still work, but it hardly differs from exhaustive search. You can create plane from the weight, but it .

Ex:

Iterate over weights and find the weight quantiles 20/60/20 - now you have a relative plane

- good weight threshold (lowest 20%) 
- average weight threshold (middle 60%)
- bad weight threshold (highest 20%)

With priority queue your algorithm would pick good moves, then average and finally bad ones. If you want you can have more then 3 segments.

Just as a reminder: A* returns good enough result fast. Using exhaustive search you can find the best solution, but it would become exponentially slower if problem size grows.

score 0 · Answer 3 · answered Dec 02 '14 at 14:22

0

To add to @kiheru comment above. Your solution will only be as good as the heuristic provided.

If the following line and the heuristic.estimate, has too narrow of a scope. The algorithm will quickly reach a local minimum. Alternatively, if the heuristic isn't admissible the algorithm will result in either no solution or an incorrect random solution.

    f_score.put(start, g_score.get(start) + heuristic.estimateDistance(start, target));

Take a close look at your heuristic and confirm it's admissible. If it is admissible, it may need to be improved in order to provide a more accurate estimate.

answered Dec 02 '14 at 14:22

agreene

1
3

I know that the search algorithm is just as good as the heuristic provided. What I don't understand is why the line you quoted has a too narrow scope. – mezzodrinker Dec 02 '14 at 14:36
Sorry for the confusion. I don't know that the line has too narrow of a scope. My intention, was to point out the following. As I understood the question, you are getting poor results from the algorithm. Since most of your code follows the Pseudo on Wikipedia, the issue likely resides in the heuristic function. If the heuristic is giving bad/poor results, A* will not provide an optimal solution. – agreene Dec 02 '14 at 14:57
I know that the problem is the heuristic. The only thing I need is actually the heuristic. The closest neighbor heuristic (see comments on the question) didn't work out well (no surprise) and the Manhattan heuristic doesn't work at all because (see comments on BrendanM's answer) the coordinates of my Node (which were removed by now) are in no relation to the actual distance between two nodes. – mezzodrinker Dec 02 '14 at 16:12

score 0 · Answer 4 · answered Dec 02 '14 at 14:36

0

In the case of your node class it seems to have an X and Y if these represent the node's position in a 2D space maybe you could use a heuristic based on the straight line distance between the nodes calculated from the X and Y values.

answered Dec 02 '14 at 14:36

BrendanM

383
3
14

You're right, the node contains 2D coordinates. This is still from a version where I used the Nodes to represent a square in a square grid, I should probably remove that to make it more clear. The problem about calculating the distance by coordinate is that the distance between two nodes has _nothing_ to do with their relative coordinates. – mezzodrinker Dec 02 '14 at 16:06
Ah ok probably shouldnt have assumed they related to the node position – BrendanM Dec 02 '14 at 16:11

Heuristic for A*-Algorithm with irregular distances between nodes

Source code of AStar.java

Source code of Node.java

Source code of PriorityQueue.java

4 Answers4

Source code of `AStar.java`

Source code of `Node.java`

Source code of `PriorityQueue.java`