2

I'm trying to create an A* pathfinding algorithm, however I'm having a little trouble getting off the ground with it. A little background:

I am by no means versed in pathfinding algorithms, however I did touch upon this subject a couple years ago (I've since forgotten everything I've learned). I play EVE Online, which is an online game about internet spaceships. The developers release data dumps for static information (items in game, solar system locations, etc). I am trying to find the shortest route from solar system A to solar system B.

Take a look at this map: http://evemaps.dotlan.net/map/UUA-F4 That is one region in the game, with each node being a system. I would like to compute the shortest distance between any two of those systems.

My issue: everything that I've read online about A* is talking about incorporating the distance between two nodes (for example, the distance between two cities) to help compute the shortest path. That doesn't help my case, as I'm more interested in the number of hops (node 1 > node 2 > node 3) rather than the distance between those hops. I do not know how to modify the A* algorithm to incorporate this.

The information that I have in the database: A list of all systems and their neighbors (so, systemX links with systemA and systemB) x, y, and z coordinates of all systems in a 3D grid

If anyone can point me in the right direction, that would be great. I'm looking to use this in PHP, however I've also started to work in Python a bit so that'll work too.

Example data can be provided on request if needed.

EDIT

As some have pointed out, the 'cost' associated with each jump would simply be 1. However, with A*, you also need a heuristic that estimates the distance from the current node to the target node. I'm not exactly sure how to go about determining this value, as I'm not sure of the remaining hops. As stated, I do have the 3D coordinates (x,y,z) for every node, but I'm not sure if this could give any insight as the physical distance between each node is not of concern. I do know that no path spans more than 99 hops.

EDIT 2

MySQL data for the example region.

to -> from data: http://pastebin.com/gTuwdr7h

System information (x,y,z cooridinates if needed): http://pastebin.com/Vz3FD3Kz

blitzmann
  • 7,319
  • 5
  • 23
  • 29

2 Answers2

4

If the number of "hops" is what matters to you, then consider that to be your distance, meaning that if the two locations are connected by a single hop, the distance is one.

For A*, you'll need two things:

  1. The costs from one location to each neighbor, in your case, this seems to be constant (hops).

  2. An heuristic, that estimates the cost of going from your current "node" or location to the goal. How you can estimate this depends a lot on your problem. It's important that your heuristic doesn't *over*estimates the true cost, or else A* won't be able to guarantee the best result.

pcalcao
  • 15,789
  • 1
  • 44
  • 64
  • It'll be hard to define a heuristic though.. If you fill in 0 or 1, you end up with Dijkstra's algorithm instead. – flup Mar 29 '13 at 00:21
  • flup, I think that's one of the things I was confused about. If distance between all nodes is 1, how can I use A*? Could I possibly use 2 as a heuristic? It's kinda over my head, but the algorithm will obviously check if both nodes are the same or neighbors before going into a search loop, and thus "2" is the minimum distance that the A* will have to deal with... – blitzmann Mar 29 '13 at 00:33
  • Using 2 is basically the same as using 0 as the heuristic, since it's constant it won't help your algorithm perform any faster. The objective of an heuristic is to provide an aproximation allowing A* to choose a path that it predicts can reach the goal faster. – pcalcao Apr 01 '13 at 09:28
2

Take the upper part of the linked graph:

enter image description here

Assume that the lines represent 2 way (i.e., you can go to or from any linked node) and that the black lines are a 'cost' of 1 and the red lines are a 'cost' of 2.

That structure can be represented by the following Python data structure:

graph = {'Q-KCK3': {'3C-261':1, 'L-SDU7':1},
         'L-SDU7': {'Q-KCK3':1, '3C-261':1,'4-IPWK':1},
         '3C-261': {'4-IPWK':1,'9K-VDI':1,'L-SDU7':1,'U8MM-3':1},
         'U8MM-3': {'9K-VDI':1,'3C-261':1, '9K-VDI':1, 'Q8T-MC':2},
         'Q8T-MC': {'U8MM-3':2, 'H55-2R':1, 'VM-QFU':2},
         'H55-2R': {'Q8T-MC':1, '9XI-OX':1, 'A3-PAT':1, 'P6-DBM':1},
         'P6-DBM': {'A3-PAT':1, 'H55-2R':1},
         'A3-PAT': {'P6-DBM':1, 'H55-2R':1, '9XI-OX':1,'YRZ-E4':1},
         'YRZ-E4': {'A3-PAT':1}, 
         'VM-QFU': {'IEZW-V':1, 'PU-128':2},
         'IEZW-V': {'VM-QFU':1, 'PU-128':1, 'B-DX09':1},
         'PU-128': {'VM-QFU':1, 'B-DX09':1, 'IEZW-V':1},
         'B-DX09': {'IEZW-V':1, 'PU-128':1, '1TS-WIN':1},
         '1TS-WIN': {'B-DX09':1, '16-31U':1},
         '16-31U': {'1TS-WIN':1}
        }

Now you can define a recursive function to navigate that data:

def find_all_paths(graph, start, end, path=[]):
        path = path + [start]
        if start == end:
            return [path]
        if start not in graph:
            return []
        paths = []
        for node in graph[start]:
            if node not in path:
                newpaths = find_all_paths(graph, node, end, path)
                for newpath in newpaths:
                    paths.append(newpath)
        return paths       

def min_path(graph, start, end):
    paths=find_all_paths(graph,start,end)
    mt=10**99
    mpath=[]
    print '\tAll paths:',paths
    for path in paths:
        t=sum(graph[i][j] for i,j in zip(path,path[1::]))
        print '\t\tevaluating:',path, t
        if t<mt: 
            mt=t
            mpath=path

    e1='\n'.join('{}->{}:{}'.format(i,j,graph[i][j]) for i,j in zip(mpath,mpath[1::]))
    e2=str(sum(graph[i][j] for i,j in zip(mpath,mpath[1::])))
    print 'Best path: '+e1+'   Total: '+e2+'\n'  

Now demo:

min_path(graph,'Q-KCK3','A3-PAT')
min_path(graph,'Q-KCK3','16-31U')

Prints:

    All paths: [['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'H55-2R', 'P6-DBM', 'A3-PAT'], ['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'H55-2R', 'A3-PAT'], ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'H55-2R', 'P6-DBM', 'A3-PAT'], ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'H55-2R', 'A3-PAT']]
        evaluating: ['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'H55-2R', 'P6-DBM', 'A3-PAT'] 7
        evaluating: ['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'H55-2R', 'A3-PAT'] 6
        evaluating: ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'H55-2R', 'P6-DBM', 'A3-PAT'] 8
        evaluating: ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'H55-2R', 'A3-PAT'] 7
Best path: Q-KCK3->3C-261:1
3C-261->U8MM-3:1
U8MM-3->Q8T-MC:2
Q8T-MC->H55-2R:1
H55-2R->A3-PAT:1   Total: 6

    All paths: [['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'IEZW-V', 'B-DX09', '1TS-WIN', '16-31U'], ['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'IEZW-V', 'PU-128', 'B-DX09', '1TS-WIN', '16-31U'], ['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'PU-128', 'B-DX09', '1TS-WIN', '16-31U'], ['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'PU-128', 'IEZW-V', 'B-DX09', '1TS-WIN', '16-31U'], ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'IEZW-V', 'B-DX09', '1TS-WIN', '16-31U'], ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'IEZW-V', 'PU-128', 'B-DX09', '1TS-WIN', '16-31U'], ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'PU-128', 'B-DX09', '1TS-WIN', '16-31U'], ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'PU-128', 'IEZW-V', 'B-DX09', '1TS-WIN', '16-31U']]
        evaluating: ['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'IEZW-V', 'B-DX09', '1TS-WIN', '16-31U'] 10
        evaluating: ['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'IEZW-V', 'PU-128', 'B-DX09', '1TS-WIN', '16-31U'] 11
        evaluating: ['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'PU-128', 'B-DX09', '1TS-WIN', '16-31U'] 11
        evaluating: ['Q-KCK3', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'PU-128', 'IEZW-V', 'B-DX09', '1TS-WIN', '16-31U'] 12
        evaluating: ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'IEZW-V', 'B-DX09', '1TS-WIN', '16-31U'] 11
        evaluating: ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'IEZW-V', 'PU-128', 'B-DX09', '1TS-WIN', '16-31U'] 12
        evaluating: ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'PU-128', 'B-DX09', '1TS-WIN', '16-31U'] 12
        evaluating: ['Q-KCK3', 'L-SDU7', '3C-261', 'U8MM-3', 'Q8T-MC', 'VM-QFU', 'PU-128', 'IEZW-V', 'B-DX09', '1TS-WIN', '16-31U'] 13
Best path: Q-KCK3->3C-261:1
3C-261->U8MM-3:1
U8MM-3->Q8T-MC:2
Q8T-MC->VM-QFU:2
VM-QFU->IEZW-V:1
IEZW-V->B-DX09:1
B-DX09->1TS-WIN:1
1TS-WIN->16-31U:1   Total: 10

If you want the minimum number of hops, just modify min_path to return the shortest list length rather than the minimum total cost of the hops. Or, make the cost of each hop 1.

Have a look at my previous answer regarding trains.

Community
  • 1
  • 1
dawg
  • 98,345
  • 23
  • 131
  • 206
  • Thank you for taking the time to actually manually copy part of that image for a working example. Kudos! However, I'm still confused as to how to determine the heuristics. Since each hop (g) costs 1, and I don't have any way to estimating h, what's the best way to determine this? – blitzmann Mar 29 '13 at 01:35
  • The algorithm here is the [Dijkstra's algorithm](http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm) which is similar to [A*](http://en.wikipedia.org/wiki/A*_search_algorithm) Maybe just set the heuristic to 0 as is done [here](http://brandon.sternefamily.net/files/astar.txt)? – dawg Mar 29 '13 at 01:45
  • This was my thought. A* with h=0 is still Dijkstra. Perhaps it's possible that A* is not what I'm looking for then, hence why I had a hard time coming to grips with it. Since I'm not looking for distances between nodes, only least amount of hops, maybe Dijkstra and A* are not what I'm looking for. Perhaps BFS instead... – blitzmann Mar 29 '13 at 01:52