5

I'm currently working on a homework assignment to implement the Bellman-Ford algorithm. So far, I've managed to read in the provided graph, place it into a vector (using a 1d vector to represent a 2d one with row-major order) to use as a matrix. I'm using a struct that keeps track of the weight of the edge, a boolean for whether or not it's infinity (no edge exists) and a variable to keep track of the preceeding node.

What I'm stumped by is actually implementing the dang algorithm. I've read the pseudocode from http://en.wikipedia.org/wiki/Bellman%E2%80%93Ford_algorithm but I'm having a difficult time grasping how to use the algorithm. The first 80 lines are reading in the file, initializing the vector, tossing the values into the vector in the right place. Below that is what I've started implementing for the algorithm. I do have a few specific questions.

1) In all the explanations of the algorithm I've found, you work the algorithm # of nodes - 1 times. In a few of the implementations of this I've looked at, i is tended to start at 1. Why is this? Further, with my implementation, does that still make sense?

2) Further in the wikipedia pseudocode, it says to check each edge u,v, with u being the start vertex and v being the end vertex. In my matrix, as near as I can understand that would mean I need to check the weight/value of each row,column pair and see if there's a better path. I'm...not sure if I'm explaining that correctly or even understanding it as this point. Any advice/guidance/hints/demonstrations would be greatly appreciated. Source code followed by a paste of the instructor's demo input is below.

#include <fstream>
#include <iostream>
#include <iomanip>
#include <vector>

using namespace std;

struct graphNode
{
    int value; //Weight of the edge
    bool isInfinity; //Boolean to flag an edge as infinity
    int pred; //predecessor node
};

// Code for reading inputfile cribbed and modified from http://stackoverflow.com/questions/7651243/c-read-a-file-name-from-the-command-line-and-utilize-it-in-my-file
int main(int argc, char** argv) 
{
    ifstream input; // ifstream for the input
    string inFile = ""; //name of the input file
    int row; //Variable to keep track of what row we're inputting data for
    int col; //Variable to keep track of a column thingie, expand on this later
    int infinity = 99999999;
    int nodeCount; //Number of nodes from input file
    int edgeCount = 0; //Number of edges from the input file

    vector<graphNode> edgeList; //2D list of edges, order is a->b
    edgeList.push_back(graphNode());
    edgeList[0].value = 0;
    edgeList[0].isInfinity = false;
    edgeList[0].pred = -1;

    if( argc == 2 ) 
    {
        inFile = argv[1];
    }
    else 
    {
        cout << "Usage: ./a.out inputFile\n";
        return 1;
    }

    input.open(inFile.c_str()); // opening the provided file

    if(input.is_open()) // making sure the input is open
    {
        input >> nodeCount; //Grabbing the number of nodes from the first value of the file

        for(int i = 1; i < nodeCount*nodeCount; i++)
        {
            edgeList.push_back(graphNode());
            edgeList[i].value = infinity;
            edgeList[i].isInfinity = true;
            edgeList[i].pred = -1;
        }

        //Putting data from the file into the vector array
        while(!input.eof())
        {
            input >> row; //For each cycle through the list, we grab the first number on the line to get which x value (start vertex) we're working with
            while(input.peek() != '\n' && input.peek() != '\r' && !input.eof())
            {
                input >> col;
                input >> edgeList[((row-1)*nodeCount)+(col-1)].value;
                edgeList[((row-1)*nodeCount)+(col-1)].isInfinity = false;
                edgeList[((row-1)*nodeCount)+(col-1)].pred = row;
                edgeCount++;
            }

        }
        input.close(); //Closing our input file since we don't need it anymore
    }
    else
    {
        cout << "Error, something happened with the input." << endl;
        return 1;
    }

    //for(int i = 0; i < nodeCount - 1; i++)
    //{
        //for(int r = 0; r < nodeCount - 1; r++)
        //{
            //for(int c = 0; r < nodeCount - 1; c++)
            //{
                //if(r == c) continue;
                //if(edgeList[r][c].isInfinity) continue;
                //if(edgeList[i][r] + edgeList[r][c] < edgeList[c][i])
}

Demo input:

10
3 6 4 9 0 7 8
8 5 3 7 3 4 -2
5 10 2 8 1 4 1
2 6 -3 1 3 7 1
1 10 -1 2 2 4 -2
10 9 -3 1 3 7 2 5 1
7 3 0 10 1 2 1 8 2
9 6 6 3 4 10 7
4 8 5 1 9 5 6
6 2 4 3 0 9 0
Ron Holcomb
  • 101
  • 4
  • 10
  • 1
    Only slightly related: Change this: `while(!input.eof())` to this `while (input >> row)` and remove the extraction immediately inside the loop. You could likewise refine this to be significantly cleaner using `std::getline()` and an intermediate `istringstream` for that handling loop per row. – WhozCraig Apr 16 '13 at 22:55
  • For what it's worth, here's a link to my Python implementation, maybe it will help: https://github.com/ezod/hypergraph/blob/master/hypergraph/path.py#L58 – ezod Apr 16 '13 at 23:27

2 Answers2

2

For #1, you're examining the edges between nodes such that the longest path cannot be more than nodeCount-1 edges. For example, if nodeCount==1, then there is no need to examine any edges.

For #2, you have interesting data structures. It appears you need different ones. Your 'graphNode' ought to be called a 'graphEdge' but without the 'pred' variable. Declare two new structures:

vector<int> predecessor;
vector<int> distance;

Each is of size nodeCount. The 'w' that you see in Wikipedia is your edgeList.

In the last section that you commented out, the outer loop is iterating nodeCount times. It ought to be nodeCount-1, but no harm. Your indexing later is off however since you've got a one dimensional edgeList that you're treating as two dimensional. You can access the correct edge via edgeList[(r-1)*nodeCount + c].

Not sure if this counts as an answer, but it's a start.

dan
  • 982
  • 8
  • 24
2

Check out the short videos on Bellman-Ford algorithm here. I think it may help you understand the algorithm better: https://class.coursera.org/algo2-2012-001/lecture/preview

Basically the main idea behind Bellman-Ford is this:

To find a shortest path between 2 nodes, say s and t, you iteratively find the shortest path to get from s to t, and in each subsequent iteration, allow the algorithm to use 1 more edge in the path than in the previous iteration.

At any particular iteration k, where you now allow the algorithm to use at most k edges in a path, the shortest path between s and t could either

  1. improve by using exactly k edges or
  2. take the same value as that found from the previous iteration i.e. usea at most (k - 1) edges.

Thus, in a particular iteration k:

Let d[ k ][ t ] be the distance from s to a node t using at most k edges. Then:

d[ k ][ t ] = min{ 
d[k - 1][ t ], # Case 2
for all edges (u, t) in graph min d[k - 1][ u ] + c[ u ][ t ] # Case 1
}

where the second part of the min{ ., .} equation above simply finds the shortest path between s to any neighbor u of the final destination t and add the cost of the edge c[ u ][ t ] to get from u to t, thereby requiring exactly k edges.

The pseudocode thus looks like this:

d[s][0] = 0 # The cost from s to itself requiring zero edges is 0.
d[u][0] = INFINITY for all other nodes other than s (i.e. you cannot reach them with no edges).

Let n = number of nodes in graph
for k = 1 to n - 1: # Allow one more edge in path in each iteration
  for every edge (u, v) in edges of graph: # Check whether can get a better cost to v using k edges by going through node u or k - 1 edges is enough.
    d[ v ][ k ] = min{ d[k - 1][ v ], d[k - 1][ u ] + c[ u ][ v ] }

To answer part 1 of your question, consider that the outer loop of the equation loops over the number of edges. It increases from 1 to n - 1, where n is the number of nodes in the graph. The reason it is up to n - 1 is that the maximum number of edges you can have in a path is (n - 1) edges (i.e. you end up connecting all n nodes to get from s to t, with n - 1 edges between them).

To answer part 2 of your question, you only need to consider the incoming nodes to every destination node and check if a path to one of those nodes using k - 1 edges PLUS the cost from that node to the destination node cost less than before, as I have explained above.

Lastly, to check for negative cycle (the last part in the wiki code), you run the algorithm for 1 more iteration. We know that any path can use at most n - 1 edges. Any more would be redundant. So if the cost of any path decreases when you allow to use 1 more edges, it must have included a negative cycle because that is the only way its cost could have decreased. Any non-negative cycle would have resulted in the same or greater cost as it used more edges.

I seriously recommend watching Tim Roughgarden's video in the link above. Note that his treatment is slightly different from the pseudocode in wikipedia but the idea is essentially the same.

lightalchemist
  • 10,031
  • 4
  • 47
  • 55