5

BACKGROUND:

I'm trying to learn algorithms and java. Running a grid of 320x320, 100 trials runs about 5 times faster than a non-recursive Quick-Union implementation. However, beyond above a grid of about 400x400 (160,000 sites), I have stack overflow errors.

I understand that java is not optimized for tail-recursion (let alone non-tail recursion). However, I suppose there to be times when a recursive algorithm could be chosen over a non-recursive version because it can run faster and just as safe.

Please keep in mind I'm just learning this stuff, and my code may not be optimal. However, I'm including it to better understand my question.

THE QUESTION

What is the process to evaluate when a recursive algorithm can safely be used in a java application (given that it runs faster than a non-recursive alternative)?

STATS ON RECURSIVE VERSUS UNION-FIND IMPLEMENTATION

(note: The 2x Ratio is simply the previous current run-time divided by the previous run-time)

|-----------|-----------|------------|-------------|-------------|
|     N     | Recursive | Recursive  | Quick-Union | Quick-Union |
|  (sites)  |    time   |  2x Ratio  |    time     |  2x Ratio   |
|===========|===========|============|=============|=============|
|     196   |      35   |            |      42     |             |
|     400   |      25   |    0.71    |      44     |     1.05    |
|     784   |      45   |    1.80    |      46     |     1.05    |
|    1600   |     107   |    2.38    |      86     |     1.87    |
|    3136   |      48   |    0.45    |     113     |     1.31    |
|    6400   |      75   |    1.56    |     303     |     2.68    |
|   12769   |     183   |    2.44    |     858     |     2.83    |
|   25600   |     479   |    2.62    |    2682     |     3.13    |
|   51076   |    1253   |    2.62    |    8521     |     3.18    |
|  102400   |    4730   |    3.77    |   27256     |     3.20    |
|-----------|-----------|------------|-------------|-------------|

THE RECURSIVE CLASS

public class PercolateRecur implements Percolation {
  // the site has been opened for percolation but is not connected
  private final int OPEN = 0;
  // the site is not open for percolation (default state)
  private final int BLOCKED = -1;
  // the matrix that will be percolated. Values default to `BLOCKED = -1`
  // two sites that are connected together share the same value.
  private int[][] matrix;
  // the size of the sides of the matrix (1 to n)
  private int size;
  // whether water can flow from top to bottom of the matrix
  private boolean percolated;

  public PercolateRecur(int N) {
    percolated = false;
    size = N;
    initMatrix();
  }

  /**
   * initializes the matrix to default values
   */
  private void initMatrix() {
    matrix = new int[size+1][size+1];
    // open up the top of the matrix
    for (int x = 1; x < size+1; x++)
      matrix[x][0] = x;

    // set all values in matrix to closed
    for (int x = 1; x < size+1; x++)
      for (int y = 1; y < size+1; y++)
        matrix[x][y] = BLOCKED;
  }

  /**
   * indicates site (x,y) is a valid coordinate
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   * @return boolean
   */
  private boolean isValid(int x, int y) {
    return x > 0 && x < size+1 && y > 0 && y < size+1;
  }

  /**
   * returns value of site above (x,y)
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   * @return int value
   */
  private int above(int x, int y) {
    if (y <= 0)
      return BLOCKED;
    else
      return matrix[x][y-1];
  }

  /**
   * returns value of site below (x,y)
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   * @return int value
   */
  private int below(int x, int y) {
    if (y >= size)
      return BLOCKED;
    else
      return matrix[x][y+1];
  }

  /**
   * returns value of site left of (x,y)
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   * @return int value
   */
  private int left(int x, int y) {
    if (x <= 0)
      return BLOCKED;
    return matrix[x-1][y];
  }

  /**
   * returns value of site right of (x,y)
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   * @return int value
   */
  private int right(int x, int y) {
    if (x >= size)
      return BLOCKED;
    else
      return matrix[x+1][y];
  }

  /**
   * connects (x,y) to open adjacent sites
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   */
  private void connect(int x, int y) {
    if (isFull(x,y))
      return;
    if (above(x,y) > OPEN)
      matrix[x][y] = above(x, y);
    else if (below(x, y) > OPEN)
      matrix[x][y] = below(x, y);
    else if (left(x, y) > OPEN)
      matrix[x][y] = left(x, y);
    else if (right(x, y) > OPEN)
      matrix[x][y] = right(x, y);
    else if (matrix[x][y] == BLOCKED)
      matrix[x][y] = OPEN;
  }

  /**
   * recursively connects open sites in same group as (x,y)
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   */
  private void expand(int x, int y) {
    if (!isFull(x, y))
      return;
    if (above(x,y) == OPEN)
      openWith(x,y-1, matrix[x][y]);
    if (below(x,y) == OPEN)
      openWith(x,y+1, matrix[x][y]);
    if (left(x,y) == OPEN)
      openWith(x-1,y, matrix[x][y]);
    if (right(x,y) == OPEN)
      openWith(x+1,y, matrix[x][y]);
  }

  /**
   * opens a site (x,y) on the matrix
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   */
  public void open(int x, int y) {
    if (percolated || !isValid(x, y))
      return;
    connect(x, y);
    expand(x, y);
  }

  /**
   * opens a site with given value
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   * @param val value of point
   */
  private void openWith(int x, int y, int val) {
    matrix[x][y] = val;
    open(x, y);
  }

  /**
   * Returns whether site (x,y) is open
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   * @return true if not blocked
   */
    public boolean isOpen(int x, int y) {
    return matrix[x][y] > BLOCKED;
  }

  /**
   * Returns whether site (x,y) is full (connected to the top)
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   * @return true if is full
   */
  public boolean isFull(int x, int y) {
    return matrix[x][y] > OPEN;
  }

  /**
   * indicates whether site is blocked (not open)
   * @param x x-portion of x/y coordinate
   * @param y y-portion of x/y coordinate
   * @return true if blocked
   */
  public boolean isBlocked(int x, int y) {
    return matrix[x][y] == BLOCKED;
  }

  /**
   * indicates whether water can flow from top to bottom of matrix
   * @return true if matrix is percolated
   */
  public boolean percolates() {
    for (int x = 1; x <= size; x++)
      if (matrix[x][size] > OPEN)
        percolated = true;
    return percolated;
  }

  /**
   * prints the matrix to the command line
   */
  public void print() {
    for (int y = 1; y < size+1; y++) {
      System.out.println();
      for (int x = 1; x < size+1; x++) {
        if (matrix[x][y] == BLOCKED)
          System.out.print("XX ");
        else if (matrix[x][y] < 10)
          System.out.print(matrix[x][y] + "  ");
        else
          System.out.print(matrix[x][y] + " ");
      }
    }
    System.out.println();
  }
}
Alex
  • 1,470
  • 1
  • 11
  • 28
  • "safely be used" ? what do you mean by that ? – Nir Alfasi Feb 17 '13 at 04:19
  • I can't find the recursion in your 'recursive class'. – us2012 Feb 17 '13 at 04:22
  • @alfasin I mean that a recursive algorithm can potentially create a stack overflow error. – Alex Feb 17 '13 at 04:22
  • 1
    I'm not sure on this either. Is it when the number of recursive calls is not exponentially proportional to n? – kufudo Feb 17 '13 at 04:22
  • @us2012 the recursion is open -> expand -> openWith -> open – Alex Feb 17 '13 at 04:25
  • @Alex My bad, I missed that expand calls openWith. – us2012 Feb 17 '13 at 04:26
  • @kufudo if I run a monte carlo simulation with a grid above about 400x400, then I get a stack overflow error. – Alex Feb 17 '13 at 04:28
  • 1
    you can always increment the stack mem allocation with the -Xss flag – asermax Feb 17 '13 at 05:01
  • @kufudo I don't think it would be exponentially proportional to n. If that were the case, I should see significant slowing. I think it is probably some multiple of N * 59% (given that `open()` will be called N * 59% times plus `openWith()` will be called some multiple of N * 59% times). But I'm not sure how to calculate the number of times (on average) that openWith() would be called. – Alex Feb 17 '13 at 19:07

1 Answers1

3

One of the issues with implementing recursive algorithms in Java is that the Java platform doesn't do the standard "tail call elimination" optimization. This means that deep recursion requires a deep stack. And since Java thread stacks don't "grow", this means that you are vulnerable to stack overflow.

There are two workarounds:

  • Increase the thread stack size, either by using the -Xss option on the command line, or by explicitly provide a (larger) stack size in the Thread constructor.

  • Implement the tail call elimination explicitly in your Java code ... which is ugly, to say the least.

In your case, the first workaround will give you the measurement for "true recursion". The second ... well the algorithm won't be truly recursive any more, but that is what you might do to make a recursive algorithm practical in Java for "deep" problems.

Note: you can always convert a recursive algorithm in Java into an equivalent iterative one that uses a "simulated stack" data structure to hold the recursion state. The algorithmic complexity should be the same for both versions. Perhaps you should try that approach too, and include the "simulated stack" variant as a third pair of columns in your evaluation.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
  • what are some resources you would recommend to learn more about how to implement "simulated stack" data structures? – Alex Feb 17 '13 at 05:50
  • There is a very interesting link to a pdf about simulated stack there, but I'm wondering if a "simulated stack" would make the methods extremely difficult to read and maintain. – Alex Feb 17 '13 at 19:21
  • Yes it would. That's part of the problem. – Stephen C Feb 17 '13 at 22:54
  • If I understand you correctly, unless there is a very significant performance difference between the recursive and non-recursive option, you wouldn't choose the recursive version because either: (a) the developer has to spend time to investigate and defend against possible stack overflow errors; (b) refactor the code such that the recursion is safe and the code is impossible to read; or (c) use non-recursive algorithm. – Alex Feb 18 '13 at 00:19
  • @Alex - It is not that simple. There are also the cases where you know that the recursion is not that deep, or you can deal with the algorithm failing (with a stack overflow) when given pathological input. Or you can just use a big stack. But the fact remains that you need to be careful with recursive algorithms in Java. – Stephen C Feb 18 '13 at 01:08