3

I need to find roots for a generalized state space. That is, I have a discrete grid of dimensions grid=AxBx(...)xX, of which I do not know ex ante how many dimensions it has (the solution should be applicable to any grid.size) .

I want to find the roots (f(z) = 0) for every state z inside grid using the bisection method. Say remainder contains f(z), and I know f'(z) < 0. Then I need to

  • increase z if remainder > 0
  • decrease z if remainder < 0

Wlog, say the matrix historyof shape (grid.shape, T) contains the history of earlier values of z for every point in the grid and I need to increase z (since remainder > 0). I will then need to select zAlternative inside history[z, :] that is the "smallest of those, that are larger than z". In pseudo-code, that is:

zAlternative =  hist[z,:][hist[z,:] > z].min()

I had asked this earlier. The solution I was given was

b = sort(history[..., :-1], axis=-1)
mask = b > history[..., -1:]
index = argmax(mask, axis=-1)
indices = tuple([arange(j) for j in b.shape[:-1]])
indices = meshgrid(*indices, indexing='ij', sparse=True)
indices.append(index)
indices = tuple(indices)
lowerZ = history[indices]

b = sort(history[..., :-1], axis=-1)
mask = b <= history[..., -1:]
index = argmax(mask, axis=-1)
indices = tuple([arange(j) for j in b.shape[:-1]])
indices = meshgrid(*indices, indexing='ij', sparse=True)
indices.append(index)
indices = tuple(indices)
higherZ = history[indices]

newZ = history[..., -1]
criterion = 0.05
increase = remainder > 0 + criterion
decrease = remainder < 0 - criterion
newZ[increase] = 0.5*(newZ[increase] + higherZ[increase])
newZ[decrease] = 0.5*(newZ[decrease] + lowerZ[decrease])

However, this code ceases to work for me. I feel extremely bad about admitting it, but I never understood the magic that is happening with the indices, therefore I unfortunately need help.

What the code actually does, it to give me the lowest respectively the highest. That is, if I fix on two specific z values:

history[z1] = array([0.3, 0.2, 0.1])
history[z2] = array([0.1, 0.2, 0.3])

I will get higherZ[z1] = 0.3 and lowerZ[z2] = 0.1, that is, the extrema. The correct value for both cases would have been 0.2. What's going wrong here?

If needed, in order to generate testing data, you can use something along the lines of

history = tile(array([0.1, 0.3, 0.2, 0.15, 0.13])[newaxis,newaxis,:], (10, 20, 1))
remainder = -1*ones((10, 20))

to test the second case.

Expected outcome

I adjusted the history variable above, to give test cases for both upwards and downwards. Expected outcome would be

lowerZ = 0.1 * ones((10,20))
higherZ = 0.15 * ones((10,20))

Which is, for every point z in history[z, :], the next highest previous value (higherZ) and the next smallest previous value (lowerZ). Since all points z have exactly the same history ([0.1, 0.3, 0.2, 0.15, 0.13]), they will all have the same values for lowerZ and higherZ. Of course, in general, the histories for each z will be different and hence the two matrices will contain potentially different values on every grid point.

Community
  • 1
  • 1
FooBar
  • 15,724
  • 19
  • 82
  • 171
  • Could you clarify what you mean by "to find the roots (`f(z) = 0`) for every state `z` inside `grid`"? Do you mean that `f` is a function of an additional variable, that is you want to find `φ(z)` such that `f(z, φ(z)) = 0` for any `z`, or do you want to find the set of `z ∈ grid` for which `f` evaluates to zero, or do you only want to find _a_ root within `grid`? – Phillip Jun 10 '14 at 14:13
  • Does ```history``` contain each *guess* of ```z``` that has been tried in the bisection algorithm? For the testing data you spec'd, ```history.shape``` is (10,20,3) - does that represent 10 guesses of ```z``` where ```z.shape``` is (20,3)? – wwii Jun 10 '14 at 15:57
  • @wwii: the history for every `z` is in the last dimension, `-1`. Hence, we have `10x20` data that all have `3` observations: `(0.1, 0.2, 0.3)`. Given that `remainder < 0`, for every observation in that `10x20` data set, we need to find the "next smallest value" - `0.2` – FooBar Jun 10 '14 at 16:16
  • @Phillip: I am excluding `f` from the code given, I am only curious about the updating mechanism. In the example given, `remainder` will contain a `10x20` matrix that indicates whether the `grid` values need to be updated upwards or downwards. I am interested in finding the "next highest" or "next smallest" value inside `history` - the matrices `lowerZ` and `higherZ` in the code snippet provided. – FooBar Jun 10 '14 at 16:18
  • @FooBar: Do I get you correctly: Given a grid `z` and an arbitrary index `i` and a history array of grids `H`, you want to find `min([ H[k][i] for k in len(H) if H[k][i] > z[i]])`, only for all `i` and in an efficient manner? – Phillip Jun 10 '14 at 17:24
  • That is my goal, correct. However, note, that my shape of `H`, as you call it, is `z`x"length of history" `+ 2`. This will not be relevant for the answer, but so you know what I do: I provide the global minimum and maximum for the bisection in `(z, 0)` and `(z, 1)`, and the initial guess in `(z, 2)`, for every `z` in my `grid`. Then, with every iteration `i` of the bisection, the next guess will be stored `(z, i+2)`. Therefore, the shape of my `history` (your `H`) is `(grid.shape, h)`, where `h` is any integer `> 2`. – FooBar Jun 10 '14 at 17:45
  • ```history``` contains **all** *guesses* that have been made, ```history[...,2:]```, as well as the global min and max? With this test data, what do expect newZ to be? – wwii Jun 10 '14 at 18:02
  • @wwii see my earlier comment http://stackoverflow.com/questions/24098205/gridwise-application-of-the-bisection-method?noredirect=1#comment37260127_24098205 – FooBar Jun 10 '14 at 18:38
  • I extended the code to generate the sample code and also wrote a short paragraph about expected outcome. – FooBar Jun 10 '14 at 18:49
  • I'm curious - why are you keeping a history of guesses instead of just hi, low, current? – wwii Jun 10 '14 at 23:51
  • I'll look into your answers tomorrow when I'm in my office. If it's really up to doing `>` instead of `>=`, I'll be quite mad with myself. For your curiosity, that's mainly for logging. Want to see afterwards the actual speed of convergence (for a larger neighbourhood), in order to perhaps tweak performance at cost of the precision window. – FooBar Jun 11 '14 at 10:58

2 Answers2

1

I compared what you posted here to the solution for your previous post and noticed some differences.

For the smaller z, you said

mask = b > history[..., -1:]
index = argmax(mask, axis=-1)

They said:

mask = b >= a[..., -1:]
index = np.argmax(mask, axis=-1) - 1

For the larger z, you said

mask = b <= history[..., -1:]
index = argmax(mask, axis=-1)

They said:

mask = b > a[..., -1:]
index = np.argmax(mask, axis=-1)

Using the solution for your previous post, I get:

import numpy as np
history = np.tile(np.array([0.1, 0.3, 0.2, 0.15, 0.13])[np.newaxis,np.newaxis,:], (10, 20, 1))
remainder = -1*np.ones((10, 20))

a = history

# b is a sorted ndarray excluding the most recent observation
# it is sorted along the observation axis
b = np.sort(a[..., :-1], axis=-1)

# mask is a boolean array, comparing the (sorted)
# previous observations to the current observation - [..., -1:]
mask = b > a[..., -1:]

# The next 5 statements build an indexing array.
# True evaluates to one and False evaluates to zero.
# argmax() will return the index of the first True,
# in this case along the last (observations) axis.
# index is an array with the shape of z (2-d for this test data).
# It represents the index of the next greater
# observation for every 'element' of z.
index = np.argmax(mask, axis=-1)

# The next two statements construct arrays of indices
# for every element of z - the first n-1 dimensions of history.
indices = tuple([np.arange(j) for j in b.shape[:-1]])
indices = np.meshgrid(*indices, indexing='ij', sparse=True)

# Adding index to the end of indices (the last dimension of history)
# produces a 'group' of indices that will 'select' a single observation
# for every 'element' of z
indices.append(index)
indices = tuple(indices)
higherZ = b[indices]

mask = b >= a[..., -1:]
# Since b excludes the current observation, we want the
# index just before the next highest observation for lowerZ,
# hence the minus one.
index = np.argmax(mask, axis=-1) - 1
indices = tuple([np.arange(j) for j in b.shape[:-1]])
indices = np.meshgrid(*indices, indexing='ij', sparse=True)
indices.append(index)
indices = tuple(indices)
lowerZ = b[indices]
assert np.all(lowerZ == .1)
assert np.all(higherZ == .15)

which seems to work

Community
  • 1
  • 1
wwii
  • 23,232
  • 7
  • 37
  • 77
  • This is much faster than the [sliding_window solution](http://stackoverflow.com/a/24152353/2823755). – wwii Jun 11 '14 at 05:22
  • I went mad. Like, it can't be the `>` vs `>=`. And it wasn't (at least not only). `higherZ = b[indices]` in your code, while in mine it was `higherZ = history[indices]`. – FooBar Jun 11 '14 at 13:51
  • I had to figure out how that indexing works, hopefully I will remember it when I need it. I added comments to the code. – wwii Jun 11 '14 at 16:05
  • A bit of disclaimer - Although I was able to figure out how it works, I could not have come up with that on my own. – wwii Jun 11 '14 at 16:24
0

z-shaped arrays for the next highest and lowest observation in history relative to the current observation, given the current observation is history[...,-1:]

This constructs the higher and lower arrays by manipulating the strides of history to make it easier to iterate over the observations of each element of z. This is accomplished using numpy.lib.stride_tricks.as_strided and an n-dim generalzed function found at Efficient Overlapping Windows with Numpy - I will include it's source at the end

There is a single python loop that has 200 iterations for history.shape of (10,20,x).

import numpy as np

history = np.tile(np.array([0.1, 0.3, 0.2, 0.15, 0.13])[np.newaxis,np.newaxis,:], (10, 20, 1))
remainder = -1*np.ones((10, 20))

z_shape = final_shape = history.shape[:-1]
number_of_observations = history.shape[-1]
number_of_elements_in_z = np.product(z_shape)

# manipulate histories to efficiently iterate over
# the observations of each "element" of z
s = sliding_window(history, (1,1,number_of_observations))
# s.shape will be (number_of_elements_in_z, number_of_observations)

# create arrays of the next lower and next higher observation
lowerZ = np.zeros(number_of_elements_in_z)
higherZ = np.zeros(number_of_elements_in_z)
for ndx, observations in enumerate(s):
    current_observation = observations[-1]
    a = np.sort(observations)
    lowerZ[ndx] = a[a < current_observation][-1]
    higherZ[ndx] = a[a > current_observation][0]

assert np.all(lowerZ == .1)
assert np.all(higherZ == .15)
lowerZ = lowerZ.reshape(z_shape)
higherZ = higherZ.reshape(z_shape)

sliding_window from Efficient Overlapping Windows with Numpy

import numpy as np
from numpy.lib.stride_tricks import as_strided as ast
from itertools import product

def norm_shape(shape):
    '''
    Normalize numpy array shapes so they're always expressed as a tuple, 
    even for one-dimensional shapes.

    Parameters
        shape - an int, or a tuple of ints

    Returns
        a shape tuple

    from http://www.johnvinyard.com/blog/?p=268
    '''
    try:
        i = int(shape)
        return (i,)
    except TypeError:
        # shape was not a number
        pass

    try:
        t = tuple(shape)
        return t
    except TypeError:
        # shape was not iterable
        pass

    raise TypeError('shape must be an int, or a tuple of ints')


def sliding_window(a,ws,ss = None,flatten = True):
    '''
    Return a sliding window over a in any number of dimensions

    Parameters:
        a  - an n-dimensional numpy array
        ws - an int (a is 1D) or tuple (a is 2D or greater) representing the size 
             of each dimension of the window
        ss - an int (a is 1D) or tuple (a is 2D or greater) representing the 
             amount to slide the window in each dimension. If not specified, it
             defaults to ws.
        flatten - if True, all slices are flattened, otherwise, there is an 
                  extra dimension for each dimension of the input.

    Returns
        an array containing each n-dimensional window from a

    from http://www.johnvinyard.com/blog/?p=268
    '''

    if None is ss:
        # ss was not provided. the windows will not overlap in any direction.
        ss = ws
    ws = norm_shape(ws)
    ss = norm_shape(ss)

    # convert ws, ss, and a.shape to numpy arrays so that we can do math in every 
    # dimension at once.
    ws = np.array(ws)
    ss = np.array(ss)
    shape = np.array(a.shape)


    # ensure that ws, ss, and a.shape all have the same number of dimensions
    ls = [len(shape),len(ws),len(ss)]
    if 1 != len(set(ls)):
        error_string = 'a.shape, ws and ss must all have the same length. They were{}'
        raise ValueError(error_string.format(str(ls)))

    # ensure that ws is smaller than a in every dimension
    if np.any(ws > shape):
        error_string = 'ws cannot be larger than a in any dimension. a.shape was {} and ws was {}'
        raise ValueError(error_string.format(str(a.shape),str(ws)))

    # how many slices will there be in each dimension?
    newshape = norm_shape(((shape - ws) // ss) + 1)
    # the shape of the strided array will be the number of slices in each dimension
    # plus the shape of the window (tuple addition)
    newshape += norm_shape(ws)
    # the strides tuple will be the array's strides multiplied by step size, plus
    # the array's strides (tuple addition)
    newstrides = norm_shape(np.array(a.strides) * ss) + a.strides
    strided = ast(a,shape = newshape,strides = newstrides)
    if not flatten:
        return strided

    # Collapse strided so that it has one more dimension than the window.  I.e.,
    # the new array is a flat list of slices.
    meat = len(ws) if ws.shape else 0
    firstdim = (np.product(newshape[:-meat]),) if ws.shape else ()
    dim = firstdim + (newshape[-meat:])
    # remove any dimensions with size 1
    dim = filter(lambda i : i != 1,dim)
    return strided.reshape(dim)
wwii
  • 23,232
  • 7
  • 37
  • 77