I have huge lists (>1,000,000 in most cases) that are partitioned into n x n
grids. These lists are of some coordinates. I define a neighbouring relation between points - if they are within range of each other, they are 'neighbours' and put into a 'cluster'. We can add cells to these clusters, so cell A is a neighbour of cell B, B a neighbour of C, C is not a neighbour of A. A, B, and C would be in the same cluster.
Given that background, I have the following code that tries to assign points to clusters in Python 3.6:
for i in range(n):
for j in range(n):
tile = image[i][j]
while tile:
cell = tile.pop()
cluster = create_and_return_cluster(cell, image_clusters[i][j], (i, j))
clusterise(cell, cluster, (i, j), image, n, windows, image_clusters)
def clusterise(cell, cluster, tile_index, image, n, windows, image_clusters):
neighbouring_windows, neighbouring_indices = get_neighbouring_windows(tile_index[0], tile_index[1], n, windows)
neighbours = get_and_remove_all_neighbours(cell, image, tile_index, neighbouring_windows, neighbouring_indices)
if neighbours:
for neighbour, (n_i, n_j) in neighbours:
add_to_current_cluster(cluster, image_clusters, neighbour, (n_j, n_j))
clusterise(neighbour, cluster, (n_i, n_j), image, n, windows, image_clusters)
Because of the massive size of the lists, I've had issues with RecursionError and have been scouring the internet for suggestions on tail-recursion. The problem is that this algorithm needs to branch from neighbours to pick up neighbours of those neighbours, and so on. As you can imagine, this gets pretty big pretty quickly in terms of stack frames.
My question is: is it possible to make this algorithm use tail-recursion, or how would one go about making it tail-recursive? I know the cluster
argument is essentially an accumulator in this case, but given how the lists shrink and the nasty for-loop in clusterise()
I am not sure how to successfully convert to tail-recursion. Does anyone have any ideas? Ideally supported with an explanation.
NB: I am well aware that Python does not optimise tail-recursion by default, yes I am aware that other languages optimise tail-recursion. My question is whether it can be done in this case using Python. I don't want to change if I don't absolutely have to and much of my other code is already in Python.