I am trying to create a heightmap by interpolating between a bunch of heights at certain points in an area. To process the whole image, I have the following code snippet:
map_ = np.zeros((img_width, img_height))
for x in range(img_width):
for y in range(img_height):
map_[x, y] = calculate_height(set(points.items()), x, y)
This is calculate_height
:
def distance(x1, y1, x2, y2) -> float:
return np.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
def calculate_height(points: set, x, y) -> float:
total = 0
dists = {}
for pos, h in points:
d = distance(pos[0], pos[1], x, y)
if x == pos[0] and y == pos[1]:
return h
d = 1 / (d ** 2)
dists[pos] = d
total += d
r = 0
for pos, h in points:
ratio = dists[pos] / total
r += ratio * h
r = r
return r
This snippet works perfectly, but if the image is too big, it takes a long time to process, because this is O(n^2). The problem with this, is that a "too big" image is 800 600, and it takes almost a minute to process, which to me seems a bit excessive.
My goal is not to reduce the time complexity from O(n^2), but to reduce the time it takes to process images, so that I can get decently sized images in a reasonable amount of time.
I found
this post, but I couldn't really try it out because it's all for a 1D array, and I have a 2D array, and I also need to pass the coordinates of each point and the set of existing points to the calculate_height
function. What can I try to optimize this code snippet?
Edit: Moving set(points.items)
out of the loop as @thethiny suggested was a HUGE improvement. I had no idea it was such a heavy thing to do. This makes it fast enough for me, but feel free to add more suggestions for the next people to come by!
Edit 2: I have further optimized this processing by including the following changes:
# The first for loop inside the calculate_distance function
for pos, h in points:
d2 = distance2(pos[0], pos[1], x, y)
if x == pos[0] and y == pos[1]:
return h
d2 = d2 ** -1 # 1 / (d ** 2) == d ** -2 == d2 ** -1
dists[pos] = d2 # Having the square of d on these two lines makes no difference
total += d2
This reduced execution time for a 200x200 image from 1.57 seconds to 0.76 seconds. The 800x600 image mentioned earlier now takes 6.13 seconds to process :D
This is what points
looks like (as requested by @norok12):
# Hints added for easier understanding, I know this doesn't run
points: dict[tuple[int, int], float] = {
(x: int, y: int): height: float,
(x: int, y: int): height: float,
(x: int, y: int): height: float
}
# The amount of points varies between datasets, so I can't provide a useful range other than [3, inf)