I have a 50 TB set of ~1 GB TIFF images that I need to run the same algorithm on. I currently have the rectification process written in C++ and it works well; however, it would take far too long to run on all of these images sequentially. I understand that a MapReduce or Spark implementation could work, but I can't figure out how to use images as input and output.
Every tutorial and example that I've seen uses plain text. I would also like to run this on Amazon Web Services if possible. If anyone can point me in the right direction, that would be great. I'm not looking for a full solution, but has anyone successfully implemented something close to this? Thanks in advance.