I have a giant lists of query searches with cached image results for a few different servers and I want to sync the queries efficiently. I know that one way would be to do it in two steps. First comparing the queries, and second, only syncing non-identical results. Instead though I'd like it to be faster and more efficient by only exchanging a small fixed amount of data and then syncing non-identical results based on that data (it's fine if it happens to sync a small amount of identical results).
What kind of data structure for these queries would be recommended to accomplish this? I've been looking at https://en.wikipedia.org/wiki/List_of_data_structures to try to get a better idea, but I don't have a lot of experience in algorithms so I could really use some direction. I'm planning to do this in C++ if that needs to be taken into consideration. All suggestions appreciated, thanks.