I need to bulk-load all the entities in a table. (They need to be in memory, rather than loaded as needed, to support high-speed on-demand graph-traversal algorithms.)
I need to parallelize the load for speed, so I want to run multiple queries in parallel threads, each pulling approximately 800 entities from the database.
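To make concrete what I mean by "parallel threads", here is a minimal sketch of the loading side using the App Engine SDK's low-level Datastore API, assuming the splitting problem described at the end is already solved and has produced one `Query.Filter` per chunk. The kind name `"MyKind"`, the pool size, and the convention that a `null` filter means "the whole kind" are my own placeholders, not anything the SDK prescribes. (The Flexible Environment allows ordinary JVM threads, so a plain `ExecutorService` should work.)

```java
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Query;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelLoader {

    /** Runs one query per split filter in parallel and collects all entities in RAM. */
    static List<Entity> loadAll(List<Query.Filter> splitFilters) throws Exception {
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
        ExecutorService pool = Executors.newFixedThreadPool(8);  // pool size is arbitrary
        List<Future<List<Entity>>> futures = new ArrayList<>();

        for (Query.Filter filter : splitFilters) {
            futures.add(pool.submit(() -> {
                Query q = new Query("MyKind");  // placeholder kind name
                if (filter != null) {
                    q.setFilter(filter);        // null = no split, scan the whole kind
                }
                List<Entity> chunk = new ArrayList<>();
                for (Entity e : ds.prepare(q)
                                  .asIterable(FetchOptions.Builder.withChunkSize(500))) {
                    chunk.add(e);
                }
                return chunk;
            }));
        }

        List<Entity> all = new ArrayList<>();
        for (Future<List<Entity>> f : futures) {
            all.addAll(f.get());  // blocks until each chunk arrives; propagates failures
        }
        pool.shutdown();
        return all;
    }
}
```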
QuerySplitter appears to serve exactly this purpose, but we are running on the Flexible Environment and are therefore using the App Engine SDK rather than the Cloud Datastore client libraries, which is where QuerySplitter lives.
MapReduce has been mentioned, but it is not aimed at simply loading data into memory. Memcache is somewhat relevant, but for high-speed access I need all of these objects held as a dense network in the RAM of my own app's JVM.
MultiQueryBuilder might do this: it offers parallelism by running the component parts of a query concurrently.
Whichever of these three approaches is used (or some other approach entirely), the hardest part is defining filters, or some other form of splits, that roughly partition the table (the Kind) into chunks of 800 or so entities. I would like to create filters that say "entities 1 through 800", "entities 801 through 1600", and so on, but I know that is impractical, since Datastore filters work on property values rather than row positions. So, how does one do it?
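One idea I have had (unverified) is a single keys-only scan in `__key__` order that records every 800th key and converts the recorded keys into `__key__` range filters; keys-only queries are cheap, so the preliminary scan might be tolerable. Here is a sketch of what I mean, feeding the loader above; `buildSplitFilters` and `rangeFilter` are names I made up, and a `null` entry in the result means "no filter" (the whole Kind fits in one chunk):

```java
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.Query.CompositeFilterOperator;
import com.google.appengine.api.datastore.Query.FilterOperator;
import com.google.appengine.api.datastore.Query.FilterPredicate;

import java.util.ArrayList;
import java.util.List;

public class KeyRangeSplitter {

    /** Scans keys in key order, recording every chunkSize-th key as a split point. */
    static List<Query.Filter> buildSplitFilters(String kind, int chunkSize) {
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
        Query q = new Query(kind)
                .setKeysOnly()
                .addSort(Entity.KEY_RESERVED_PROPERTY);  // sort by "__key__"

        List<Key> splitPoints = new ArrayList<>();
        int seen = 0;
        for (Entity e : ds.prepare(q)
                          .asIterable(FetchOptions.Builder.withChunkSize(1000))) {
            if (++seen % chunkSize == 0) {
                splitPoints.add(e.getKey());
            }
        }

        // Convert split points into half-open key ranges:
        // (-inf, k1], (k1, k2], ..., (kN, +inf)
        List<Query.Filter> filters = new ArrayList<>();
        Key previous = null;
        for (Key split : splitPoints) {
            filters.add(rangeFilter(previous, split));
            previous = split;
        }
        filters.add(rangeFilter(previous, null));  // final open-ended range
        return filters;
    }

    /** Builds a __key__ range filter; either bound may be null (unbounded on that side). */
    private static Query.Filter rangeFilter(Key low, Key high) {
        List<Query.Filter> parts = new ArrayList<>();
        if (low != null) {
            parts.add(new FilterPredicate(Entity.KEY_RESERVED_PROPERTY,
                    FilterOperator.GREATER_THAN, low));
        }
        if (high != null) {
            parts.add(new FilterPredicate(Entity.KEY_RESERVED_PROPERTY,
                    FilterOperator.LESS_THAN_OR_EQUAL, high));
        }
        if (parts.isEmpty()) {
            return null;  // whole kind in one chunk
        }
        return parts.size() == 2 ? CompositeFilterOperator.and(parts) : parts.get(0);
    }
}
```

Does something along these lines make sense, or is there a standard splitting mechanism available from the App Engine SDK that I am missing?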