I am using Apache Geode as a caching solution. I need to store data in two different regions and retrieve it with a simple join query.
I have tried both replicated and partitioned regions, but in both cases the query takes a long time to return results. Adding an index on each region improved performance, but it is still not fast enough. How can I improve the performance of this query?
Here is what I have tried:
Example 1 - PARTITIONED REGIONS
Time taken to retrieve about 7300 records from the cache was 36 seconds
Configuration in cache.xml
<region name="Department">
    <region-attributes>
        <partition-attributes redundant-copies="1">
        </partition-attributes>
    </region-attributes>
    <index name="deptIndex" from-clause="/Department" expression="deptId"/>
</region>
<region name="Employee">
    <region-attributes>
        <partition-attributes redundant-copies="1" colocated-with="Department">
        </partition-attributes>
    </region-attributes>
    <index name="empIndex" from-clause="/Employee" expression="deptId"/>
</region>
QueryFunction
@Override
public void execute(FunctionContext context) {
    Cache cache = CacheFactory.getAnyInstance();
    QueryService queryService = cache.getQueryService();
    // The OQL string is passed as the first function argument.
    ArrayList arguments = (ArrayList) context.getArguments();
    String queryStr = (String) arguments.get(0);
    Query query = queryService.newQuery(queryStr);
    try {
        // Run the query against the data hosted on this member.
        SelectResults result = (SelectResults) query.execute((RegionFunctionContext) context);
        // Copy rather than cast: asList() is only declared to return a List.
        ArrayList arrayResult = new ArrayList(result.asList());
        context.getResultSender().sendResult(arrayResult);
        context.getResultSender().lastResult(null);
    } catch (FunctionDomainException | TypeMismatchException
            | NameResolutionException | QueryInvocationTargetException e) {
        // Rethrow so the caller receives the actual cause; swallowing it
        // here would leave the function without a last result.
        throw new FunctionException(e);
    }
}
Executing the function
Function function = new QueryFunction();
String queryStr = "SELECT * FROM /Department d, /Employee e WHERE d.deptId=e.deptId";
ArrayList argList = new ArrayList();
argList.add(queryStr);
Object result = FunctionService
        .onRegion(CacheFactory.getAnyInstance().getRegion("Department"))
        .withArgs(argList)
        .execute(function)
        .getResult();
ArrayList resultList = (ArrayList)result;
ArrayList<StructImpl> finalList = (ArrayList)resultList.get(0);
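Note that the client code above only reads resultList.get(0). As far as I understand, when the function runs on several members the collected list holds one entry per sendResult/lastResult call, so the remaining entries would be ignored. Below is a minimal plain-Java sketch of merging them. It uses stand-in data instead of a real ResultCollector, and the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MergeMemberResults {

    // Flattens a list of per-member result lists into one list.
    // Entries that are not lists (for example the null passed to
    // lastResult(null)) are skipped.
    static List<Object> merge(List<?> perMemberResults) {
        List<Object> merged = new ArrayList<>();
        for (Object entry : perMemberResults) {
            if (entry instanceof List) {
                merged.addAll((List<?>) entry);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Stand-in for the collected result: two members, each sending
        // one list of rows followed by a null last result.
        List<Object> resultList = new ArrayList<>();
        resultList.add(Arrays.asList("row1", "row2"));
        resultList.add(null);
        resultList.add(Arrays.asList("row3"));
        resultList.add(null);

        System.out.println(merge(resultList).size()); // prints 3
    }
}
```

With a single member this gives the same rows as resultList.get(0), so it should be safe to use in either setup.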
Example 2 - REPLICATED REGIONS
Time taken to retrieve about 7300 records from cache was 29 seconds
Configuration in cache.xml
<region name="Department">
    <region-attributes refid="REPLICATE">
    </region-attributes>
    <index name="deptIndex" from-clause="/Department" expression="deptId"/>
</region>
<region name="Employee">
    <region-attributes refid="REPLICATE">
    </region-attributes>
    <index name="empIndex" from-clause="/Employee" expression="deptId"/>
</region>
Query
@Override
public SelectResults fetchJoinedDataForIndex() {
    QueryService queryService = getClientcache().getQueryService();
    Query query = queryService.newQuery(
            "SELECT * FROM /Department d, /Employee e WHERE d.deptId=e.deptId");
    SelectResults result = null;
    try {
        result = (SelectResults) query.execute();
        System.out.println(result.size());
    } catch (FunctionDomainException | TypeMismatchException
            | NameResolutionException | QueryInvocationTargetException e) {
        // Log the failure; the method then returns null.
        e.printStackTrace();
    }
    return result;
}
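For comparison, the result both examples are meant to produce is an equi-join on deptId. Below is a minimal plain-Java sketch of that join done in memory with a hash map. It does not use Geode at all. The Department and Employee classes are simplified, hypothetical stand-ins for my domain objects, and the sketch assumes deptId is unique per department:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InMemoryJoin {

    // Hypothetical stand-ins for the real domain classes.
    static class Department {
        final int deptId;
        final String name;
        Department(int deptId, String name) { this.deptId = deptId; this.name = name; }
    }

    static class Employee {
        final int deptId;
        final String name;
        Employee(int deptId, String name) { this.deptId = deptId; this.name = name; }
    }

    // Equivalent of:
    //   SELECT * FROM /Department d, /Employee e WHERE d.deptId = e.deptId
    // Builds a deptId -> Department map once, then probes it per employee.
    static List<Object[]> join(List<Department> depts, List<Employee> emps) {
        Map<Integer, Department> byId = new HashMap<>();
        for (Department d : depts) {
            byId.put(d.deptId, d);
        }
        List<Object[]> rows = new ArrayList<>();
        for (Employee e : emps) {
            Department d = byId.get(e.deptId);
            if (d != null) {
                rows.add(new Object[] {d, e});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Department> depts = Arrays.asList(
                new Department(1, "Sales"), new Department(2, "IT"));
        List<Employee> emps = Arrays.asList(
                new Employee(1, "Ann"), new Employee(2, "Bob"), new Employee(9, "Cy"));

        // Cy has no matching department, so only two rows are joined.
        System.out.println(join(depts, emps).size()); // prints 2
    }
}
```

This is only meant to show the shape of the result I expect; I would still prefer to do the join inside Geode.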