We are experiencing below typical issue which happens due to very huge data size stored in RDBMS :
- We have approximately 8M entries in database
- We are facing performance issues on UI where sometimes page loading takes 10/15 minutes
- After analysing the queries running , it is evident that two queries are taking too much of time (p 99 latency is 14 minutes and 20 minutes )
- Out of which one query seems to be :
count(*)
query which is required to scan entire database records and taking 14 minutes - The count is required because we are using pagination at server side (using Spring Data JPA) and using Page queries of Spring Data JPA
Below workarounds have been proposed as a part of technical discussion -
- Using Slice queries instead of Page queries (reference : Page<> vs Slice<> when to use which?) so that count will not be required to be calculated saving count query
- Increasing instance size (vertical scaling )
But my query is :
Considering such a huge database size (and more entries will be added in future years with data increasing linearly) , is it not more appropriate to use DATABASE SHARDING instead of above techniques ?
I agree that it will not be straight forward and considerable pain , but I think that is the most appropriate way to go forward and we can implement the pagination by using one of the technique as mentioned here (How do you implement sorting and paging on distributed data?)
Can you please provide any guidance here and any thoughts on the same please ?