We receive a large number of URLs from customers. The URLs are organized in an arbitrary number of levels, like: xxx.com/levelA/levelB/levelC/...levels.../xxxx. We are trying to use this data to build a query system that can answer which URLs are under any given level. For example, getAll("abc.com/test/sub/") should return all recorded URLs that have "abc.com/test/sub/" as a prefix: abc.com/test/sub/a.data, abc.com/test/sub/sub2/data, etc.
This appears to be similar to a file directory structure. My question is: is there an existing open source project that can help handle such a scenario? The requirements are:
- Real-time system.
- High write/read throughput.
- Distributed and reliable.
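
To make the intended getAll semantics concrete, here is a minimal single-process sketch in Python. It assumes URLs are stored as plain strings in sorted order, so a prefix query becomes a range scan starting at the prefix. The names UrlIndex, record and get_all are hypothetical and only for illustration; this does not address the distributed or throughput requirements above, it only shows the query behavior we are after.

```python
import bisect


class UrlIndex:
    """In-memory illustration only: keeps recorded URLs in a sorted list."""

    def __init__(self):
        self._urls = []  # sorted, no duplicates

    def record(self, url):
        # Insert the URL at its sorted position, skipping exact duplicates.
        i = bisect.bisect_left(self._urls, url)
        if i == len(self._urls) or self._urls[i] != url:
            self._urls.insert(i, url)

    def get_all(self, prefix):
        # All strings sharing a prefix are contiguous in sorted order,
        # so scan forward from the first entry >= prefix until one no
        # longer starts with it.
        result = []
        i = bisect.bisect_left(self._urls, prefix)
        while i < len(self._urls) and self._urls[i].startswith(prefix):
            result.append(self._urls[i])
            i += 1
        return result


index = UrlIndex()
for u in [
    "abc.com/test/sub/a.data",
    "abc.com/test/sub/sub2/data",
    "abc.com/test/other",
    "abc.com/test/subway/x",
]:
    index.record(u)

print(index.get_all("abc.com/test/sub/"))
```

With the sample data above, the query returns only the two URLs under abc.com/test/sub/, and not abc.com/test/subway/x, because the trailing slash is part of the prefix.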