
I have a requirement to maintain large data sets in a single database that supports both OLTP and OLAP workloads. I saw that TiDB supports HTAP workloads, but it seems we need to keep the data in both TiKV and TiFlash to get a full HTAP solution. Since the data is duplicated across the two components, storage utilization increases. Can you please help with the following questions?

  1. Is TiKV alone sufficient for both OLTP and OLAP workloads?
  2. What compression rates do TiKV and TiFlash support?
  3. Are there any TiDB benchmarks with HTAP workloads?
  4. Can we keep a total of 3 copies of the data, counting both TiKV and TiFlash, and still get full data HA?
  5. I saw that TiSpark executes OLAP queries directly on TiKV. Is there a benchmark comparing TiSpark and TiFlash for OLAP workloads?

Thanks,

Ajay Babu Maguluri.


1 Answer

  1. You can run OLAP queries using only TiKV, but this won't give you the same performance as TiFlash.
  2. TiKV uses RocksDB to store data on disk, which provides efficient use of storage. The actual compression rate depends on the data you're storing.
  3. There are some benchmarks on the PingCAP website, but I would recommend testing with your specific workload.
  4. TiKV needs 3 copies to be redundant. On a per-table basis you can add one or more replicas on TiFlash. Two TiFlash replicas are recommended for redundancy. This gives you a total of 5 copies for the tables where you need TiFlash and 3 copies for tables that only use TiKV.
  5. Note that TiSpark is only supported if you deploy TiDB yourself; it isn't supported with TiDB Cloud. See https://github.com/pingcap/tispark/wiki/TiSpark-Benchmark for benchmarking info. Here too, I would recommend benchmarking with your specific workload rather than a generic one.
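
To make the copy count in point 4 concrete, here is a minimal Python sketch of the arithmetic, together with the per-table statement TiDB uses to add TiFlash replicas (`ALTER TABLE ... SET TIFLASH REPLICA n`). The table names, the 100 GB figure, and the compression ratios are hypothetical placeholders. The ratios default to 1.0 (no compression) because the real values depend on your data, so measure them on your own workload before relying on the estimate.

```python
def total_copies(tikv_replicas: int = 3, tiflash_replicas: int = 0) -> int:
    """Number of physical copies of a table across TiKV and TiFlash."""
    return tikv_replicas + tiflash_replicas


def tiflash_replica_ddl(table: str, replicas: int) -> str:
    """Build the TiDB statement that sets the TiFlash replica count for one table."""
    if replicas < 0:
        raise ValueError("replica count must be non-negative")
    return f"ALTER TABLE {table} SET TIFLASH REPLICA {replicas};"


def estimated_storage_gb(
    logical_gb: float,
    tikv_replicas: int = 3,
    tiflash_replicas: int = 0,
    tikv_ratio: float = 1.0,     # assumption: on-disk size / logical size in TiKV
    tiflash_ratio: float = 1.0,  # assumption: on-disk size / logical size in TiFlash
) -> float:
    """Rough on-disk footprint; the ratios are placeholders, not measured values."""
    return logical_gb * (tikv_replicas * tikv_ratio + tiflash_replicas * tiflash_ratio)


if __name__ == "__main__":
    # Hypothetical tables: only "orders" is queried analytically.
    tiflash_plan = {"orders": 2, "audit_log": 0}
    for table, replicas in tiflash_plan.items():
        print(tiflash_replica_ddl(table, replicas))
        print(f"  copies: {total_copies(3, replicas)}")
        print(f"  est. storage for 100 GB logical: {estimated_storage_gb(100, 3, replicas)} GB")
```

Because TiFlash replicas are set per table, the extra storage only applies to the tables you actually run analytical queries on. The rest stay at 3 copies.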