Why are reads/sec so much lower than writes/sec in Aerospike?

Question

I am using Aerospike v4.8 and making read and write requests to it. My write requests reach a throughput of 4000 writes/sec, whereas reads reach only 10-15 reads/sec, which is very low.

My query is:

let query = aerospikeClient.query(nameSpace, set)
query.select('count', 'targetKey')
query.predexp = [
    predexp.stringBin('campaignKey'),
    predexp.stringValue(Id1 + ':' + Id2 + ':' + Id3 + ':' + channel),
    predexp.stringEqual(),

    predexp.integerBin('epochDay'),
    predexp.integerValue(epochDay),
    predexp.integerGreaterEq(),

    predexp.integerBin('epochDay'),
    predexp.integerValue(epochDay),
    predexp.integerLessEq(),

    predexp.and(3)
]

I can't work out what is wrong here. Any help is appreciated.

My config is:

namespace test {
        replication-factor 2
        memory-size 8G
        default-ttl 7d 
        storage-engine device {
                device /dev/xvdf
                scheduler-mode noop
                write-block-size 16K
                data-in-memory false
        }
}

Indexes are:

CREATE INDEX campaignIndex ON antiSpamming.userTargetingMatrix (campaignKey) string;
CREATE INDEX targetIndex ON antiSpamming.userTargetingMatrix (targetKey) string;
CREATE INDEX epochDayIndex ON antiSpamming.userTargetingMatrix (epochDay) NUMERIC;

how did you get the above config? – YetAnotherBot Mar 31 '21 at 10:33

score 3 · Accepted Answer · answered Jan 28 '20 at 17:04

First, that premise isn't true. Aerospike reads are always going to be faster than writes: a write has a longer code path and more IO. Unless you explicitly state that your operation is a REPLACE, a write behaves as an upsert, meaning that it first tries to read the existing record, merges your data in, then writes it out.

What you are doing above isn't an apples-to-apples comparison. A write (put) is a single-record operation, so you should compare it to a single-record read (get). What you're doing is a scan (if you also attached a secondary index filter it would be a query), which is a multi-node operation. Even if it returns just a single record, it has to go to all the nodes and, on each one, walk the entire primary index looking for matches to your predicate filter.

There are a few ways around that. First, you can build a secondary index on your epochDay value and, instead of a predicate filter, use a secondary index filter with the BETWEEN range predicate. The predicate filter then shrinks to just the string predicate.
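
A minimal sketch of that first suggestion, assuming a Node.js client version that still provides the predexp API (as the question's code does). The function name `buildCampaignDayQuery` and its parameters are hypothetical, and the Aerospike module is passed in as an argument so the query-building logic stays separate from connection setup:

```javascript
// Sketch only. `Aerospike` is the module returned by require('aerospike');
// `client` is an already-connected client. Names here are illustrative.
function buildCampaignDayQuery (Aerospike, client, ns, set, campaignKey, fromDay, toDay) {
  const { filter, predexp } = Aerospike
  const query = client.query(ns, set)
  query.select('count', 'targetKey')

  // Secondary index filter: the index on epochDay narrows the candidates
  // to the inclusive range [fromDay, toDay].
  query.where(filter.range('epochDay', fromDay, toDay))

  // Predicate filter: now only the string comparison is left, and it is
  // applied to the records the index lookup matched.
  query.predexp = [
    predexp.stringBin('campaignKey'),
    predexp.stringValue(campaignKey),
    predexp.stringEqual()
  ]
  return query
}

// Possible usage (not run here):
//   const Aerospike = require('aerospike')
//   const client = await Aerospike.connect(config)
//   const query = buildCampaignDayQuery(Aerospike, client, nameSpace, set, key, epochDay, epochDay)
//   const records = await query.results()
```

Which bin goes into `where` and which stays in the predicate filter is a judgment call that depends on how many records each condition eliminates.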

Second, you could use a modeling approach where the data is consolidated in a single larger record as a list or map, and you use the list or map API to get the range of elements you want in that complex data type. Take a look at the Aerospike developer blog and Aerospike code examples.
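
One way the modeling approach could look, as a sketch under a purely hypothetical layout: one record per campaignKey, holding a map bin named `days` that is keyed by epochDay. Neither the bin name nor the layout comes from the question. The range read uses the map API's `getByKeyRange` operation through `client.operate`:

```javascript
// Sketch only. `Aerospike` is the module returned by require('aerospike').
// Assumed layout: record key = campaignKey, map bin 'days' = { epochDay: value }.
function dayRangeOps (Aerospike, fromDay, toDay) {
  const maps = Aerospike.maps
  // getByKeyRange treats the end key as exclusive, so add 1 to include toDay.
  return [
    maps.getByKeyRange('days', fromDay, toDay + 1, maps.returnType.KEY_VALUE)
  ]
}

// Single-record operation: goes to one node by primary key, no scan or query.
async function readCampaignDays (Aerospike, client, ns, set, campaignKey, fromDay, toDay) {
  const key = new Aerospike.Key(ns, set, campaignKey)
  const record = await client.operate(key, dayRangeOps(Aerospike, fromDay, toDay))
  return record.bins.days
}
```

Because this is a read of one record by its key, it is comparable to a get rather than to a query.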

Ronen Botzer


Good, but in the code snippet you're not setting the query to filter on it. It's not automatic, you specify it explicitly. Try moving the range predicate to that, out of the predicate filter. Assuming you're using Node.js, use `where`: https://www.aerospike.com/apidocs/nodejs/Query.html#where – Ronen Botzer Jan 29 '20 at 04:19
I have to run the query on two secondary indexes: a simple equality query on the first and a range query on the second. Can I do this in a single request, since both conditions are joined by AND? – Yash Tandon Jan 29 '20 at 05:08
If those predicates are inside the predicate filter, they don't engage the secondary indexes at all. You should think about the cardinality of the two, and use the secondary filter in the `where` that knocks out the most records. Then the predicate filter is applied only to the matched results. For example, for gender='m' and age > 18, you'd want to use the gender SI, because that eliminates half the records that aren't 'm'. – Ronen Botzer Jan 29 '20 at 05:10
Using the approach you just described, performance has improved, but it is still below expectations: I am getting about 1600 TPS on queries. What else can be done to improve performance? – Yash Tandon Jan 29 '20 at 06:08
I've already described it in general. A key-value approach, with a predictable key name, and batch-reads will beat queries every time. Take a look at the IoT article in the developer blog, and associated code example. – Ronen Botzer Jan 29 '20 at 06:12
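
The key-value approach from the last comment could be sketched as follows. The key scheme `<campaignKey>:<epochDay>` and the one-record-per-campaign-per-day layout are assumptions made for illustration, not the asker's schema; the function names are hypothetical too:

```javascript
// Hypothetical key scheme: one record per campaign and day, so the key for
// any day can be computed without a query.
function campaignDayKeys (campaignKey, fromDay, toDay) {
  const keys = []
  for (let day = fromDay; day <= toDay; day++) {
    keys.push(`${campaignKey}:${day}`)
  }
  return keys
}

// Sketch only. `Aerospike` is the module returned by require('aerospike');
// `client` is an already-connected client.
function batchReadCampaignDays (Aerospike, client, ns, set, campaignKey, fromDay, toDay) {
  const records = campaignDayKeys(campaignKey, fromDay, toDay).map(userKey => ({
    key: new Aerospike.Key(ns, set, userKey),
    bins: ['count']
  }))
  // One batch request; every key is resolved by a primary-key lookup.
  return client.batchRead(records)
}
```

With predictable keys, a range of days turns into a fixed list of primary-key lookups, so no index has to be walked.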
