Using the DDL and profile YAML below, I generate random data for my table with cassandra-stress. The results for the amount and status columns don't match my expectations: the random values seem to be drawn once per partition rather than once per row.

If, for example, cassandra-stress generates 5 rows with the same business_date (i.e. one partition), the amount and status values are repeated 5 times; the "next" random value only appears when business_date changes. How can I get a new draw of amount and status for every row?

Sample output; notice that the last two columns only change value when the first column changes.

2018-09-26,y~8.>6MZ,00000000-0004-0a3c-0000-000000040a3c,5.133114565746717E10,3PR|I{3B
2018-09-26,y~8.>6MZ,00000000-004c-4e7e-0000-0000004c4e7e,5.133114565746717E10,3PR|I{3B
2018-09-26,y~8.>6MZ,00000000-003d-b97f-0000-0000003db97f,5.133114565746717E10,3PR|I{3B
2018-09-26,y~8.>6MZ,00000000-004f-db3f-0000-0000004fdb3f,5.133114565746717E10,3PR|I{3B
2018-09-26,y~8.>6MZ,00000000-008c-f0ea-0000-0000008cf0ea,5.133114565746717E10,3PR|I{3B
2018-10-14,Y ?R|    |u,00000000-002b-5707-0000-0000002b5707,6.698617679577381E10,,fkb[cU~N!
.
.
.
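
The repetition can also be checked mechanically. Below is a minimal Python sketch, assuming the rows are available as text lines in the order business_date, region, id, amount, status (as in the sample above). The function name distinct_values_per_partition and the regular expression are illustrative assumptions, not part of cassandra-stress. Because region and status are random text that may contain commas, the sketch anchors on the UUID instead of splitting on every comma.

```python
import re
from collections import defaultdict

# Assumed row layout: business_date,region,id,amount,status.
# region and status are random text and may contain commas themselves,
# so the UUID is used as an anchor instead of a plain split(",").
ROW = re.compile(
    r"^(\d{4}-\d{2}-\d{2}),(.*),"
    r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}),"
    r"([^,]+),(.*)$"
)

def distinct_values_per_partition(lines):
    """Map each business_date to (row count, distinct (amount, status) pairs)."""
    row_counts = defaultdict(int)
    pairs = defaultdict(set)
    for line in lines:
        match = ROW.match(line.rstrip("\n"))
        if match is None:
            continue  # skip lines that do not look like a data row
        date, _region, _row_id, amount, status = match.groups()
        row_counts[date] += 1
        pairs[date].add((amount, status))
    return {date: (row_counts[date], len(pairs[date])) for date in row_counts}

# The six rows from the sample output above.
sample = [
    "2018-09-26,y~8.>6MZ,00000000-0004-0a3c-0000-000000040a3c,5.133114565746717E10,3PR|I{3B",
    "2018-09-26,y~8.>6MZ,00000000-004c-4e7e-0000-0000004c4e7e,5.133114565746717E10,3PR|I{3B",
    "2018-09-26,y~8.>6MZ,00000000-003d-b97f-0000-0000003db97f,5.133114565746717E10,3PR|I{3B",
    "2018-09-26,y~8.>6MZ,00000000-004f-db3f-0000-0000004fdb3f,5.133114565746717E10,3PR|I{3B",
    "2018-09-26,y~8.>6MZ,00000000-008c-f0ea-0000-0000008cf0ea,5.133114565746717E10,3PR|I{3B",
    "2018-10-14,Y ?R|    |u,00000000-002b-5707-0000-0000002b5707,6.698617679577381E10,,fkb[cU~N!",
]

for date, (n_rows, n_distinct) in distinct_values_per_partition(sample).items():
    print(date, n_rows, n_distinct)
```

For the sample, the 2018-09-26 partition has 5 rows but only 1 distinct (amount, status) pair; with an independent draw per row, the number of distinct pairs would be expected to be close to the row count.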

Table structure:

CREATE TABLE IF NOT EXISTS record (
    business_date date,
    region text,
    id uuid,
    status text,
    amount double,
    PRIMARY KEY (business_date, region, id)
);

Profile YAML:

keyspace: dev
table: record
columnspec:
 - name: business_date
   population: uniform(17800..17845)
 - name: region
   size: fixed(10)
   population: seq(10..16)
   cluster: fixed(7)
 - name: id
   size: fixed(32)
   population: seq(1..10M)
   cluster: fixed(5)
 - name: status
   size: fixed(10)
   population: uniform(1000..1010)
 - name: amount
   population: uniform(500000..10M)
insert:
  partitions: fixed(1)
  select: fixed(1)/35
queries:
  selectall:
    cql: select * from record where business_date = ? and region = ?
    fields: samerow
ankit