The SQL "partition by" translates to ECL as the GROUP() function, and ranking in ECL is easily done using PROJECT's COUNTER within each group.
I found an SQL example of this here and re-wrote that example in ECL:
rec := RECORD
UNSIGNED1 id;
STRING15 name;
STRING10 category;
UDECIMAL6_2 price;
END;
ds := DATASET([{105,'Country Living','lifestyle',1.70},
{108,'News Magazine','news',3.35},
{115,'Decibel','music',6.50},
{123,'Drum Magazine','music',6.50},
{145,'Sunset','lifestyle',12.00},
{155,'World','lifestyle',9.50},
{158,'Keyboard','music',8.45}],rec);
sds := SORT(ds,category,-price);
gds := GROUP(sds,category); //creates a subgroup for each category
//each subsequent operation on GROUPed data
// executes separately and independently on each subgroup
RankInPartition := PROJECT(gds,
TRANSFORM({gds,UNSIGNED1 price_rank},
SELF.price_rank := COUNTER,
SELF := LEFT));
OUTPUT(RankInPartition,{category,name,price,price_rank});
You could also use ITERATE to do this, like this:
sds := SORT(ds,category,-price);
tds := TABLE(sds,{sds,price_rank := 0});
RankInPartition := ITERATE(tds,
TRANSFORM(RECORDOF(tds),
SELF.price_rank :=
IF(LEFT.category=RIGHT.category,LEFT.price_rank+1,1),
SELF := RIGHT));
but this would be a global operation, and on huge datasets (we are working with "Big Data" here) it would likely be quite slow compared to the GROUP solution.
For the GROUP solution, each subsequent operation executes on each subgroup separately and independently. That means that if you have 40 nodes, at least 40 separate subgroups are executing in parallel for each grouped operation.
So in this example, the PROJECT ranks each subgroup separately and independently, adding the work field (price_rank) at the same time. But the ITERATE version requires a TABLE to add the new work field (price_rank) before ITERATE can work. And both of those operations are global.
HTH,
Richard