For large datasets, the koalas.head(n) function takes a really long time. My understanding is that it tries to bring all the data back to the driver node and then returns the absolute top n rows.
Is there a quick way to analyse n rows in Koalas such that only one or a few partitions are involved? I do not necessarily need the absolute first n rows; they can be spread across different executor nodes or all reside within the same partition.