I want to sort data in SAS data step. What exactly I mean is: the work of proc sort should be done in data step. Is there any solution?
-
Would love to know why you would want to do such a thing. – Robert Penridge Dec 18 '17 at 21:45
-
Just got idea. There is no need to do like this in complicated way. But out of enthusiasm and anxiety I want to know: is it possible in data step. I searched in google but I didn't find any answer so I posted here. – Saran Dec 19 '17 at 07:24
3 Answers
If you're looking for a data-step only solution, you can do the work of PROC SORT
with a hash table. The caveat is that you need enough memory to do it.
If you want to do a simple sort, you would load the hash table with the ordered:'yes'
option and output it to a new table. By default, ordered:yes
will sort the data in ascending order. You can specify descending
as well.
Simple Sort
data _null_;
/* Sets up PDV without loading the table */
if(0) then set sashelp.class;
/* Load sashelp.class into memory ordered by Height. Do not remove duplicates. */
dcl hash sortit(dataset:'sashelp.class', ordered:'yes', multidata:'yes');
sortit.defineKey('Height'); * Order by height;
sortit.defineData(all:'yes'); * Keep all variables in the output dataset;
sortit.defineDone();
/* Output to a dataset called class_sorted */
sortit.Output(dataset:'class_sorted');
run;
De-duping
To remove duplicates, do the exact same operation, except remove the multidata
option. In the below table, observations (8, 9) and (15, 16) are duplicates of each other. Observations 9 and 16 will be eliminated.
data _null_;
/* Sets up PDV without loading the table */
if(0) then set sashelp.class;
/* Load sashelp.class into memory ordered by Height. Do not keep duplicates. */
dcl hash sortit(dataset:'sashelp.class', ordered:'yes');
sortit.defineKey('Height'); * Order by height;
sortit.defineData(all:'yes'); * Keep all variables in the output dataset;
sortit.defineDone();
/* Output to a dataset called class_sorted */
sortit.Output(dataset:'class_sorted');
run;

- 10,597
- 1
- 12
- 21
Stu beat me to it, but provided that your dataset contains a unique key, and you can fit the whole thing in memory, you can use a hash sort, e.g.:
data _null_;
if 0 then set sashelp.class;
declare hash h(dataset:"sashelp.class",ordered:"a");
rc = h.definekey("age","sex","name");
rc = h.definedata(ALL:'yes');
rc = h.definedone();
rc = h.output(dataset:"class_sorted");
stop;
run;
If you are really determined to avoid using any built-in sort methods, a particularly silly approach is to load the whole dataset into a series of temporary arrays, sort the arrays using a hand-coded algorithm, and export again:
https://codereview.stackexchange.com/questions/79952/quicksort-in-sas-for-sorting-datasets

- 9,501
- 2
- 24
- 35
There is solution in use proc ds2.
/*Just prepare dataset, because DS2 responds with an error on libraries like sashelp. */
data sql_prep;
set sashelp.class;
run;
/*Delete test dataset before ds2 func, to avoid errors*/
proc datasets nodetails nolist;
delete test;
run;
proc ds2;
data test;
method run();
set {select * from sql_prep order by Weight};
end;
enddata;
run;
quit;
More info about ds2 errors with sashelp libs.
Appendix into ds2 docs, about sql in ds2.

- 716
- 1
- 10
- 25