I have a question about the SAS code below. I am new to arrays and am not sure exactly what the code is doing. My understanding is that there are two indices, and I believe the step dedupes the SAS data set by those two indices, but I am not certain. Thanks for your help!

data unix.txn_match_part_four_01;
    set unix.txn_match_part_four_00;
    format id_one1-id_one95000 BEST12. id_two1-id_two95000 BEST12.;

    array id_one{95000} id_one1-id_one95000;
    array id_two{95000} id_two1-id_two95000;

    retain id_one1-id_one95000;
    retain id_two1-id_two95000;

    if _n_ = 1 then i = 1;
    else i + 1;

    do j = 1 to i;
        if clm_idx = id_one{j} then delete;
    end;

    do k = 1 to i;
        if txn_idx = id_two{k} then delete;
    end;

    id_one{i}=clm_idx;
    id_two{i}=txn_idx;

run;
Jean-François Fabre
  • No other comments? Do you have examples of logs where it was used? At first glance it looks like an attempt to transpose the data, but that would normally limit the number of observations written. – Tom Apr 09 '18 at 20:44
  • It does appear to be deduping on either claim or transaction index. The result is a set of distinct items paired. For some other reason the row in which such pairing occurs is also being tracked in a rather large 95,000 item array, which makes for a wide data set (190,002 + # satellite variables). If the number of rows is 95,000, it may be doing some sort of matrix'ified bookkeeping (vector to square) – Richard Apr 10 '18 at 01:12
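
To make the deduping reading above concrete, here is a minimal sketch of the row-by-row logic. It is written in Python rather than SAS purely as a model of the data step, and it rests on assumptions: the function name `dedupe_pairs` is hypothetical, `clm_idx` and `txn_idx` are taken to be numeric (the BEST12. format suggests so), a missing value is represented as `None`, and two sets stand in for the retained `id_one` and `id_two` arrays. Because slot `i` of each array is still missing when the loops compare against it, a row whose key is an ordinary missing value appears to be deleted as well; the sketch mirrors that reading.

```python
def dedupe_pairs(rows):
    """Model of the data step: keep a row only if neither key was seen in an earlier kept row."""
    seen_clm = set()  # stands in for the retained id_one array
    seen_txn = set()  # stands in for the retained id_two array
    kept = []
    for clm_idx, txn_idx in rows:
        # Assumption: in the SAS step a missing key equals the still-empty
        # array slot i, so such rows are deleted.
        if clm_idx is None or txn_idx is None:
            continue
        # Corresponds to the two DO loops with "then delete".
        if clm_idx in seen_clm or txn_idx in seen_txn:
            continue
        # Only rows that survive both checks record their keys,
        # as in id_one{i}=clm_idx; id_two{i}=txn_idx;
        seen_clm.add(clm_idx)
        seen_txn.add(txn_idx)
        kept.append((clm_idx, txn_idx))
    return kept


rows = [(1, 10), (1, 11), (2, 10), (2, 11), (None, 12), (3, 12)]
print(dedupe_pairs(rows))  # [(1, 10), (2, 11), (3, 12)]
```

Note that (2, 11) survives in this example: the earlier rows (1, 11) and (2, 10) were deleted before their keys were stored, so neither 2 nor 11 counts as seen. The output is therefore a set of pairs in which each claim index and each transaction index appears at most once, and the result depends on the input row order.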
