1

I wrote the following script, see below, that is supposed to drop columns in my table that has around 70000 variables.

I've only given you a tiny part of my SQL script in the condition. Usually after all the conditions are put, the query will retrieve me 6000 variables-ish * 6.

The macro variable droplist_u is too small to keep everything. Do you have an alternative to using the macrovariable and avoid the error "expression length (65535) exceeds maximum length (65534)"?

Edited: Once I fetch all the variables in droplist, I'm converting it to a macrovariable with that command line %put droplist_u: &droplist_u;

%let pays1=IK ;

%do nopays=1 %to 1;

%do l=0 %to 5;

/*one off*/
proc sql;
select catt(nomvar,"_t&&l") into : droplist_u separated by ' ' from dico&&pays&nopays
where (intitule like 'RT1%' or intitule like '%RT12%'
or intitule like 'IS-PPI%'
or intitule like 'IS-IP%'
or intitule like 'IS-IP-SA%'
or intitule like 'IS-IMPR%'
or intitule like 'IS-IMPX%'
or intitule like 'IS-IMPZ%'
or intitule like 'B-E36%'
or intitule like 'B-D_F%'
or intitule like 'B-D%'
or intitule like 'B_C%'
or intitule like 'MIG_ING%'
or intitule like 'MIG_NRG%'
or intitule like 'MIG_CAG%'
or intitule like 'MIG_COG%'
or intitule like 'MIG_DCOG%'
or intitule like '%C191%'
or intitule like '%C192%'
or intitule like '%C20_C21%'
or intitule like '%C20%'
or intitule like '%C201%'
or intitule like '%C2011%'
or intitule like '%C2012%'
or intitule like '%C2013%'
or intitule like '%C2014%'
or intitule like '%C2015%'
or intitule like '%C2016%'
or intitule like '%C2017%'
or intitule like '%C202%'
or intitule like '%C203%'
or intitule like '%C204%'
or intitule like '%C2041%'
or intitule like '%C2042%'
or intitule like '%C205%'
or intitule like '%C2051%'
or intitule like '%C2052%'
or intitule like '%C2053%'
or intitule like '%C2059%'
or intitule like '%C206%'
or intitule like '%C21%'
or intitule like '%C211%'
or intitule like '%PCH_SM%');
quit;

%put droplist_u: &droplist_u;

data a&&pays&nopays;
set a&&pays&nopays (drop=&droplist_u);
run;

%end;
%end;
%mend;


%testsql;
run;`
Andy K
  • 4,944
  • 10
  • 53
  • 82

2 Answers2

2

You could try writing your datastep (code) to a text file, then %include ing that text file - see pseudo code below (untested):

proc sql;
create table vars as
select catt(nomvar,"_t&&l") as var 
from dico&&pays&nopays
where (intitule like 'RT1%' or intitule like '%RT12%'
or intitule like 'IS-PPI%'
xxx
xxx
xxx
;quit;

filename tempf temp;
data _null_;
   file tempf;
   set vars end=lastobs;
   if _n_=1 then put "data a&&pays&nopays;set a&&pays&nopays (drop=";
   put var /;
   if lastobs then put ");run;";
run;

%inc tempf;

filename tempf clear;

Be aware that you may hit memory limits by dealing with such large datasets (depending on your operating environment obviously)

Community
  • 1
  • 1
Allan Bowe
  • 12,306
  • 19
  • 75
  • 124
  • Hi RawFocus, thanks for your answer. Before going further, do you understand the objective of my script? – Andy K Jun 03 '14 at 10:58
  • You are retrieving the variables you want to drop, then dropping those variables using a datastep set statement as part of a macro loop. I edited my post - do I understand you, and do you understand me? – Allan Bowe Jun 03 '14 at 11:04
  • Yes, indeed you understand me. Sorry because sometimes, I feel I'm not clear enough so I clarify. :) – Andy K Jun 03 '14 at 11:06
  • NP. Does my answer make sense though? We are basically using a temporary text file instead of a macro variable.. – Allan Bowe Jun 03 '14 at 11:08
  • I think I got it. However the separated is not the best solution when you are creating a table with sas. That one should be better `proc sql; create table vars as select catt(nomvar,"_test")||' ' as var from dicofr where (intitule like 'RT1%' or intitule like '%RT12%' or intitule like 'IS-PPI%') ;quit;` – Andy K Jun 03 '14 at 11:13
  • That was a mistake, I'd already fixed it (sorry). Although the code here is just a suggestion to explain the approach.. (for an alternative to a macro variable) – Allan Bowe Jun 03 '14 at 11:27
  • 1
    No worries, @RawFocus. It was only a gentle advice. – Andy K Jun 03 '14 at 12:43
  • One final point. You mention that you have 70,000 variables, and need to keep around 6,000. For this reason, it may be more efficient to use a keep statement instead of a drop.. – Allan Bowe Jun 04 '14 at 12:47
  • Hi RawFocus, why? The keep from my humble sas knowledge is to keep the variable, if I'm correct? – Andy K Jun 04 '14 at 13:40
  • Well, obviously the query would need to be refactored (or additional logic added) to identify the variables that need to be kept. You would also want to perform tests to compare the performance. However from my perspective, it seems like less overhead to "keep" 6000 variables than to "drop" 64000. Is just a thought! – Allan Bowe Jun 04 '14 at 15:23
  • 1
    After long discussion, I will do the other way e.g. to discard the unwanted variables rather than keeping the wanted one. Why? I have an idea on which one to discard but the ones are in fluctuating state , not knowing if we keep some of them. Bit brain dead after playing with SAS combine with SQL. Cheers again, @RawFocus. – Andy K Jun 05 '14 at 15:49
0

What can create a seperate macro variable for each column you want to drop with a sequence number in it and use a loop to put them all in the keep.

%macro dropColumns;
%let pays1=IK ;
%do nopays=1 %to 1;
data _null_;
 set dico&&pays&nopays 
 (where=(intitule like 'RT1%' 
  ...
  or intitule like '%PCH_SM%')) 
 end=last;
 %do l=0 %to 5;
  count + 1;
  call symput(cats('todelete',count),catt(nomvar,"_t&&l"));
 %end;
 if last then call symput('todeleteobs',count);
run;

data a&&pays&nopays;
 data a&&pays&nopays (drop=
 %do i = 1 %to &todeleteobs.;
  &&todelete&i.
 %end;
 );
run;

%end;
%mend;

%dropColumns;
Laurent de Walick
  • 2,154
  • 14
  • 11
  • I'll have a look. Merci Laurent. – Andy K Jun 03 '14 at 12:42
  • this would also work - so instead of one large macro variable, you have several thousand smaller ones. And, as rightfully implied, those macro variables ought to be deleted once they are finished with.. Although, presuming they have local scope (not declared globally), they would be erased anyway after the macro has executed. – Allan Bowe Jun 03 '14 at 12:57