8

In many mainstream programming languages you can use Boolean data-type (for instance, value can be either true or false) - to represent binary "true/false".

Is there a Boolean data-type in SAS too?

For example, in this code, the variable is_fruit is meant to represent a binary true (1) or false (0) situation. As I am not aware of any Boolean data-types (or length types), I work around this using numeric:

data is_fruit;
    length fruit_name $ 40 is_fruit 8.;
    input fruit_name $ is_fruit;
    datalines;
apple 1
orange 1
car 0
tree 0
chicken 0
peach 1
mango 1
human 0
;

run;

Output:

enter image description here

enter image description here

To me this is not a very elegant data structure as the variable is_fruit is really a Boolean data-type (not numeric nor character). So my question is again...

Is there a Boolean data-type (or length type) for the is_fruit above?

Atlas7
  • 2,726
  • 4
  • 27
  • 36

2 Answers2

10

Base SAS has only two datatypes: Numeric, and Character. Numeric has length 3 bytes to 8 bytes possible, character has any length (1 byte or more).

Boolean expressions in SAS are equivalent to numeric values; a 0 or Missing (Null) value is "False", any other value (negative or positive) is "True". Assigning the result of a boolean expression in SAS to a value results in 1 for True and 0 for False.

Boolean values can be stored in a 3 length Numeric safely, or can be converted to Character if space is truly a concern. SAS is optimized for 8 byte numerics, though, and other than storage space it doesn't really help to shorten the numeric (internally during processing it will be stored in 8 bytes of RAM).

Joe
  • 62,789
  • 6
  • 49
  • 67
8

As @Joe explained, there are only 2 types of data in SAS, numeric and character.

Whenever encountering a logical expression SAS will produce values 1 for True and 0 for False.

Whenever using a numerical value as a condition, only 0 and . are interpreted as False; every other equates to True.

Examples of logical expressions:

* Parentheses are optional but make things clearer;

data logicalTest;
  a = (1 < 2);      * results in a = 1;
  b = (1 > 2);      * results in b = 0;
  c = (-10e20 > .); * results in c = 1;
  d = (. = .);      * results in d = 1;
run;

As you can see, missing values are treated as numeric, and will always compare smaller than any number. So you need to be cautious comparing values when missing data are involved.


Using constraints

If this is your need, you could always add constraints to a table so that one or more columns can only accept values 0 or 1. For instance:

PROC SQL;
  create table logical 
   (id char (12) format=$12.,
    bool num format=1.);

  alter table logical
    add constraint boolean check(bool in (.,0,1))
      message = "Variable bool can only take values 0, 1, or null.";
QUIT; 

Now if we try to insert incompatible data (last row in the example below), we'll get an error message.

PROC SQL;
  insert into logical values ("ABC", 1);
  insert into logical values ("DEF", 0);
  insert into logical values ("GHI", .);
  insert into logical values ("JKL", null);
  insert into logical values ("JKL", 2);
QUIT;

Log shows:

27         PROC SQL;
28           insert into logical values ("ABC", 1);
NOTE: 1 row was inserted into WORK.LOGICAL.

(...)

32           insert into logical values ("JKL", 2);
ERROR: Variable bool can only take values 0, 1, or null. Add/Update failed for data set WORK.LOGICAL because data value(s) do not 
       comply with integrity constraint boolean.
NOTE: This insert failed while attempting to add data from VALUES clause 1 to the data set.
NOTE: Deleting the successful inserts before error noted above to restore table to a consistent state.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.

Using formats

You could also go further and add a format to the bool variable:

PROC FORMAT;
  VALUE boolean 0 = "False"
                1 = "True"
                . = "Unknown";
QUIT;

PROC SQL;
  create table logical 
   (id char (12) format=$12.,
    bool num format=boolean.);

  alter table logical
    add constraint boolean check(bool in (.,0,1))
      message = "Variable bool can only take values 0, 1, or null.";
QUIT; 

PROC SQL;
  insert into logical values ("ABC", 1);
  insert into logical values ("DEF", 0);
  insert into logical values ("GHI", .);
  insert into logical values ("JKL", null);
QUIT;

"Boolean" field with format

Dominic Comtois
  • 10,230
  • 1
  • 39
  • 61