Home
The WHAT and HOW of custom WARNINGs is covered in the Code tab of this repository. What follows in the Wiki is a series of applications of this technique.
Suppose you are mapping numeric variable GENDER to character variable SEX. You have the numeric values 1 and 2 in the current cut of data, but you worry that other values (missing, 3, 99) might show up in the next data cut.
data sdtm.dm;
set clinical.demomstr;
if gender = 1 then sex = 'M';
else if gender = 2 then sex = 'F';
else put 'W' 'ARNING: unaccounted for value of ' gender=;
run;
Suppose you have temperature data in degrees Fahrenheit. You might put a range check in your code to capture highly unlikely values.
data _null_;
set clinical.vsmstr;
if not (95 < temp_f < 102) then
put 'W' 'ARNING: suspicious value for ' temp_f=;
run;
Maybe you've got some code that will only work in 9.4.
data _null_;
if &sysver < 9.4 then
put 'W' 'ARNING: minimum SAS version for this program is 9.4.';
run;
Maybe your macro will crash if one of the parameters is missing.
%if %nrbquote(&pos) eq %str() %then
%put %str(W)ARNING: POS is a required parameter;
Maybe your macro will crash if an invalid value is specified for one of the parameters.
%*--- the IN operator requires the MINOPERATOR option ---;
%if ^(%upcase(&pos) in (Y N YES NO)) %then
%put %str(E)RROR: POS [&pos.] must be Y, N, YES, or NO;
Phone calls, meeting reminders, instant messages, etc. The string of interruptions is never-ending! When interruptions occur, preserve your sanity by leaving bread crumbs in your program.
%put %str(W)ARNING: stopped in the middle of <your text here>;
Sometimes you write code that's only meant to be in effect for a short time, after which you want to remove the code.
data _null_;
if today() > '21oct2016'd then
put 'W' 'ARNING: replace dummy treatment assignments with real ones.';
run;
data _null_;
if today() > '21oct2016'd then
put 'W' 'ARNING: analysis cutoff date is no longer valid.';
run;
Sometimes we assume that a list of BY variables is going to uniquely identify records in our dataset. Cover that assumption with a custom WARNING.
DATA test;
set ts1;
by usubjid sid1a;
if not (first.sid1a and last.sid1a) then
put "WARN" "ING: Multiple TS1.SID1A records for " usubjid= sid1a=;
RUN;
In GPLOT you frequently have to hardcode the ORDER= option on the AXISn statement. If the data range grows in a subsequent data cut, you could be clipping data points.
axis1 order=(20 to 40 by 5);
proc sql noprint;
select min(aval), max(aval)
into :minaval, :maxaval
from plotdata
;
quit;
data _null_;
if &minaval < 20 or 40 < &maxaval then
put 'W' 'ARNING: revisit your ORDER= option';
run;
Assigning lengths to character variables involves making assumptions about future values. Protect yourself by checking the lengths of character variables which you know have the potential to contain long strings.
data sdtm.cm;
set clinical.cmmstr;
if length(cmindc) > 200 then
put 'W' 'ARNING: CMINDC > 200 char: ' usubjid= cmindc=;
run;
The first data cut for a clinical trial often involves an empty dataset or two. Rather than try to use the empty dataset in your program, it might be easier to set up a custom WARNING to notify you once the dataset finally gets some records in it.
%macro dataempty(data);
%let numobs = 0;
%let dsid = %sysfunc(open(&data.));
%if &dsid %then %do;
%let numobs = %sysfunc(attrn(&dsid.,nobs));
%let rc = %sysfunc(close(&dsid.));
%end;
%if &numobs > 0 %then
%put %str(W)ARNING: dataset [&data] is not empty;
%mend dataempty;
%dataempty(clinical.deaths);
In the following example we expect all USUBJID values to be present in dataset DM. If we find a USUBJID value that is not in DM, we throw a WARNING.
DATA LB_1;
merge
lab (in=inLB)
sdtm.dm (in=inDM)
;
by usubjid;
if inLB;
if not (inDM) then
put 'WAR' 'NING: Subject not in SDTM.DM dataset ' usubjid=;
RUN;
Sometimes there are two input datasets that are supposed to have the same number or type of records in them (i.e., if a particular USUBJID is in dataset A, that same USUBJID should also be in dataset B). This example shows how to check for this type of consistency between two datasets.
data work.ORQues ;
merge work.ORQ (in=_inORQ)
work.ORQUES_Raw (in=_inORQUES)
;
by USUBJID QSSPID QSREFID ;
inORQUES = _inORQUES ;
if (not _inORQ) and first.QSREFID then
put 'WAR' 'NING: Record in ORQUES does not match ORQ ' /
'WAR' 'NING- ' USUBJID= QSSPID= QSREFID= / ;
if (QUCPYNL eq 'NO') and (inORQUES) and first.QSREFID then
put 'WAR' 'NING: Record with done=NO in ORQ matches records in ORQUES ' /
'WAR' 'NING- ' USUBJID= QSSPID= QSREFID= COMPDF= QUSC= / ;
if (QUCPYNL ne '') or (inORQUES) then output ;
run ;
Writing macros robust enough to handle any type of data is often quite difficult. Often it makes more sense to write custom WARNINGs to make you aware of cases that don't fit your assumptions about "this is what the data should look like". In the following macro, which converts YYYYMMDD dates to ISO8601 format, only the easy cases are handled by the macro, with all "interesting" data resulting in a custom WARNING.
*---------- convert character YYMMDD8. dates to ISO8601 dates ----------;
%macro char2iso(dateVar, outVar);
if &dateVar. ne '' then do ;
%*--- write W@RNING if any non-digit characters ---;
if compress(&dateVar., '', 'kd') ne &dateVar. then
put "WAR" "NING: Character date in &dateVar. contains non-digit characters " &dateVar.= ;
%*--- full date case ---;
else if length(&dateVar.) eq 8 then
&outVar. = put(input(&dateVar., YYMMDD8.), IS8601DA10.) ;
%*--- write W@RNING if unexpected length ---;
else if length(&dateVar.) not in (4, 6) then
put "WAR" "NING: Character date in &dateVar. has unexpected length " &dateVar.= ;
%*--- year (length 4) and year/month (length 6) cases ---;
else &outVar. = catX('-', subStrN(&dateVar., 1, 4), subStrN(&dateVar., 5, 2)) ;
end ;
%mend char2iso;
The duplicate records check is common enough that you will probably want to make a macro out of it. Here is one possible implementation.
%macro dupcheck(data=,var=);
%let csvar = %sysfunc(translate(&var,%str(,),%str( )));
proc sql noprint;
select &csvar
from &data
group by &csvar
having count(*) > 1
;
quit;
%if &sqlobs > 0 %then
%put %str(W)ARNING: obs in [&data] are not uniquely identified by [&var].;
%mend dupcheck;
%dupcheck
(data=sdtm.ae
,var=usubjid aeseq
);
Sometimes you will have partially-missing or ambiguous data that makes it difficult to put records in chronological order. Consider using a custom WARNING to help identify situations where sorting one way (by AESTDTC) does not give the same result as sorting another way (by AESEQ).
data _null_;
set events;
by usubjid aestdtc;
retain lastseq;
if first.usubjid then call missing(lastseq);
if not (aeseq >= lastseq) then
put 'W' 'ARNING: records are out of order: ' / usubjid= / lastseq= / aeseq= / ;
lastseq = aeseq;
run;
Note the use of the / character in this put statement. This character acts as a hard return in the log, putting the various text and variable values on separate lines. This can help with readability.
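To see the effect of / in isolation, here is a minimal sketch using made-up values (the variable values below are hypothetical and do not come from any real dataset). The first put writes everything on one log line; the second uses / to start a new log line before each variable.

```sas
data _null_;
  * hypothetical values, for illustration only;
  usubjid = 'SUBJ-001';
  lastseq = 3;
  aeseq   = 2;
  * everything on a single log line;
  put 'W' 'ARNING: records are out of order: ' usubjid= lastseq= aeseq=;
  * each / moves the pointer to the start of the next log line;
  put 'W' 'ARNING: records are out of order: ' / usubjid= / lastseq= / aeseq=;
run;
```

Only the first line written by the second put starts with the WARNING keyword. If you want the continuation lines flagged as well, the ORQUES merge example earlier on this page shows one option: prefixing them with a WARNING- string.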
Next page: Minutiae