Wednesday, July 18, 2012

Difference between two consecutive rows in SAS



Dataset is shown on right side. Here Visit, Time and Page are given and we need to calculate the duration by subtracting two consecutive rows.

E.g
Duration of Page "A" in Visit 1 is 23:39:23 - 23:34:10.
For Calculating Time difference we first need to convert this into time variable.

data libname.dataset;
set libname.dataset;
time1 = hms (scan(time,1,':'), scan(time,2,':'), scan(time,3,':');
run;

Now this time can be use for calculation. Now, Try the  macro below for calculation of duration. For Difference calculation I am using proc sql .

Note: Duration of last page of every visit is marked as Zero (as it can't be calculated).


Copy the macro below in file and for using type
%include "<filename complete path>";

%time_calc( libname,input_dataset,output_dataset,visit,time1,Duration);

Macro


/* Macro to calculate Page duration */
/* Sort the table based on Routes and Time in Ascending order*/
/* Note last page duration can't be calculated and is marked as zero*/

%macro time_calc(
libname /*libname */,
dataset_in /* input dataset*/,
dataset_out /*output dataset*/,
routes /*route or visitor*/,
time /* time of Page visit should be in number or time format*/,
Duration /*Column will be created for each page duration*/
);
data &libname..&dataset_out;
set &libname..&dataset_in;
by &routes &time;
id = _N_;
&Duration = 0;
run;
proc sql noprint;
select max(&routes) into :No_term
from &libname..&dataset_out ;
quit;
/*%let No_term = 100;*/
%do x = 1 %to &No_term;
/*%let x = ;*/
/*%let n1 = 1;*/
proc sql noprint;
select count(*) into :i
from &libname..&dataset_out where &routes = &x;
quit;
proc sql noprint;
select min( id ) into :n2
from &libname..&dataset_out where &routes = &x;
quit;

%do n1 = 1 %to &i;
/*%let n2 = 2;*/
/*%put &n1;*/
proc sql noprint;
select &time into :var1
from &libname..&dataset_out where &routes = &x and id = &n2 + &n1 -1;
select &time into :var2
from &libname..&dataset_out where routes = &x and id = &n2+&n1;
update &libname..&dataset_out set &Duration = &var2 - &var1 where &routes = &x and id =&n2+ &n1 -1;
quit;
%end;
%end;
run;
%mend time_calc;
);





















































































No comments:

Post a Comment