SAS for regression analysis
I used to wonder why so many institutes and companies still use SAS.
In fact, SAS code is concise and efficient, and it is sufficient for business analysis.
I collect the code from my course here.
Part one: introduction
/* A SAS library is referenced by a library name (libname); use the LIBNAME statement to define one. */
/* If no libname is given, a data set is stored in the temporary library (work). */
libname Reg "/folders/myfolders/sasuser.v94/";
run; /* Here we define a library named Reg. In SAS Studio, data can only be loaded from the default path /folders/myfolders. */
/*** Identify the output SAS data set.***/
/*** Specify the input file.***/
/*** Specify the input file is a delimited file.***/
/*** Replace the data set if it exists.***/
/*** Specify delimiter as an & (ampersand).***/
/*** Generate variable names from first row of data.***/
proc import out=sasdata1
datafile='/folders/myfolders/sasuser.v94/Weight.txt'
dbms=dlm
replace;
delimiter='&';
getnames=yes;
run;quit;
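/* A minimal sketch, not part of the original course code: PROC IMPORT above writes sasdata1 to the temporary WORK library, so the data set is gone when the session ends. Assuming the Reg library defined earlier is writable, a two-level name keeps a permanent copy. */
data Reg.sasdata1;
set work.sasdata1;
run;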
/*** A DATA step can also import a data set from an external file; the SAS code is as follows. This approach works in a full (desktop) SAS installation. ***/
options replace;
data sasdata12;
infile 'C:\Users\John\Desktop\sasdata.txt' firstobs=2 delimiter='&';
input Region$ State$ Month$ Expenses Revenue;
run;quit;
/***Proc Print***/
proc print data=sasdata1;
run;quit;
/***Specify the first n rows to be printed***/
proc print data=sasdata1(obs=3);
run;quit;
/***Proc contents***/
proc Contents data=sasdata1;
run;quit;
/***Proc Means***/
proc means data=sasdata1;
var revenue;
run;quit;
/***We can pass statistic keywords to the procedure to calculate a number of other quantities, for example:***/
proc means data=sasdata1 N mean std min q1 median q3 max;
var revenue expenses;
run;quit;
/****BY statement**/
/** To use a BY statement, the data must first be sorted by the variables listed after BY, in the same order ***/
proc sort data=sasdata1 out=sasdata1_sorted;
BY Region;
run;
proc means data=sasdata1_sorted N mean std min q1 median q3 max;
BY Region;
var revenue expenses;
run;quit;
/***Question: Can you sort by a variable in descending order?***/
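/* One possible answer, as a sketch: the DESCENDING keyword goes before the variable it applies to in the BY statement. The output name sasdata1_desc is made up for illustration. */
proc sort data=sasdata1 out=sasdata1_desc;
BY descending Revenue;
run;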
/**Proc reg**/
Proc reg data=sasdata1 plots=none;
model expenses=revenue; /* y ~ x */
run;quit;
/***construct a dataset that includes such things as fitted values and the corresponding residuals for each point***/
Proc reg data=sasdata1 plots=none;
model expenses=revenue;
plot expenses*revenue;
output out=expense_out
predicted=fitted
residual=residuals;
run;quit;
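/* A minimal sketch for checking the fit, assuming the expense_out data set created above with its fitted and residuals variables: plot the residuals against the fitted values, with a reference line at zero. */
proc sgplot data=expense_out;
scatter x=fitted y=residuals;
refline 0 / axis=y;
run;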
Part two: graph
libname part2 "/folders/myfolders/sasuser.v94/";
run;
proc import out=part2.weight
datafile='/folders/myfolders/sasuser.v94/weight.xls'
dbms=xls replace;
/**sheet='sheet1';**/
getnames=yes;
run;quit;
/*1. Plot procedure creates scatter plots*/
proc plot data=part2.weight;
plot weight*height;
run;quit;
/* Use a character such as * or + as the plotting symbol to make nicer plots */
proc plot data=part2.weight;
plot weight*height='*';
run;quit;
/*Give your plot a title and add your own variable labels*/
proc plot data=part2.weight;
plot weight*height='o';
title "Simple Linear Regression";
Label weight="child's weight";
Label height="child's height";
run;quit;
/*2. Gplot procedure
the basic plotting method in SAS
it produces each plot as a separate graphic
the plot can be saved as a number of different image file types for later use
it looks nicer than plots produced by PROC PLOT.
*/
/** Note: PROC GPLOT cannot be used in SAS University Edition **/
proc gplot data=part2.weight;
plot weight*height;
symbol Value=star color=blue;
Title "Simple Linear Regression";
Label weight="child's weight";
Label height="child's height";
run;quit;
/*3. Sgplot procedure
better than PROC GPLOT
creates one or more plots and overlays them on a single set of axes
creates statistical graphics such as histograms and regression plots, in addition to scatter plots and line plots.
*/
/*ods pdf file="E:\saslab\weight.pdf";*/
proc sgplot data=part2.weight;
scatter x=height y=weight;
Title "Simple Linear Regression";
Label weight="child's weight";
Label height="child's height";
run;quit;
/*ods pdf close;*/
/* Other statements that can be added inside PROC SGPLOT:
ellipse x=height y=weight; adds a confidence or prediction ellipse to another plot.
reg x=height y=weight; adds a fitted regression line.
*/
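/* A sketch of how the ELLIPSE and REG statements could be overlaid on the scatter plot, assuming the same part2.weight data set. NOMARKERS stops REG from drawing the points a second time. */
proc sgplot data=part2.weight;
scatter x=height y=weight;
reg x=height y=weight / nomarkers;
ellipse x=height y=weight;
run;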
/*4. ODS graphics; ODS(Output Delivery System) can create output in different formats.
each ODS destination statement creates output for a specific type of viewer.
*/
/*Simple example using ODS Graphics. The graphs are written as image files to the folder given by GPATH=.*/
ods listing gpath="/folders/myfolders/saslab";
/*This statement would create a list of png outputs and save them in a subfolder named saslab in myfolders */
ods graphics on;
proc reg data=part2.weight;
model weight=height;
run;quit;
ods graphics off;
ods listing close;
ods pdf file="/folders/myfolders/saslab/reg.pdf";
/*This statement creates PDF output and saves it in a file named reg.pdf in a folder named saslab under myfolders.
If you are using a destination such as PDF or RTF, where images and tabular output are saved together in
the same file, you can use the FILE= option to tell SAS where to save your output. */
ods graphics on;
proc reg data=part2.weight;
model weight=height;
run;quit;
ods graphics off;
ods pdf close;
ods rtf file="E:\saslab\reg.doc";
ods graphics on;
proc reg data=part2.weight;
model weight=height;
run;quit;
ods graphics off;
ods rtf close;
/*5. proc corr
This is the procedure to find sample correlation coefficients and to test the null hypothesis
that the population correlation is 0.
*/
proc corr data=part2.weight;
var weight height;
run;quit;
/*An alternative way that is sometimes useful and gives a more compact output is:*/
proc corr data=part2.weight;
var weight;
with height;
run;quit;
/*This will correlate each of the variables in the Var statement with each variable in the WITH statement.*/
/*6. proc univariate
PROC UNIVARIATE is used not only for statistical analysis, such as descriptive statistics and goodness-of-fit
tests, but also provides histograms, Q-Q plots, P-P plots and so on.
*/
proc univariate data=part2.weight;
var weight height;
histogram weight height;
run;quit;
/*This procedure will create the histograms of Variables Weight and Height.*/
/*Question: How do you get Q-Q plots and P-P plots for Weight and Height?*/
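/* One possible answer, as a sketch: the QQPLOT and PPPLOT statements compare each variable with a theoretical distribution, assumed here to be the normal. MU=EST and SIGMA=EST ask for a reference line based on the sample estimates. */
proc univariate data=part2.weight;
var weight height;
qqplot weight height / normal(mu=est sigma=est);
ppplot weight height / normal;
run;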