Regression Analysis by SAS

SAS for regression analysis

I used to wonder why so many institutes and companies still use SAS.
In fact, SAS code is quite concise and efficient, and it is sufficient for business analysis.
I have collected the code from my course here.

Part one: introduction

/* SAS stores data sets in a Library, which is referenced by a name (libname) assigned with the libname statement. */
/* If no libname is specified, the data set goes to the temporary library (work). */
libname Reg "/folders/myfolders/sasuser.v94/";
run;   /* Here we define a library named reg. In SAS Studio, we can only load our data into the default path /folders/myfolders. */
/*** out=: identify the output SAS data set. ***/
/*** datafile=: specify the input file. ***/
/*** dbms=dlm: specify that the input file is a delimited file. ***/
/*** replace: replace the data set if it exists. ***/
/*** delimiter='&': specify the delimiter as an & (ampersand). ***/
/*** getnames=yes: generate variable names from the first row of data. ***/
proc import out=sasdata1
             datafile='/folders/myfolders/sasuser.v94/Weight.txt'
             dbms=dlm
             replace;
delimiter='&';
getnames=yes;
run;quit;
/*** A DATA step can also import a data set from an external file, as in the following SAS code. ***/   /* This works in the full version of SAS. */
options replace;
data sasdata12;
infile 'C:\Users\John\Desktop\sasdata.txt' firstobs=2 delimiter='&';
input Region$ State$ Month$ Expenses Revenue;
run;quit;

/***Proc Print***/
proc print data=sasdata1;
run;quit; 
/***Specify the first n rows to be printed***/
proc print data=sasdata1(obs=3);
run;quit; 

/***Proc contents***/
proc Contents data=sasdata1;
run;quit;

/***Proc Means***/
proc means data=sasdata1;
var revenue;
run;quit;
/***We can give other options to the procedure to calculate a number of other quantities such as***/
proc means data=sasdata1 N mean std min q1 median q3 max;
var revenue expenses;
run;quit;
/**** BY statement ***/
    /** To use a BY statement, the data must first be sorted by the variables listed after BY, in the same order. **/
proc sort data=sasdata1 out=sasdata1_sorted;
BY Region;
run;
proc means data=sasdata1_sorted N mean std min q1 median q3 max;
BY Region;
var revenue expenses;
run;quit;
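/*** A possible alternative (a sketch, not part of the original course code): the CLASS statement
     also reports the statistics per Region, and it does not require the data to be sorted first.
     It assumes sasdata1 contains the Region variable used in the BY example. ***/
proc means data=sasdata1 N mean std min q1 median q3 max;
class Region;
var revenue expenses;
run;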
/***Question: Can you sort by some variable in descending order?***/
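/*** One possible answer (a sketch): put the DESCENDING keyword before the variable in the BY statement.
     The output name sasdata1_desc is made up for this example. ***/
proc sort data=sasdata1 out=sasdata1_desc;
BY descending Revenue;
run;
proc print data=sasdata1_desc(obs=3);
run;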

/**Proc reg**/
Proc reg data=sasdata1 plots=none;
model expenses=revenue; /* y ~ x: response on the left, predictor on the right */
run;quit;
/***Construct a data set that includes the fitted values and the
corresponding residuals for each point***/
Proc reg data=sasdata1 plots=none;
model expenses=revenue;
plot expenses*revenue;
output out=expense_out
predicted=fitted
residual=residuals;
run;quit;
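/*** A quick check of the new data set (a sketch): expense_out keeps the original variables and adds
     the fitted and residuals columns named in the OUTPUT statement. ***/
proc print data=expense_out(obs=5);
var expenses revenue fitted residuals;
run;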

Part two: graph

libname part2 "/folders/myfolders/sasuser.v94/";
run;
proc import out=part2.weight
datafile='/folders/myfolders/sasuser.v94/weight.xls'
dbms=xls replace;
/**sheet='sheet1';**/
getnames=yes;
run;quit;
/*1. Plot procedure creates scatter plots*/
proc plot data=part2.weight;
plot weight*height;
run;quit;
   /*Use a character such as * or + as the plotting symbol to make nicer scatter plots*/
proc plot data=part2.weight;
plot weight*height='*';
run;quit;
   /*Give your plot a title and add labels that you define yourself*/
proc plot data=part2.weight;
plot weight*height='o';
title "Simple Linear Regression";
Label weight="child's weight";
Label height="child's height";
run;quit;
/*2. Gplot procedure
     the basic plotting method in SAS
     it produces each plot as a separate graphic
     the plot can be saved as a number of different image file types for later use
     it looks nicer than the plots produced by PROC PLOT.
*/
/**!!!! gplot cannot be used in SAS University Edition**/
proc gplot data=part2.weight;
plot weight*height;
symbol Value=star color=blue;
Title "Simple Linear Regression";
Label weight="child's weight";
Label height="child's height";
run;quit;
/*3. Sgplot procedure
     better than PROC gplot
     creates one or more plots and overlays them on a single set of axes
     creates statistical graphics such as histograms and regression plots, in addition to scatter plots and line plots.
*/
/*ods pdf file="E:\saslab\weight.pdf";*/
proc sgplot data=part2.weight;
scatter x=height y=weight;
Title "Simple Linear Regression";
Label weight="child's weight";
Label height="child's height";
run;quit;
/*ods pdf close;*/
/* ellipse x=height y=weight; Adds a confidence or prediction ellipse to another plot.
   reg x=height y=weight;
*/
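/* A sketch of the overlay described in the comment, with the two statements active: the scatter plot,
   a 95% prediction ellipse and a fitted regression line share one set of axes.
   The options after the slash are choices made for this example. */
proc sgplot data=part2.weight;
scatter x=height y=weight;
ellipse x=height y=weight / type=predicted alpha=0.05;
reg x=height y=weight / nomarkers;
run;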


/*4. ODS graphics: ODS (Output Delivery System) can create output in different formats.
                   Each ODS destination statement creates output for a specific type of viewer.
*/
/*Simple example using ODS graphics. The output is stored in the folder named in the ODS statement (saslab).*/
ods listing gpath="/folders/myfolders/saslab"; 
/*This statement would create a list of png outputs and save them in a subfolder named saslab in myfolders  */
ods graphics on;
proc reg data=part2.weight;
model weight=height;
run;quit;
ods graphics off;
ods listing close;
ods pdf file="/folders/myfolders/saslab/reg.pdf";
 /* This statement creates PDF output and saves it in a file named reg.pdf in the saslab folder under myfolders.
    If you are using a destination, such as PDF or RTF, where images and tabular output are saved together in
    the same file, you can use the FILE= option to tell SAS where to save your output. */

ods graphics on;
proc reg data=part2.weight;
model weight=height;
run;quit;
ods graphics off;
ods pdf close;


ods rtf file="E:\saslab\reg.doc";
ods graphics on;
proc reg data=part2.weight;
model weight=height;
run;quit;
ods graphics off;
ods rtf close;


/*5. proc corr
     This is the procedure to find sample correlation coefficients and to test the null hypothesis
     that the population correlation is 0.
*/
proc corr data=part2.weight;
var weight height;
run;quit;
/*An alternative way that is sometimes useful and gives a more compact output is:*/
proc corr data=part2.weight;
var weight;
with height;
run;quit;
 /*This will correlate each of the variables in the Var statement with each variable in the WITH statement.*/



/*6. proc univariate
     Proc univariate is used not only for statistical analysis, such as descriptive statistics and
     goodness-of-fit tests, but also provides histograms, Q-Q plots, P-P plots and so on.
*/

proc univariate data=part2.weight;
var weight height;
histogram weight height;
run;quit;
/*This procedure will create the histograms of Variables Weight and Height.*/
/*Question: How to get Q-Q plot and P-P plot between Weight and Height?*/
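/* One possible approach (a sketch): PROC UNIVARIATE has QQPLOT and PPPLOT statements. Note that they
   compare each variable with a theoretical distribution (normal here), one plot per variable, rather
   than plotting Weight against Height. NOPRINT only suppresses the tables of statistics. */
proc univariate data=part2.weight noprint;
var weight height;
qqplot weight height / normal(mu=est sigma=est);
ppplot weight height / normal(mu=est sigma=est);
run;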

