R-squared, often written as r², is a measure of how well a linear regression model fits a dataset.
This value represents the proportion of the variance in the response variable that can be explained by the predictor variable.
The value for r² can range from 0 to 1:
- A value of 0 indicates that the response variable cannot be explained by the predictor variable at all.
- A value of 1 indicates that the response variable can be perfectly explained without error by the predictor variable.
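For reference, these two endpoints follow from the usual definition of R-squared, sketched below. The symbols are notation introduced here rather than anything taken from SAS output: y_i is an observed response, ŷ_i is the value fitted by the model, ȳ is the mean of the observed responses, and n is the number of observations.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

If the fitted values match the observations exactly, the numerator is 0 and R-squared equals 1. If the model predicts no better than the mean response, the numerator equals the denominator and R-squared equals 0.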
The following step-by-step example shows how to calculate the R-squared value for a simple linear regression model in SAS.
Step 1: Create the Data
For this example, we’ll create a dataset that contains the total hours studied and final exam score for 15 students.
We’ll fit a simple linear regression model using hours as the predictor variable and score as the response variable.
The following code shows how to create this dataset in SAS:
/*create dataset*/
data exam_data;
    input hours score;
    datalines;
1 64
2 66
4 76
5 73
5 74
6 81
6 83
7 82
8 80
10 88
11 84
11 82
12 91
12 93
14 89
;
run;

/*view dataset*/
proc print data=exam_data;
run;
Step 2: Fit the Simple Linear Regression Model
Next, we’ll use proc reg to fit the simple linear regression model:
/*fit simple linear regression model*/
proc reg data=exam_data;
    model score = hours;
run;
Notice that the R-squared value in the output is 0.8310.
This means 83.1% of the variation in exam scores can be explained by the number of hours studied.
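As a cross-check, recall that for a model with a single predictor, R-squared equals the square of the Pearson correlation between the predictor and the response. The following is a minimal sketch of that check, assuming the exam_data dataset from Step 1 exists. The dataset names corr_out and rsq_check are hypothetical names chosen for illustration.

```sas
/*compute the Pearson correlation and save it to a dataset (corr_out is an illustrative name)*/
proc corr data=exam_data noprint outp=corr_out;
    var hours score;
run;

/*keep the hours-score correlation and square it*/
data rsq_check;
    set corr_out;
    where _TYPE_ = 'CORR' and _NAME_ = 'hours';
    r_squared = score**2;
    keep r_squared;
run;

/*view the squared correlation*/
proc print data=rsq_check;
run;
```

The printed value should agree with the R-squared value reported by proc reg, about 0.831.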
Step 3: Extract R-Squared Value of Regression Model
If you only want to view the R-squared value of this model and none of the other output results, you can use the following code:
/*fit simple linear regression model*/
proc reg data=exam_data outest=outest noprint;
    model score = hours / rsquare;
run;
quit;

/*print R-squared value of model*/
proc print data=outest;
    var _RSQ_;
run;
Notice that only the R-squared value of 0.83098 is shown in the output.
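Note: The noprint option in proc reg tells SAS not to print the full regression output as it did in the previous step. The outest= option writes the model estimates to a dataset (here also named outest), and the rsquare option in the model statement adds the R-squared value to that dataset as the variable _RSQ_.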
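If the R-squared value is needed later in a program rather than only on screen, one option is to copy it into a macro variable. The following is a minimal sketch, assuming the outest dataset created above exists. The macro variable name rsq is a hypothetical choice for illustration.

```sas
/*store the R-squared value in a macro variable (the name rsq is illustrative)*/
proc sql noprint;
    select _RSQ_ into :rsq trimmed
    from outest;
quit;

/*write the stored value to the SAS log*/
%put R-squared = &rsq;
```

The value can then be referenced as &rsq in later steps, for example in a title statement.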