1 Paragraph Response to 2 Classmate’s SPSS Posts: 2 Paragraphs Total
By Day 5
Respond to at least two of your colleagues’ posts and provide a constructive comment on their assessment of diagnostics.
- Were all assumptions tested for?
- Are there some violations that the model might be robust against? Why or why not?
- Explain and provide any additional resources (i.e., web links, articles, etc.) to provide your colleague with addressing diagnostic issues.
Classmate 1 (Natalie):
“Main Question Post – Discussion – Week 10
Variables
The dependent variable using the General Social Survey dataset is “Respondent Socioeconomic Index”. The two independent variables using the General Social Survey dataset are “R’s Occupational Prestige Score” which is measured on an interval/ratio level and “Married” which is the dummy variable measured on a nominal scale.
Research Questions
What is the relationship between the respondent’s occupational prestige score and their marital status of being married, and the respondent’s socioeconomic index?
Null Hypotheses
There is no relationship between the respondent’s occupational prestige score and marital status of being married, and the respondent’s socioeconomic index.
Research design
Although we have looked at building multiple regression models, we have not engaged with the assumptions and issues which are also vital to achieving valid and reliable results. This research design therefore seeks to recode categorical variables to be used in a regression model, and interpret the coefficients. The Model Summary table shows the Durbin-Watson value which provides information on the independence of errors. The Durbin-Watson value of 1.854 is between the values of 0 and 4.0 and indicates that there is no correlation between the residuals. The ANOVA table test the overall significance of the regression model. The p-value is 0.000 which is below the alpha level, therefore the model has statistical significance and the researcher can reject the null hypothesis and state that there is a relationship between the dependent variable and independent variables. The R-square value of .687 indicates there is a strong linear correlation.
Taking a look at the Coefficients output, the Variance Inflation Factor (VIF) requires that values close to 10 and above 10 indicate serious multicollinearity in the model and the independent variables have high levels of correlations between each other. The VIF values of 1.050 are below the 10.0 general rule and the researcher can assume that the assumption was met. The significance level of 0.000 is below the alpha level, therefore reject the null hypothesis and conclude that there is a relationship between the independent variables and dependent variable. The dummy variable was also statistically significant at the p < 0.05 level. The Coefficients output also indicates that being married has a predicted socioeconomic index of 1.806 units more than being widowed, divorced, separated, and never married.
The Residuals Statistics shows Cook’s Distance. Cases where the Cook’s distance is greater than 1 would be problematic. The values of Cook’s Distance are well below the value of 1.0, therefore the researcher can assume there is no undue influence in the model. The P-P Plot shows a bit of deviation from normality between the observed cumulative probabilities of 0.2 and 0.6 but it appears to be minor. There does not appear to be a severe problem with non-normality of residuals. The Scatter Plot shows no discernible pattern with the spread of scatter, and there is a linear relationship. The model does not seemingly violate the assumption of homoscedasticity.
The regression equation is as follows:
Respondent’s Socioeconomic Index = -14.089 + (1.358)Respondent’s occupational prestige score + (1.806)Marital Status: Married
Model Summaryb |
|||||
Model |
R |
R Square |
Adjusted R Square |
Std. Error of the Estimate |
Durbin-Watson |
1 |
.829a |
.687 |
.686 |
12.5464 |
1.854 |
a. Predictors: (Constant), Married, Rs occupational prestige score (2010) |
|||||
b. Dependent Variable: R’s socioeconomic index (2010) |
ANOVAa |
||||||
Model |
Sum of Squares |
df |
Mean Square |
F |
Sig. |
|
1 |
Regression |
836162.266 |
2 |
418081.133 |
2655.966 |
.000b |
Residual |
381566.922 |
2424 |
157.412 |
|||
Total |
1217729.188 |
2426 |
||||
a. Dependent Variable: R’s socioeconomic index (2010) |
||||||
b. Predictors: (Constant), Married, Rs occupational prestige score (2010) |
Coefficientsa |
||||||||
Model |
Unstandardized Coefficients |
Standardized Coefficients |
t |
Sig. |
Collinearity Statistics |
|||
B |
Std. Error |
Beta |
Tolerance |
VIF |
||||
1 |
(Constant) |
-14.089 |
.863 |
-16.319 |
.000 |
|||
Rs occupational prestige score (2010) |
1.358 |
.019 |
.819 |
70.307 |
.000 |
.953 |
1.050 |
|
Married |
1.806 |
.523 |
.040 |
3.451 |
.001 |
.953 |
1.050 |
|
a. Dependent Variable: R’s socioeconomic index (2010) |
Collinearity Diagnosticsa |
||||||
Model |
Dimension |
Eigenvalue |
Condition Index |
Variance Proportions |
||
(Constant) |
Rs occupational prestige score (2010) |
Married |
||||
1 |
1 |
2.561 |
1.000 |
.01 |
.01 |
.06 |
2 |
.395 |
2.547 |
.03 |
.03 |
.94 |
|
3 |
.044 |
7.595 |
.95 |
.96 |
.01 |
|
a. Dependent Variable: R’s socioeconomic index (2010) |
Residuals Statisticsa |
|||||
Minimum |
Maximum |
Mean |
Std. Deviation |
N |
|
Predicted Value |
7.633 |
96.325 |
46.019 |
18.5652 |
2427 |
Std. Predicted Value |
-2.068 |
2.710 |
.000 |
1.000 |
2427 |
Standard Error of Predicted Value |
.347 |
.830 |
.434 |
.081 |
2427 |
Adjusted Predicted Value |
7.600 |
96.338 |
46.018 |
18.5658 |
2427 |
Residual |
-34.7917 |
43.8698 |
.0000 |
12.5412 |
2427 |
Std. Residual |
-2.773 |
3.497 |
.000 |
1.000 |
2427 |
Stud. Residual |
-2.776 |
3.498 |
.000 |
1.000 |
2427 |
Deleted Residual |
-34.8634 |
43.9035 |
.0007 |
12.5558 |
2427 |
Stud. Deleted Residual |
-2.780 |
3.506 |
.000 |
1.001 |
2427 |
Mahal. Distance |
.855 |
9.621 |
1.999 |
1.230 |
2427 |
Cook’s Distance |
.000 |
.008 |
.000 |
.001 |
2427 |
Centered Leverage Value |
.000 |
.004 |
.001 |
.001 |
2427 |
a. Dependent Variable: R’s socioeconomic index (2010)” |
Classmate 2 (Melvin):
“Independent variables (IV1 and IV2) and their Level of Measurement.
The 2 independent variables and their level of measurement “Are YOU A CITIZEN OF AMERICA and Married
Dependent variable and its Level of Measurement.
The Dependent variable and its level of measurement is “Rs occupational prestige score (2010)”
Research Question
What is the relationship between are you a citizen of American, marital status, and respondents occupational prestige score (2010)?
Null Hypothesis
There is no relationship between are you a citizen of American, marital status, and respondents occupational prestige score (2010).
Research Design
This research design is multiple linear regression model using a dummy variable from one categorical variable and more than two groups for our independent variable. Dummy coding a variable means representing each of its values by a separate dichotomous variable. When using dummy variables in multiple linear regression models it is easier to code a categorical variable into multiple dichotomous variables, in which variables take the value of “1” or zero. Dichotomous variables are defined as variables that split or group data into 2 distinctive categories as with the Married; 1 = married and all others = 0. The model summary tables shows the linear regression model summary and the overall fit statistics. We can interpret the model as followed: 7% “of the variability in a respondent’s socioeconomic status index is explained by the combination of are you a citizen of American and marital status. We can conclude since R = .262 (26%) this makes it a weak linear relationship and the R² = .069 (7%), would make it a very weak linear relationship between are you a citizen of American, marital status, and respondents occupational prestige score (2010)? The Durbin-Watson = 1.819, which is between the two critical values of 0 and 4, which “provides us with some information about the independent errors (Laureate Education, Inc., 2016). Thus, we can conclude there is no first order linear autocorrelation in the sample.
The Anova tests the overall significance of the regression model. The Anova’s significance level is 0.000, thus we can reject the null hypothesis and conclude there is a linear relationshipbetween are you a citizen of American, marital status, and respondents occupational prestige score (2010) and both are you a citizen of American and marital status are statistically significant predictors of the respondents occupational prestige score (2010). The information in the coefficients table allows for us to check multicollinearity in our multiple linear regression model. The values should be >0.1 or VIF < 10 for all variables, which our values are 1.000, therefore we can assume that the assumptions has been met. The significance level is 0.000, thus we can reject the null hypothesis and conclude there is a linear relationship between are you a citizen of American, marital status, and respondents occupational prestige score (2010). The Dummy variable is also below 0.05, which indicates statistically significance with the coefficient output married, which predicted respondents occupational prestige score (2010). In the residuals Statistic table the data is .000, thus concluding there is no bias influence on the model. Looking at the normal p-p plot we can check for normality of residuals. The plot shows that the points generally follow the normal diagonal line with no strong deviations, which indicates that the residuals are normally distributed concluding the assumption has been met. The scatter plot give us the information on homoscedasticity and “weather our residuals at each level the predictor are equal in variance” (Laureate Education, Inc., 2016). There is no discernible pattern with the spread of scatter, which concluded our model did not violate heteroscedasticity. Scatter, again, tells us that there is a linear relationship, if not you would have seen a nonlinear relationship, in which the scatter perform a u shaped pattern (Laureate Education, Inc., 2016).
Regression Equation
Rs occupational prestige score (2010) = 47.420+ YOU A CITIZEN OF AMERICA (-6.242) + Married (6.478)
Rs occupational prestige score (2010) = 47.420+ (-6.242) + (6.478)
Variables Entered/Removeda |
||||||||||||
Model |
Variables Entered |
Variables Removed |
Method |
|||||||||
1 |
Married, ARE YOU A CITIZEN OF AMERICA?b |
. |
Enter |
|||||||||
a. Dependent Variable: Rs occupational prestige score (2010) |
||||||||||||
b. All requested variables entered. |
||||||||||||
Model Summaryb |
||||||||||||
Model |
R |
R Square |
Adjusted R Square |
Std. Error of the Estimate |
Durbin-Watson |
|||||||
1 |
.262a |
.069 |
.067 |
13.097 |
1.819 |
|||||||
a. Predictors: (Constant), Married, ARE YOU A CITIZEN OF AMERICA? |
||||||||||||
b. Dependent Variable: Rs occupational prestige score (2010) |
||||||||||||
ANOVAa |
||||||||||||
Model |
Sum of Squares |
df |
Mean Square |
F |
Sig. |
|||||||
1 |
Regression |
15335.223 |
2 |
7667.611 |
44.700 |
.000b |
||||||
Residual |
208416.971 |
1215 |
171.537 |
|||||||||
Total |
223752.194 |
1217 |
||||||||||
a. Dependent Variable: Rs occupational prestige score (2010) |
||||||||||||
b. Predictors: (Constant), Married, ARE YOU A CITIZEN OF AMERICA? |
||||||||||||
Coefficientsa |
|||||||||||||||
Model |
Unstandardized Coefficients |
Standardized Coefficients |
t |
Sig. |
Collinearity Statistics |
||||||||||
B |
Std. Error |
Beta |
Tolerance |
VIF |
|||||||||||
1 |
(Constant) |
47.420 |
1.741 |
27.237 |
.000 |
||||||||||
ARE YOU A CITIZEN OF AMERICA? |
-6.242 |
1.571 |
-.110 |
-3.973 |
.000 |
1.000 |
1.000 |
||||||||
Married |
6.478 |
.755 |
.238 |
8.580 |
.000 |
1.000 |
1.000 |
||||||||
a. Dependent Variable: Rs occupational prestige score (2010) |
|||||||||||||||
Collinearity Diagnosticsa |
|||||||||||||||
Model |
Dimension |
Eigenvalue |
Condition Index |
Variance Proportions |
|||||||||||
(Constant) |
ARE YOU A CITIZEN OF AMERICA? |
Married |
|||||||||||||
1 |
1 |
2.540 |
1.000 |
.01 |
.01 |
.06 |
|||||||||
2 |
.435 |
2.415 |
.01 |
.02 |
.93 |
||||||||||
3 |
.024 |
10.250 |
.98 |
.98 |
.01 |
||||||||||
a. Dependent Variable: Rs occupational prestige score (2010) |
|||||||||||||||
Residuals Statisticsa |
|||||||||||||||
Minimum |
Maximum |
Mean |
Std. Deviation |
N |
|||||||||||
Predicted Value |
34.94 |
47.66 |
43.69 |
3.550 |
1218 |
||||||||||
Residual |
-31.656 |
38.822 |
.000 |
13.086 |
1218 |
||||||||||
Std. Predicted Value |
-2.465 |
1.118 |
.000 |
1.000 |
1218 |
||||||||||
Std. Residual |
-2.417 |
2.964 |
.000 |
.999 |
1218 |
||||||||||
a. Dependent Variable: Rs occupational prestige score (2010) |
“ |