Comparison of Theil’s and Simple Regression on Normal and Non-Normal Data Sets with Different Sample Sizes

This paper compares Theil’s regression and simple (OLS) regression on normal and non-normal data sets with different sample sizes. The data were collected in a real-life practical experiment conducted by the researchers in their homes, recording the weight of a soap and the number of days it had been used. The dependent variable (y) is the weight of the soap in grams and the independent variable (x) is the number of days. To assess the efficiency of one method relative to the other, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the Mean Square Error (MSE) were used. The analysis revealed a significant relationship between the dependent and independent variables for both the parametric OLS regression and the non-parametric Theil’s regression, whether or not the residual normality assumption holds. The relationship is inverse: as the number of days increases, the weight of the soap decreases. The parametric OLS regression performs better than the non-parametric Theil’s regression, since its residual standard error, AIC, and BIC values are all smaller for both the normal and non-normal real data. The results from the real-life data were used to simulate data with sample sizes of n = 30, 50, 100, 150, 200, 400, 500, 700, 900, 1000, and 1500; the simulations likewise showed that the parametric OLS regression performs better than the non-parametric Theil’s regression, since its error mean square (EMS), AIC, and BIC values are all smaller. The regression line gave a good fit to the observed data, since it explains over 99% of the total variation of the y values around their mean for both models. Although both models fit well in this study, OLS is more efficient.
Therefore, the researchers recommend that future research carry out similar work on data with both high and low coefficients of variation, across different sample sizes and with normal and non-normal data, and also with more than one explanatory variable, to examine the differences between parametric and non-parametric regression.

Keywords: Theil’s Regression, Simple Regression, Anderson-Darling technique, Akaike Information Criterion, Bayesian Information Criterion, Error Mean Squares
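
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and several things in it are assumptions: the data are synthetic (a hypothetical soap losing about 2.5 g per day from 120 g, with Gaussian noise), the Theil intercept is taken as the median of y - b*x (one common choice among several), and AIC and BIC are computed from the residual sum of squares under a Gaussian error model, which may differ from the formulas used in the paper.

```python
import math
import random
from statistics import mean, median


def ols_fit(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    xb, yb = mean(x), mean(y)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    sxx = sum((xi - xb) ** 2 for xi in x)
    b = sxy / sxx
    return yb - b * xb, b


def theil_fit(x, y):
    """Theil's estimator: slope is the median of all pairwise slopes."""
    n = len(x)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n)
              if x[j] != x[i]]
    b = median(slopes)
    # Assumed intercept rule: median of the slope-adjusted responses.
    a = median(yi - b * xi for xi, yi in zip(x, y))
    return a, b


def fit_criteria(x, y, a, b, k=2):
    """Error mean square, AIC and BIC (Gaussian form, constants dropped)."""
    n = len(x)
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return {
        "mse": rss / (n - k),
        "aic": n * math.log(rss / n) + 2 * k,
        "bic": n * math.log(rss / n) + k * math.log(n),
    }


# Synthetic illustration only; these are NOT the paper's measurements.
random.seed(0)
days = list(range(1, 31))
weight = [120.0 - 2.5 * d + random.gauss(0.0, 1.0) for d in days]

ols = fit_criteria(days, weight, *ols_fit(days, weight))
theil = fit_criteria(days, weight, *theil_fit(days, weight))
print("OLS  :", ols)
print("Theil:", theil)
```

With these particular formulas, OLS minimises the residual sum of squares by construction, so on any single data set its MSE, AIC, and BIC cannot exceed those of Theil's line when both use the same number of parameters. A comparison on a held-out sample or on parameter-recovery error would be one way to probe the two estimators more evenly.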