Heteroscedasticity in survey data and model selection based on weighted Schwarz Bayesian information criteria


Abstract


This paper proposes a Weighted Schwarz Bayesian Information Criterion (WSBIC) for selecting the best model among competing models when heteroscedasticity is present in survey data. The authors found that the information loss between the true model and the fitted models is weighted equally rather than unequally. The weights depend purely on the differential entropy of each sample observation, and the traditional Schwarz Bayesian information criterion is penalized by a weight function comprising the inverse variance-to-mean ratio (VMR) of the fitted log-quantiles. The WSBIC is proposed in two versions, homogeneous and heterogeneous, according to the nature of the estimated error variances of the model. The proposed WSBIC outperforms the traditional information criteria for model selection and supports a logical statistical treatment for selecting the best model. Finally, the procedure is illustrated numerically by fitting 12 different stepwise regression models based on 44 independent variables in a BSQ (Bank Service Quality) study.
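For orientation, the standard quantities named in the abstract can be sketched in a few lines of Python. This is only an illustration of the traditional (unweighted) Schwarz criterion for a Gaussian regression, the variance-to-mean ratio, and the differential entropy of a normal distribution; it is not the paper's WSBIC, whose exact weight function is not stated here. The function names are hypothetical and chosen for this sketch.

```python
import math


def gaussian_sbic(residuals, n_params):
    """Traditional Schwarz criterion for a Gaussian regression,
    up to an additive constant: n * ln(RSS / n) + k * ln(n)."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + n_params * math.log(n)


def variance_to_mean_ratio(values):
    """Sample variance (n - 1 denominator) divided by the sample mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return variance / mean


def normal_differential_entropy(sigma2):
    """Differential entropy of N(mu, sigma2): 0.5 * ln(2 * pi * e * sigma2)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma2)


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the BSQ study.
    print(gaussian_sbic([1.0, -1.0, 1.0, -1.0], n_params=2))
    print(variance_to_mean_ratio([1.0, 2.0, 3.0]))
    print(normal_differential_entropy(1.0))
```

Under the traditional criterion, the model with the smallest value is preferred; how the paper's weights modify that value would have to be taken from the full text.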

DOI Code: 10.1285/i20705948v7n2p199

Keywords: Schwarz Bayesian information criteria; weighted Schwarz Bayesian information criteria; differential entropy; log-quantiles; variance-to-mean ratio




This work is licensed under a Creative Commons Attribution - NonCommercial - NoDerivs 3.0 Italy License.