Adaptive inference for multi-stage unbalanced exponential survey data


Two-stage sampling usually leads to higher variances for estimators of means and regression coefficients, because of intra-cluster homogeneity. One way of allowing for clustering when fitting a linear regression model is to use a linear mixed model with two levels. If the estimated intra-cluster correlation is close to zero, it may be acceptable to ignore clustering and use a single-level model. In this paper, an adaptive strategy is evaluated for estimating the variances of estimated regression coefficients. The strategy is based on testing the null hypothesis that the random effect variance component is zero. If this hypothesis is not rejected, the estimated variances of the estimated regression coefficients are extracted from the one-level linear model. Otherwise, the estimated variance is based on the linear mixed model or, alternatively, the Huber-White robust variance estimator is used. A simulation study is used to show that the adaptive approach provides reasonably correct inference in a simple case.
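The adaptive strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the model is intercept-only (a single mean rather than regression coefficients), both models are fitted by maximum likelihood, the variance ratio is found by a coarse grid search, and the likelihood ratio statistic is referred to a 50:50 mixture of chi-square distributions with 0 and 1 degrees of freedom, a common convention when the null value lies on the boundary. The function name `adaptive_mean_variance` and the sample data are hypothetical.

```python
import math


def adaptive_mean_variance(clusters, alpha=0.05):
    """Adaptive variance estimate for a mean from two-stage (clustered) data.

    clusters: list of lists, one inner list of observations per cluster
    (cluster sizes may differ). Hypothetical helper, intercept-only model.
    """
    sizes = [len(c) for c in clusters]
    means = [sum(c) / len(c) for c in clusters]
    n_total = sum(sizes)
    # The within-cluster sum of squares does not depend on mu or tau.
    ss_within = sum((y - m) ** 2 for c, m in zip(clusters, means) for y in c)

    def fit(tau):
        # ML fit for a fixed ratio tau = sigma_b^2 / sigma_e^2 (tau = 0: one-level).
        weights = [n / (1.0 + n * tau) for n in sizes]
        mu = sum(w * m for w, m in zip(weights, means)) / sum(weights)
        quad = ss_within + sum(w * (m - mu) ** 2 for w, m in zip(weights, means))
        sigma_e2 = quad / n_total
        # -2 log-likelihood up to an additive constant shared by all tau.
        minus2ll = n_total * math.log(sigma_e2) + sum(
            math.log(1.0 + n * tau) for n in sizes
        )
        return minus2ll, mu, sigma_e2 / sum(weights)

    null = fit(0.0)
    # Assumed search grid for tau: 1e-4 to 1e3 on a log scale.
    best = min((fit(10 ** (k / 20.0)) for k in range(-80, 61)), key=lambda r: r[0])
    lrt = max(0.0, null[0] - best[0])
    # 50:50 mixture of chi2(0) and chi2(1): p = 0.5 * P(chi2_1 > lrt).
    p_value = 0.5 * math.erfc(math.sqrt(lrt / 2.0)) if lrt > 0 else 1.0

    chosen = null if p_value >= alpha else best
    return {
        "model": "one-level" if p_value >= alpha else "mixed",
        "p_value": p_value,
        "mean": chosen[1],
        "variance": chosen[2],
    }


# Illustrative unbalanced data with strong between-cluster differences.
clustered = [[1.0, 1.2, 0.9], [5.1, 4.8], [9.0, 9.3, 8.9, 9.1], [3.0, 3.2]]
result = adaptive_mean_variance(clustered)
print(result["model"], result["variance"])
```

In this sketch the mixed-model variance is returned when the null hypothesis is rejected. A Huber-White cluster-robust variance could be substituted in that branch, as the abstract mentions, without changing the testing step.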

DOI Code: 10.1285/i20705948v8n2p136

Keywords: Adaptive estimation, variance components, cluster sampling, multi-level models, Huber-White variance estimator, exponential distribution, unbalanced data




This work is licensed under a Creative Commons Attribution - NonCommercial - NoDerivs 3.0 Italy License.