When variability varies: Heteroscedasticity and variance functions

Authors

  • Facundo J. Oddi, Universidad Nacional de Río Negro. Instituto de Investigaciones en Recursos Naturales, Agroecología y Desarrollo Rural. San Carlos de Bariloche, Río Negro, Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET). Instituto de Investigaciones en Recursos Naturales, Agroecología y Desarrollo Rural. San Carlos de Bariloche, Río Negro, Argentina.
  • Fernando E. Miguez, Department of Agronomy, Iowa State University, Ames, IA, USA.
  • Guido G. Benedetti, Universidad Nacional de Río Negro. Instituto de Investigaciones en Recursos Naturales, Agroecología y Desarrollo Rural. San Carlos de Bariloche, Río Negro, Argentina.
  • Lucas A. Garibaldi, Universidad Nacional de Río Negro. Instituto de Investigaciones en Recursos Naturales, Agroecología y Desarrollo Rural. San Carlos de Bariloche, Río Negro, Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET). Instituto de Investigaciones en Recursos Naturales, Agroecología y Desarrollo Rural. San Carlos de Bariloche, Río Negro, Argentina.

DOI:

https://doi.org/10.25260/EA.20.30.3.0.1131

Keywords:

general linear models, generalized least squares, model selection, nested models, information criteria, AIC, function gls, R

Abstract

Variability is inherent to the world around us. Quantifying it is essential to understanding processes of interest in the environmental and social sciences, such as the adaptation of species to climate change or social inequality. Variance, one of the parameters of the normal distribution, is commonly used to quantify variability. Classical linear models assume that variance is constant (the homoscedasticity assumption) and focus only on changes in average trends. These models can be extended, and the homoscedasticity assumption relaxed, through variance functions. However, these functions are rarely used, and examples are often lacking in the Spanish-written scientific literature. In this paper, we introduce variance functions in linear models, combining theory and application. We begin by introducing a real problem where heteroscedasticity is expected, accompanied by one simulated example. We then formulate the classical linear model and discuss how it can be extended to model heteroscedasticity. Next, we explain some of the variance functions and apply them to the real case and the simulated data. We use the gls() function of the nlme package in R, and provide scripts that make the data analyses reproducible. Additionally, we describe other options available in R for dealing with heteroscedastic data. We expect this paper to serve as a guide for using variance functions and to expand the toolbox of scientists with basic statistical knowledge.
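
As a brief sketch of the extension described in the abstract, the classical linear model and three variance functions commonly used with gls() can be written as below. The notation (the coefficients \beta_0 and \beta_1, the covariate x_i, the variance parameter \delta, and the group index g(i)) is assumed here for illustration and is not taken from the paper. The parameterizations follow those documented for varPower, varExp and varIdent in nlme.

```latex
% Classical linear model: constant variance (homoscedasticity)
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad
\varepsilon_i \sim N(0, \sigma^2)

% Extended model: the variance depends on a covariate or a grouping factor
\varepsilon_i \sim N(0, \sigma_i^2)

% Power of a covariate (varPower)
\sigma_i^2 = \sigma^2 \, |x_i|^{2\delta}

% Exponential of a covariate (varExp)
\sigma_i^2 = \sigma^2 \exp(2 \delta x_i)

% A different variance per group (varIdent), with \delta_1 = 1 for the reference group
\sigma_i^2 = \sigma^2 \, \delta_{g(i)}^2
```

With \delta = 0 in the power and exponential forms, or all \delta_g = 1 in the group form, each expression reduces to the constant variance \sigma^2, so the classical model is nested within the extended one. In gls(), a variance function of this kind is passed through the weights argument, for example weights = varPower(form = ~ x), where x is a hypothetical covariate name.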

Cuando la variabilidad varía: Heterocedasticidad y funciones de varianza

Published

2020-10-23

How to Cite

Oddi, F. J., Miguez, F. E., Benedetti, G. G., & Garibaldi, L. A. (2020). When variability varies: Heteroscedasticity and variance functions. Ecología Austral, 30(3), 438–453. https://doi.org/10.25260/EA.20.30.3.0.1131