Multimodel inference in social and environmental sciences

Lucas A. Garibaldi; Francisco J. Aristimuño; Facundo J. Oddi; Florencia Tiribelli

doi:10.25260/EA.17.27.3.0.513

Authors

Lucas A. Garibaldi, Instituto de Investigaciones en Recursos Naturales, Agroecología y Desarrollo Rural (IRNAD), Sede Andina, Universidad Nacional de Río Negro (UNRN) and Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), San Carlos de Bariloche, Río Negro, Argentina.
Francisco J. Aristimuño, Centro de Estudios en Ciencia, Tecnología, Cultura y Desarrollo (CITECDE), Sede Andina, Universidad Nacional de Río Negro (UNRN) and Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), San Carlos de Bariloche, Río Negro, Argentina.
Facundo J. Oddi, Instituto de Investigaciones en Recursos Naturales, Agroecología y Desarrollo Rural (IRNAD), Sede Andina, Universidad Nacional de Río Negro (UNRN) and Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), San Carlos de Bariloche, Río Negro, Argentina.
Florencia Tiribelli, Instituto de Investigaciones en Biodiversidad y Medioambiente (INIBIOMA), Universidad Nacional del Comahue (UNCo) and Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), San Carlos de Bariloche, Río Negro, Argentina.

Abstract

Professionals in the social and environmental sciences must solve problems (answer questions) based on data sampling and analysis. They commonly face the same challenge: they need to make decisions about a population (e.g., all the trees of a region), but only have data from a sample (some trees of that region). A key tool in this process is to propose population models for the response variable (e.g., tree growth as a function of tree age and climatic conditions) and then use model predictions to make decisions (e.g., when to cut trees according to climatic conditions). In this paper we discuss how to propose, estimate, and select models of a population based on sample data. We place special emphasis on proposing several alternative models (hypotheses) to solve one problem (e.g., different functions relating tree growth to age). These models must be proposed before data sampling and must include a null model (tree growth does not depend on tree age or climatic conditions). Models guide how data must be sampled for a valid contrast (growth measurements in trees of different ages and under contrasting climates). Then, the Akaike information criterion (AIC) can be used to rank the models by parsimony, favoring those that combine the best goodness of fit (likelihood) with the lowest number of parameters (model complexity). Throughout the text, we introduce basic notions of multimodel inference and discuss common user mistakes. We provide real examples and share their data and analysis code in R, a free and open-source software. In addition to being useful to professionals from different sciences, we expect our paper to promote the teaching of multimodel inference in graduate courses.
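The AIC ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' code (their analyses are in R); it is a Python illustration under stated assumptions: the tree ages and growth values are invented for the example, the errors are assumed Gaussian, and the helper names (`gaussian_aic`, `residuals_null`, `residuals_linear`) are hypothetical. It compares a null model (growth does not depend on age) with a linear model in age, using AIC = 2k - 2 ln(L), where k counts the regression coefficients plus the error variance.

```python
import math


def gaussian_aic(residuals, n_coefficients):
    """AIC = 2k - 2 ln(L) for a least-squares fit with Gaussian errors.

    k counts the regression coefficients plus the error variance.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    # Maximized Gaussian log-likelihood, with variance estimated as rss / n.
    log_likelihood = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    k = n_coefficients + 1
    return 2 * k - 2 * log_likelihood


def residuals_null(y):
    """Null model: the response is constant (its mean)."""
    mean_y = sum(y) / len(y)
    return [yi - mean_y for yi in y]


def residuals_linear(x, y):
    """Simple linear model y = a + b * x fitted by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Invented data for illustration only (not from the paper).
age = [5, 10, 15, 20, 25, 30, 35, 40]
growth = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

aic = {
    "null": gaussian_aic(residuals_null(growth), n_coefficients=1),
    "linear_age": gaussian_aic(residuals_linear(age, growth), n_coefficients=2),
}

# Lower AIC is better; delta AIC expresses each model relative to the best one.
best = min(aic, key=aic.get)
delta = {name: value - aic[best] for name, value in aic.items()}

for name in sorted(aic, key=aic.get):
    print(f"{name}: AIC = {aic[name]:.2f}, delta = {delta[name]:.2f}")
```

With these invented values the linear model has the lower AIC, because its gain in likelihood far outweighs the penalty for its extra parameter. Including the null model in the candidate set is what makes it possible to check that age carries any information at all.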

Author Biography

Dr. Lucas A. Garibaldi. Director, IRNAD. Associate Professor, UNRN. Independent Researcher, CONICET.
