Techniques to avoid pitfalls in empirical modeling

Edward Wojciechowski, David A. Vaccari

Research output: Contribution to journal > Conference article > peer-review

Abstract

The development of a mathematical model that adequately captures and describes the interactions among the various system components is critical to the understanding and control of physical, chemical, or biological phenomena. This often involves developing a multivariate model that will be used to forecast future events. Once the model has been proposed, it must be validated to check that it forecasts future events adequately. However, such empirical models are subject to a number of pitfalls, including overfitting, chance correlation, extrapolation, and lack of parsimony. In this paper, we describe the application of techniques to avoid these problems. The techniques described here are stratified data sampling, cross-validation, summed independent variables, and the use of constraints on model complexity. Although most of these techniques can be applied to any type of data model (e.g., linear, polynomial, non-linear, or artificial neural network), we have studied their application to polynomial autoregressive models with exogenous variables (PARX). By using these techniques, we are able to validate parsimonious models with reduced risk of overfitting, extrapolation, or chance correlation. As applied to PARX models, we were able to develop higher-order polynomials that significantly reduce forecast errors relative to traditional linear autoregressive models.
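
As a rough illustration of two of the techniques named above (cross-validation and a constraint on model complexity), the following is a minimal sketch, not the authors' procedure. It assumes a single lag of the output y and of one exogenous input u, synthetic data, contiguous (unshuffled) folds, and a hypothetical cap of degree 3 on the polynomial. All function names are invented for the example.

```python
from itertools import combinations_with_replacement

import numpy as np


def lagged_design(y, u):
    """Pair each y[t] with the regressors y[t-1] and u[t-1] (one lag each, an assumption)."""
    X = np.column_stack([y[:-1], u[:-1]])
    return X, y[1:]


def poly_terms(X, degree):
    """Constant plus every monomial of the columns of X up to the given total degree."""
    n, m = X.shape
    cols = [np.ones(n)]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(m), d):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)


def cv_mse(X, y, degree, k=5):
    """Mean squared one-step forecast error over k contiguous held-out folds."""
    P = poly_terms(X, degree)
    sq_err = 0.0
    for test_idx in np.array_split(np.arange(len(y)), k):
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Fit on the training rows only, then score on the held-out rows.
        coef, *_ = np.linalg.lstsq(P[train], y[train], rcond=None)
        resid = y[test_idx] - P[test_idx] @ coef
        sq_err += np.sum(resid ** 2)
    return sq_err / len(y)


# Synthetic series (hypothetical): the true dynamics contain a squared input term.
rng = np.random.default_rng(0)
n = 300
u = rng.uniform(-1.0, 1.0, n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.4 * u[t - 1] ** 2 + 0.01 * rng.standard_normal()

X, target = lagged_design(y, u)
MAX_DEGREE = 3  # complexity cap, chosen arbitrarily for the sketch
scores = {d: cv_mse(X, target, d) for d in range(1, MAX_DEGREE + 1)}
best_degree = min(scores, key=scores.get)
print(scores, best_degree)
```

Scoring each candidate on data it was not fitted to penalizes terms that only fit noise, and the degree cap keeps the candidate set small. Stratified sampling and summed independent variables, also named in the abstract, are not shown here.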

Original language: English
Journal: SAE Technical Papers
State: Published - 1999
Event: 29th International Conference on Environmental Systems - Denver, CO, United States
Duration: 12 Jul 1999 to 15 Jul 1999
