Authors: Antony Van der Mude
Overfitting has always been a problem in machine learning. Recently a related phenomenon called “oversearching” has been analyzed. This paper takes a theoretical approach using a very general methodology covering most learning paradigms in current use. Overfitting is defined in terms of the “expressive accuracy” of a model for the data, rather than “predictive accuracy”. The results show that even if the learner can identify a set of best models, overfitting will cause it to bounce from one model to another. Overfitting is ameliorated by having the learner bound the search space, and bounding is equivalent to using an accuracy (or bias) more restrictive than the problem accuracy. Also, Ramsey’s Theorem shows that every data sequence has an situation where either consistent overfitting or underfitting is unavoidable. We show that oversearching is simply overfitting where the resource used to express a model is the search space itself rather than a more common resource such as a program that executes the model. We show that the smallest data sequence guessing a model defines a canonical resource. There is an equivalence in the limit between any two resources to express the same model space, but it may not be effectively computable.
Comments: 13 Pages.
[v1] 2017-02-23 18:45:36
Unique-IP document downloads: 14 times
Add your own feedback and questions here:
You are equally welcome to be positive or negative about any paper but please be polite. If you are being critical you must mention at least one specific error, otherwise your comment will be deleted as unhelpful.