Authors: Xiuyi Yang
We reinterpret the training of convolutional neural nets(CNNs) with universal classification theorem(UCT). This theory implies any disjoint datasets can be classified by two or more layers of CNNs based on ReLUs and rigid transformation switch units(RTSUs) we propose here, this explains why CNNs could memorize noise and real data. Subsequently, we present another fresh new hypothesis that CNN is insensitive to some variant from input training data example, this variant relates to original training input by generating functions. This hypothesis means CNNs can generalize well even for randomly generated training data and illuminates the paradox Why CNNs fit real and noise data and fail drastically when making predictions for noise data. Our findings suggest the study about generalization theory of CNNs should turn to generating functions instead of traditional statistics machine learning theory based on assumption that the training data and testing data are independent and identically distributed(IID), and apparently IID assumption contradicts our experiments in this paper.We experimentally verify these ideas correspondingly.
Comments: 7 Pages.
[v1] 2017-11-06 20:27:28
Unique-IP document downloads: 17 times
Vixra.org is a pre-print repository rather than a journal. Articles hosted may not yet have been verified by peer-review and should be treated as preliminary. In particular, anything that appears to include financial or legal advice or proposed medical treatments should be treated with due caution. Vixra.org will not be responsible for any consequences of actions that result from any form of use of any documents on this website.
Add your own feedback and questions here:
You are equally welcome to be positive or negative about any paper but please be polite. If you are being critical you must mention at least one specific error, otherwise your comment will be deleted as unhelpful.