[2] viXra:2404.0105 [pdf] submitted on 2024-04-21 12:12:42
Authors: L. Martino, E. Morgado, R. San Millan Castillo
Comments: 20 Pages.
An index of the effective number of variables (ENV) is introduced for model selection in nested models. This is the case, for instance, when we have to decide the order of a polynomial function or the number of bases in a nonlinear regression, choose the number of clusters in a clustering problem, or choose the number of features in a variable selection application (to name a few examples). It is inspired by the concept of maximum area under the curve (AUC) and the Gini index. The interpretation of the ENV index is identical to that of the effective sample size (ESS) indices with respect to a set of samples. The ENV index overcomes some drawbacks of the elbow detectors described in the literature, and introduces different measures of uncertainty and reliability of the proposed solution. These novel reliability measures can also be employed jointly with different information criteria, such as the well-known AIC and BIC. Comparisons with classical and recent schemes are provided in different experiments involving real datasets. Related Matlab code is given.
Category: Statistics
[1] viXra:2404.0064 [pdf] submitted on 2024-04-13 20:44:31
Authors: Mathis Antonetti
Comments: 3 Pages.
In this note, we establish a uniform lower bound (w.r.t. the number of players) for the probability of k players being tied for first place in the geometric case. To derive this bound, we introduce the concept of supertelescoping series as a generalization of telescoping series. We also provide insight into the relationship between supertelescoping series and supermartingales.
Category: Statistics