Statistics

2510 Submissions

[5] viXra:2510.0077 [pdf] replaced on 2025-12-20 02:02:16

Monty-hall Theorem Bayes-price Rule (Bayes Theorem) for a Three Parameter Event Space

Authors: Keshava Prasad Halemane
Comments: 11 Pages. 2 Tables

This research report presents the statement of the Monty-Hall Theorem and provides a constructive proof by solving the classical Monty-Hall Problem. It establishes the fact that the probability of winning the prize is indeed unaffected by a switched-choice — very much unlike the most prevalent and widely accepted position held by the Leading Subject-Matter-Experts.
Category: Statistics

[4] viXra:2510.0059 [pdf] submitted on 2025-10-12 10:04:53

Arguments in Favor of the Berger-Parker Index as an Effective Sample Size: The Only True Particle Counter

Authors: L. Martino
Comments: 13 Pages.

In many fields, including computational statistics, ecology, economics, and physics, normalized weights define a discrete probability mass function over a set of entities/samples. The effective sample size (ESS) quantifies the concentration of these weights, providing a measure of sample representativeness. In this work, we show that, among various ESS formulations, the Berger-Parker index uniquely preserves the relative proportions of the weights, acting as a true particle counter. Other commonly used ESS expressions tend to overestimate the effective sample size when only normalized weights are considered. Several examples and a formal demonstration are provided.
Category: Statistics
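
The contrast described in the abstract above can be illustrated with a minimal sketch. It assumes the Berger-Parker-based ESS is taken as the reciprocal of the largest normalized weight, and it uses the inverse sum of squared weights as one example of a commonly used alternative. The function names and the example weights are hypothetical and are not taken from the paper.

```python
def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def ess_berger_parker(weights):
    """Assumed form: reciprocal of the largest normalized weight."""
    return 1.0 / max(normalize(weights))


def ess_inverse_sum_squares(weights):
    """A widely used alternative: 1 / sum of squared normalized weights."""
    return 1.0 / sum(w * w for w in normalize(weights))


# Illustrative weights only (not data from the paper).
uniform = [1.0, 1.0, 1.0, 1.0]
skewed = [0.7, 0.1, 0.1, 0.1]

for name, weights in (("uniform", uniform), ("skewed", skewed)):
    print(name, ess_berger_parker(weights), ess_inverse_sum_squares(weights))
```

With uniform weights both expressions return the number of samples. With the skewed weights the reciprocal of the maximum gives the smaller value, which is in line with the abstract's remark that other expressions tend to report a larger effective sample size.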

[3] viXra:2510.0016 [pdf] submitted on 2025-10-04 09:38:22

Consensus in Sequential Wrapper Feature Selection: a Unifying Approach

Authors: L. Martino, G. Villacrés, S. Arcidiacono
Comments: 12 Pages.

Feature selection is a crucial task in statistics and machine learning, with direct implications for model interpretability and computational efficiency. This study introduces a unifying approach that combines the four possible sequential wrapper methods employed for variable selection, aiming to exploit their complementary strengths. The proposed procedure computes feature relevance scores and, subsequently, integrates the outputs from each sequential wrapper method. The underlying idea is simple and efficient. We test it in a controlled experiment with a known ground truth. The results indicate that the ranking obtained by consensus clearly outperforms the individual rankings obtained by the wrapper methods.
Category: Statistics

[2] viXra:2510.0015 [pdf] submitted on 2025-10-04 10:00:17

Data-Driven Priors Via Hyper-Parameter Posteriors of Gaussian Processes

Authors: L. Martino, J. Lopez-Santiago, J. Miguez, G. Vazquez-Vilar
Comments: 26 Pages.

When neither prior knowledge nor expert opinion is available, non-informative priors provide a practical alternative for conducting Bayesian inference. However, in the context of model selection, genuinely non-informative priors do not exist. In fact, diffuse priors on the parameters can drastically alter the value of the Bayesian evidence, making them effectively highly informative, while improper priors are not even allowed. Furthermore, in many real-world applications, the use of informative priors can substantially improve computational efficiency by driving sampling algorithms toward regions of high posterior probability. In this work, we introduce a data-driven procedure for automatic prior construction. The underlying idea is to exploit the posteriors of the hyper-parameters from non-parametric models to construct priors for Bayesian inference in parametric models. We test the proposed scheme in four different experiments, two of which involve real astronomical data.
Category: Statistics
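
The remark in the abstract above about diffuse priors and the Bayesian evidence can be seen in a toy conjugate model. The sketch assumes a single observation y with Gaussian noise and a zero-mean Gaussian prior on the mean, so that the evidence is available in closed form as a Gaussian density with variance equal to the noise variance plus the prior variance. The model, the function name, and the numbers are illustrative assumptions and are not the paper's experiments.

```python
import math


def log_evidence(y, prior_std, noise_std=1.0):
    """log p(y) when y | theta ~ N(theta, noise_std^2) and theta ~ N(0, prior_std^2).

    Integrating theta out gives y ~ N(0, noise_std^2 + prior_std^2).
    """
    var = noise_std ** 2 + prior_std ** 2
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * y * y / var


y_observed = 1.0  # hypothetical observation
for prior_std in (1.0, 10.0, 100.0, 1000.0):
    print(prior_std, log_evidence(y_observed, prior_std))
```

In this toy setting the log-evidence keeps decreasing as the prior becomes wider, even though the data are unchanged, so the width of a supposedly vague prior directly affects any model comparison based on the evidence.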

[1] viXra:2510.0001 [pdf] replaced on 2026-04-18 15:38:05

Sampling from Mixtures with Negative Weights: Application to Density Approximation by Gaussian Processes

Authors: Luca Martino
Comments: 24 Pages.

In this work, we focus on mixtures with negative coefficients and their applications in computational statistics. Mixtures of probability densities are widely used in statistics and machine learning. While classical mixtures restrict weights to be non-negative, allowing negative weights enables more flexible density approximation. However, negative weights introduce challenges in handling and sampling such distributions. For this purpose, we propose efficient Monte Carlo (MC) methods (including MC quadratures, rejection sampling and importance sampling schemes) for computing integrals and generating samples from these mixtures. A tailored proposal density ensures accurate and efficient generation of (unweighted) samples. Furthermore, we introduce an IS scheme which employs a mixture with negative coefficients as a proposal density, yielding samples with both positive and negative importance weights. Applications in Gaussian process-based density estimation demonstrate the practical relevance and efficiency of the proposed schemes. An adaptive importance sampling procedure based on GP-regression is also proposed. The numerical results provide clear empirical evidence of the accuracy and computational efficiency of the proposed methods.
Category: Statistics
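
As a rough illustration of sampling from a mixture with a negative weight, the sketch below applies plain rejection sampling to a hypothetical one-dimensional target, 2 N(x; 0, 2^2) - N(x; 0, 1), which is non-negative everywhere and integrates to one. The proposal is simply the normalized positive component. This is an assumption made for the example and is not the tailored proposal described in the abstract above; all names are hypothetical.

```python
import math
import random


def normal_pdf(x, std):
    """Density of a zero-mean Gaussian with standard deviation std."""
    return math.exp(-0.5 * (x / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def target_pdf(x):
    """Hypothetical target: weights +2 and -1 sum to one, density stays >= 0."""
    return 2.0 * normal_pdf(x, 2.0) - 1.0 * normal_pdf(x, 1.0)


def sample_target(n, rng):
    """Rejection sampling with proposal N(0, 2^2) and bound target <= 2 * proposal."""
    samples = []
    proposals = 0
    while len(samples) < n:
        x = rng.gauss(0.0, 2.0)
        proposals += 1
        # Accept with probability target(x) / (2 * proposal(x)).
        if rng.random() * 2.0 * normal_pdf(x, 2.0) < target_pdf(x):
            samples.append(x)
    return samples, proposals


rng = random.Random(0)
samples, proposals = sample_target(5000, rng)
acceptance_rate = len(samples) / proposals
second_moment = sum(x * x for x in samples) / len(samples)
print(acceptance_rate, second_moment)
```

For this target the bound constant is 2, so the expected acceptance rate is 1/2, and the exact second moment is 2 * 4 - 1 = 7, which gives a simple check on the accepted samples.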