Statistics

2012 Submissions

[12] viXra:2012.0221 [pdf] submitted on 2020-12-30 12:07:53

The Continuous Bernoulli Approaching Distribution When λ → 0 and the Continuous Binomial Distribution

Authors: Kuan-Shian Wang, Mei-Yu Lee
Comments: 37 Pages. [Corrections are made by viXra Admin to comply with the rules of viXra.org]

We provide a mathematical derivation and numerical illustrations to verify that, as λ → 0, the continuous Bernoulli distribution approaches the exponential distribution (Chapter 1) and that, as λ → 0 and λ → 1, the continuous binomial distribution approaches the Gamma distribution (Chapter 3). Chapter 2 describes how to compute the continuous binomial distribution, which can be derived from the continuous Bernoulli distribution.
Category: Statistics
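
The first limit in the abstract above can be checked numerically. The sketch below assumes the standard continuous Bernoulli density p(x | λ) = C(λ) λ^x (1 − λ)^(1 − x) on [0, 1], which the abstract does not state, and compares it with an exponential density of rate η = log((1 − λ)/λ). The function names are illustrative and are not taken from the paper.

```python
import math

def cb_pdf(x, lam):
    """Continuous Bernoulli density on [0, 1] (standard form, assumed here)."""
    if abs(lam - 0.5) < 1e-12:
        return 1.0  # lam = 0.5 reduces to the uniform density
    norm = 2.0 * math.atanh(1.0 - 2.0 * lam) / (1.0 - 2.0 * lam)
    return norm * lam ** x * (1.0 - lam) ** (1.0 - x)

def exp_pdf(x, rate):
    """Exponential density with the given rate."""
    return rate * math.exp(-rate * x)

def max_relative_gap(lam, grid=101):
    """Largest relative difference between the two densities on [0, 1].

    Intended for 0 < lam < 0.5, where the matching rate is positive.
    """
    rate = math.log((1.0 - lam) / lam)
    xs = [i / (grid - 1) for i in range(grid)]
    return max(abs(cb_pdf(x, lam) / exp_pdf(x, rate) - 1.0) for x in xs)

if __name__ == "__main__":
    for lam in (0.1, 0.01, 0.001):
        print(lam, max_relative_gap(lam))
```

Under these assumptions the ratio of the two densities is the constant (1 − λ)/(1 − 2λ), so the relative gap equals λ/(1 − 2λ) and shrinks to zero as λ → 0.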

[11] viXra:2012.0088 [pdf] submitted on 2020-12-12 09:51:59

Continuous Bernoulli Distribution-Simulator and Test Statistic

Authors: Kuan-Sian Wang, Mei-Yu Lee
Comments: Pages.

We discussed the simulator and test statistic of the continuous Bernoulli distribution, which is important for testing the pervasive error of variational autoencoders in deep learning. We provided the sufficient statistic, the point estimator, the confidence interval, the test statistic, the goodness-of-fit test, and the one-way test for the continuous Bernoulli distribution. In addition, the continuous binomial distribution can be derived, so the confidence interval and the test can be carried out for two continuous Bernoulli populations. The continuous trinomial distribution can also be found. Please download the computer software for this book from https://github.com/meiyulee/continuous_Bernoulli
Category: Statistics
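
One simple way to simulate from a continuous Bernoulli distribution is inverse-CDF sampling. The sketch below is not the authors' software from the repository above. It assumes the standard density on [0, 1] with parameter λ, and the names sample_cb and cb_mean are hypothetical.

```python
import math
import random

def sample_cb(lam, rng):
    """Draw one continuous Bernoulli variate by inverting its CDF."""
    u = rng.random()
    if abs(lam - 0.5) < 1e-12:
        return u  # lam = 0.5 reduces to Uniform(0, 1)
    ratio = lam / (1.0 - lam)
    # Solve F(x) = u, where F(x) = ((1 - lam) * ratio**x - (1 - lam)) / (2*lam - 1).
    return math.log1p(u * (2.0 * lam - 1.0) / (1.0 - lam)) / math.log(ratio)

def cb_mean(lam):
    """Closed-form mean of the continuous Bernoulli for lam != 0.5."""
    return lam / (2.0 * lam - 1.0) + 1.0 / (2.0 * math.atanh(1.0 - 2.0 * lam))

if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_cb(0.2, rng) for _ in range(50000)]
    print("sample mean:", sum(draws) / len(draws))
    print("theoretical mean:", cb_mean(0.2))
```

Comparing the sample mean with the closed-form mean gives a quick sanity check of the sampler before it is used in any test procedure.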

[10] viXra:2012.0044 [pdf] submitted on 2020-12-07 13:36:14

On a Linnik Theorem in Theory of Errors

Authors: Abdelmajid Ben Hadj Salem
Comments: 7 Pages. In French.

In this note, we give a proof of a theorem of Linnik concerning the theory of errors, stated in his book "Least squares method and the mathematical bases of the statistical theory of the treatment of observations", without proof.
Category: Statistics

[9] viXra:2012.0038 [pdf] submitted on 2020-12-06 14:50:31

Automatic Emulator and Optimized Look-up Table Generation for Radiative Transfer Models

Authors: L. Martino, J. Vicent, G. Camps-Valls
Comments: 5 Pages.

This paper introduces an automatic methodology to construct emulators for costly radiative transfer models (RTMs). The proposed method is sequential and adaptive, and it is based on the notion of an acquisition function: instead of optimizing the unknown underlying RTM function, we aim to achieve accurate approximations. The Automatic Gaussian Process Emulator (AGAPE) methodology combines the interpolation capabilities of Gaussian processes (GPs) with the accurate design of an acquisition function that favors sampling in low density regions and flatness of the interpolation function. We illustrate the good capabilities of the method in toy examples and for the construction of an optimal look-up-table for atmospheric correction based on MODTRAN5.
Category: Statistics

[8] viXra:2012.0037 [pdf] submitted on 2020-12-06 19:04:22

Adaptive Sequential Interpolator Using Active Learning for Efficient Emulation of Complex Systems

Authors: L. Martino, D. Heestermans Svendsen, J. Vicent, G. Camps-Valls
Comments: 5 Pages.

Many fields of science and engineering require the use of complex and computationally expensive models to understand the involved processes in the system of interest. Nevertheless, due to the high cost involved, the required study becomes a cumbersome process. This paper introduces an interpolation procedure which belongs to the family of active learning algorithms, in order to construct cheap surrogate models of such costly complex systems. The proposed technique is sequential and adaptive, and is based on the optimization of a suitable acquisition function. We illustrate its efficiency in a toy example and for the construction of an emulator of an atmosphere modeling system.
Category: Statistics

[7] viXra:2012.0036 [pdf] submitted on 2020-12-06 19:06:41

Particle Group Metropolis Methods for Tracking the Leaf Area Index

Authors: L. Martino, V. Elvira, G. Camps-Valls
Comments: 5 Pages.

Monte Carlo (MC) algorithms are widely used for Bayesian inference in statistics, signal processing, and machine learning. In this work, we introduce a Markov Chain Monte Carlo (MCMC) technique driven by a particle filter. The resulting scheme is a generalization of the so-called Particle Metropolis-Hastings (PMH) method, where a suitable Markov chain of sets of weighted samples is generated. We also introduce a marginal version for the goal of jointly inferring dynamic and static variables. The proposed algorithms outperform the corresponding standard PMH schemes, as shown by numerical experiments.
Category: Statistics

[6] viXra:2012.0035 [pdf] submitted on 2020-12-06 15:16:02

Group Metropolis Sampling

Authors: L. Martino, V. Elvira, G. Camps-Valls
Comments: 5 Pages.

Monte Carlo (MC) methods are widely used for Bayesian inference and optimization in statistics, signal processing and machine learning. Two well-known classes of MC methods are the Importance Sampling (IS) techniques and the Markov Chain Monte Carlo (MCMC) algorithms. In this work, we introduce the Group Importance Sampling (GIS) framework, where different sets of weighted samples are properly summarized with one summary particle and one summary weight. GIS facilitates the design of novel efficient MC techniques. For instance, we present the Group Metropolis Sampling (GMS) algorithm, which produces a Markov chain of sets of weighted samples. GMS in general outperforms other multiple try schemes, as shown by means of numerical simulations.
Category: Statistics
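
The idea of compressing a set of weighted samples into one particle and one weight can be illustrated on a toy problem. The sketch below uses plain self-normalized importance sampling with a Gaussian target and proposal chosen only for illustration. The summary rule (resample one particle in proportion to the weights, and take the average unnormalized weight) is an assumption and is not stated in the abstract. It does not implement the GMS algorithm.

```python
import math
import random

def target_unnorm(x):
    """Unnormalized standard normal target (toy choice)."""
    return math.exp(-0.5 * x * x)

def proposal_pdf(x, scale=2.0):
    """Density of the zero-mean Gaussian proposal with the given scale."""
    return math.exp(-0.5 * (x / scale) ** 2) / (scale * math.sqrt(2.0 * math.pi))

def importance_sample(n, rng, scale=2.0):
    """Draw n proposal samples and their unnormalized importance weights."""
    xs = [rng.gauss(0.0, scale) for _ in range(n)]
    ws = [target_unnorm(x) / proposal_pdf(x, scale) for x in xs]
    return xs, ws

def summarize(xs, ws, rng):
    """Compress a weighted set into one particle and one weight (assumed rule)."""
    particle = rng.choices(xs, weights=ws, k=1)[0]
    weight = sum(ws) / len(ws)  # also estimates the normalizing constant
    return particle, weight

def self_normalized_mean(f, xs, ws):
    """Self-normalized importance sampling estimate of E[f(X)]."""
    total = sum(ws)
    return sum(w * f(x) for x, w in zip(xs, ws)) / total

if __name__ == "__main__":
    rng = random.Random(1)
    xs, ws = importance_sample(20000, rng)
    print("summary (particle, weight):", summarize(xs, ws, rng))
    print("estimate of E[x^2]:", self_normalized_mean(lambda x: x * x, xs, ws))
```

For this toy target the average weight should be close to sqrt(2π), the true normalizing constant, and the estimate of E[x²] should be close to 1.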

[5] viXra:2012.0034 [pdf] submitted on 2020-12-05 11:18:45

Joint Gaussian Processes for Inverse Modeling

Authors: D. Heestermans Svendsen, L. Martino, M. Campos-Taberner, G. Camps-Valls
Comments: 5 Pages.

Solving inverse problems is central in geosciences and remote sensing. Very often a mechanistic physical model of the system exists that solves the forward problem. Inverting the implied radiative transfer model (RTM) equations numerically implies, however, challenging and computationally demanding problems. Statistical models tackle the inverse problem and predict the biophysical parameter of interest from radiance data, exploiting either in situ data or simulated data from an RTM. We introduce a novel nonlinear and nonparametric statistical inversion model which incorporates both real observations and RTM-simulated data. The proposed Joint Gaussian Process (JGP) provides a solid framework for exploiting the regularities between the two types of data, in order to perform inverse modeling. Advantages of the JGP method over competing strategies are shown on both a simple toy example and in leaf area index (LAI) retrieval from Landsat data combined with simulated data generated by the PROSAIL model.
Category: Statistics

[4] viXra:2012.0033 [pdf] submitted on 2020-12-05 11:25:51

Distributed Particle Metropolis-Hastings Schemes

Authors: L. Martino, V. Elvira, G. Camps-Valls
Comments: 5 Pages.

We introduce a Particle Metropolis-Hastings algorithm driven by several parallel particle filters. The communication with the central node requires the transmission of only a set of weighted samples, one per filter. Furthermore, the marginal version of the previous scheme, called the Distributed Particle Marginal Metropolis-Hastings (DPMMH) method, is also presented. DPMMH can be used for making inference on both dynamic and static variables of interest. The ergodicity is guaranteed, and numerical simulations show the advantages of the novel schemes.
Category: Statistics

[3] viXra:2012.0032 [pdf] submitted on 2020-12-05 22:19:11

Probabilistic Cross-Validation Estimators for Gaussian Process Regression

Authors: L. Martino, V. Laparra, G. Camps-Valls
Comments: 5 Pages.

Gaussian Processes (GPs) are state-of-the-art tools for regression. Inference of GP hyperparameters is typically done by maximizing the marginal log-likelihood (ML). If the data truly follow the GP model, using the ML approach is optimal and computationally efficient. Unfortunately, very often this is not the case, and suboptimal results are obtained in terms of prediction error. Alternative procedures such as cross-validation (CV) schemes are often employed instead, but they usually incur high computational costs. We propose a probabilistic version of CV (PCV) based on two different model pieces in order to reduce the dependence on a specific model choice. PCV presents the benefits of both approaches, and allows us to find the solution for either the maximum a posteriori (MAP) or the Minimum Mean Square Error (MMSE) estimators. Experiments in controlled situations reveal that the PCV solution outperforms ML for both estimators, and that PCV-MMSE results outperform other traditional approaches.
Category: Statistics

[2] viXra:2012.0031 [pdf] submitted on 2020-12-05 22:21:01

Recycling Gibbs Sampling

Authors: L. Martino, V. Elvira, G. Camps-Valls
Comments: 5 Pages.

Gibbs sampling is a well-known Markov chain Monte Carlo (MCMC) algorithm, extensively used in signal processing, machine learning and statistics. The key point for the successful application of the Gibbs sampler is the ability to draw samples from the full-conditional probability density functions efficiently. In the general case this is not possible, so in order to speed up the convergence of the chain, it is required to generate auxiliary samples. However, such intermediate information is finally disregarded. In this work, we show that these auxiliary samples can be recycled within the Gibbs estimators, improving their efficiency with no extra cost. Theoretical and exhaustive numerical comparisons show the validity of the approach.
Category: Statistics
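
For readers unfamiliar with the basic algorithm, a standard Gibbs sampler can be sketched on a toy target. The example below uses a bivariate normal with correlation rho, chosen here only because its full conditionals can be sampled exactly. No auxiliary samples arise in this case, so the recycling scheme described in the abstract is not implemented.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_samples, rng, burn_in=500):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    cond_sd = math.sqrt(1.0 - rho * rho)
    x, y = 0.0, 0.0
    draws = []
    for t in range(n_samples + burn_in):
        # Full conditionals: x | y ~ N(rho * y, 1 - rho^2), and symmetrically for y.
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        if t >= burn_in:
            draws.append((x, y))
    return draws

def sample_correlation(draws):
    """Empirical correlation of the (x, y) pairs."""
    n = len(draws)
    mx = sum(x for x, _ in draws) / n
    my = sum(y for _, y in draws) / n
    sxy = sum((x - mx) * (y - my) for x, y in draws)
    sxx = sum((x - mx) ** 2 for x, _ in draws)
    syy = sum((y - my) ** 2 for _, y in draws)
    return sxy / math.sqrt(sxx * syy)

if __name__ == "__main__":
    chain = gibbs_bivariate_normal(0.8, 20000, random.Random(2))
    print("empirical correlation:", sample_correlation(chain))
```

When a full conditional cannot be sampled directly, each of the two gauss calls would be replaced by an inner sampler, and it is the intermediate draws of that inner sampler that the abstract proposes to reuse.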

[1] viXra:2012.0030 [pdf] submitted on 2020-12-05 22:23:48

Multioutput Automatic Emulator for Radiative Transfer Models

Authors: D. Heestermans Svendsen, L. Martino, J. Vicent, G. Camps-Valls
Comments: 4 Pages.

This paper introduces a methodology to construct emulators of costly radiative transfer models (RTMs). The proposed methodology is sequential and adaptive, and it is based on the notion of acquisition functions in Bayesian optimization. Here, instead of optimizing the unknown underlying RTM function, one aims to achieve accurate approximations. The Automatic Multi-Output Gaussian Process Emulator (AMOGAPE) methodology combines the interpolation capabilities of Gaussian processes (GPs) with the accurate design of an acquisition function that favors sampling in low density regions and flatness of the interpolation function. We illustrate the promising capabilities of the method for the construction of an emulator for a standard leaf-canopy RTM.
Category: Statistics