Statistics

1601 Submissions

[6] viXra:1601.0179 [pdf] submitted on 2016-01-16 22:40:19

Efficient Linear Fusion of Distributed MMSE Estimators for Big Data

Authors: D. Luengo, L. Martino, V. Elvira, M. Bugallo
Comments: 22 Pages.

Many signal processing applications require performing statistical inference on large datasets, where computational and/or memory restrictions become an issue. In this big data setting, computing an exact global centralized estimator is often infeasible. Furthermore, even when approximate numerical solutions (e.g., based on Monte Carlo methods) working directly on the whole dataset can be computed, they may not provide satisfactory performance either. Hence, several authors have recently started considering distributed inference approaches, where the data is divided among multiple workers (cores, machines or a combination of both). The computations are then performed in parallel, and the resulting distributed or partial estimators are finally combined to approximate the intractable global estimator. In this paper, we focus on the scenario where no communication exists among the workers, deriving efficient linear fusion rules for the combination of the distributed estimators. Both a Bayesian perspective (based on the Bernstein-von Mises theorem and the asymptotic normality of the estimators) and a constrained optimization view are provided for the derivation of the proposed linear fusion rules. We concentrate on minimum mean squared error (MMSE) partial estimators, but the approach is more general and can be used to combine any kind of distributed estimators as long as they are unbiased. Numerical results show the good performance of the algorithms developed, both in simple problems where analytical expressions can be obtained for the distributed MMSE estimators, and in a wireless sensor network localization problem where Monte Carlo methods are used to approximate the partial estimators.
Category: Statistics
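
The idea of fusing partial estimators linearly, as described in the abstract above, can be illustrated with a minimal sketch. It assumes scalar, independent, unbiased partial estimators and uses inverse-variance weights, a standard linear fusion rule that is not necessarily the one derived in the paper. The function names, the Gaussian toy data, and the use of the sample mean as each worker's estimator are all assumptions made for illustration.

```python
import random
import statistics


def fuse_inverse_variance(estimates, variances):
    """Linearly combine unbiased scalar estimates with weights proportional to 1/variance."""
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    fused = sum(p * e for p, e in zip(precisions, estimates)) / total_precision
    # For independent estimates, the fused variance is the inverse of the summed precisions.
    return fused, 1.0 / total_precision


def partial_estimates(data, num_workers):
    """Split the data among workers; each returns its sample mean and the variance of that mean."""
    chunks = [data[i::num_workers] for i in range(num_workers)]
    means = [statistics.fmean(chunk) for chunk in chunks]
    variances = [statistics.variance(chunk) / len(chunk) for chunk in chunks]
    return means, variances


# Toy data (assumption): 10,000 draws from a Gaussian with mean 3 and standard deviation 2.
rng = random.Random(0)
data = [rng.gauss(3.0, 2.0) for _ in range(10_000)]

# No communication among workers: each one only sees its own chunk.
means, variances = partial_estimates(data, num_workers=4)
fused, fused_var = fuse_inverse_variance(means, variances)
print(f"partial estimates: {means}")
print(f"fused estimate: {fused:.4f} (variance {fused_var:.6f})")
```

In this sketch the fused variance is always smaller than the smallest partial variance, which is the basic motivation for combining the workers' outputs rather than picking one of them.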

[5] viXra:1601.0174 [pdf] replaced on 2016-07-15 02:12:10

Improving Population Monte Carlo: Alternative Weighting and Resampling Schemes

Authors: V. Elvira, L. Martino, D. Luengo, M. F. Bugallo
Comments: 30 Pages.

Population Monte Carlo (PMC) sampling methods are powerful tools for approximating distributions of static unknowns given a set of observations. These methods are iterative in nature: at each step they generate samples from a proposal distribution and assign them weights according to the importance sampling principle. Critical issues in applying PMC methods are the choice of the generating functions for the samples and the avoidance of sample degeneracy. In this paper, we propose three new schemes that considerably improve the performance of the original PMC formulation by allowing for better exploration of the space of unknowns and by selecting the surviving samples more adequately. A theoretical analysis is performed, proving the superiority of the novel schemes in terms of the variance of the associated estimators and the preservation of sample diversity. Furthermore, we show through extensive numerical simulations that they outperform other state-of-the-art algorithms, both in terms of mean square error and robustness with respect to initialization.
Category: Statistics
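
The iterative sample, weight, and resample loop described in the abstract above can be sketched as follows. This is a minimal version of the standard PMC scheme only, not the alternative weighting and resampling schemes the paper proposes. The one-dimensional Gaussian target, the Gaussian proposals with fixed standard deviation, and all names and parameter values are assumptions chosen for illustration.

```python
import math
import random

TARGET_MEAN = 2.0  # hypothetical toy target: unnormalized Gaussian with mean 2 and std 1


def log_target(x):
    """Unnormalized log-density of the toy target."""
    return -0.5 * (x - TARGET_MEAN) ** 2


def log_gaussian(x, mean, std):
    """Log-density of a Gaussian proposal."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def pmc_mean(num_proposals=100, num_iterations=30, proposal_std=1.0, seed=0):
    """Estimate the target mean with a basic PMC loop."""
    rng = random.Random(seed)
    # Deliberately poor initialization of the proposal means.
    means = [rng.uniform(-10.0, 10.0) for _ in range(num_proposals)]
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(num_iterations):
        # 1. Draw one sample from each Gaussian proposal.
        samples = [rng.gauss(m, proposal_std) for m in means]
        # 2. Importance weights: target density over the density of the generating proposal.
        weights = [
            math.exp(log_target(x) - log_gaussian(x, m, proposal_std))
            for x, m in zip(samples, means)
        ]
        # Accumulate a self-normalized estimate over all iterations.
        weighted_sum += sum(w * x for w, x in zip(weights, samples))
        weight_total += sum(weights)
        # 3. Multinomial resampling: surviving samples become the next proposal means.
        means = rng.choices(samples, weights=normalize(weights), k=num_proposals)
    return weighted_sum / weight_total


print(f"PMC estimate of the target mean: {pmc_mean():.3f}")
```

The resampling step is where sample degeneracy shows up in this basic version: when a few weights dominate, most of the next proposal means are copies of the same sample.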

[4] viXra:1601.0167 [pdf] submitted on 2016-01-16 03:40:15

Causation and the Law of Independence.

Authors: Ilija Barukčić
Comments: Pages.

Titans such as Bertrand Russell and Karl Pearson, and ultimately David Hume as well, warned us to keep our mathematical and statistical hands off causality. Hume's scepticism has long dominated the discussion of causality in both analytic philosophy and statistical analysis. However, more and more researchers are working in this field and trying to move beyond these positions. Much of the recent philosophical or mathematical writing on causation (Ellery Eells (1991), Daniel Hausman (1998), Pearl (2000), Peter Spirtes, Clark Glymour and Richard Scheines (2000), ...) addresses Bayesian networks, the counterfactual approach to causality developed in detail by David Lewis, Reichenbach's Principle of the Common Cause, or the Causal Markov Condition. None of these approaches to causation has investigated the relationship between causation and the law of independence to the necessary extent. Nonetheless, the relationship between causation and the law of independence, one of the fundamental concepts in probability theory, is very important. May an effect occur in the absence of a cause? May an effect fail to occur in the presence of a cause? What, then, constitutes the causal relation? If it is unclear what constitutes the causal relation, perhaps we can instead answer the question of what does not constitute it. A cause cannot be independent of its effect, and vice versa, if there is a deterministic causal relationship. This publication will prove that the law of independence defines causation, to some extent, ex negativo.
Category: Statistics
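
The claim in the abstract above that a deterministic cause cannot be independent of its effect can be checked on a small worked example. The sketch below only illustrates the textbook definition of independence, P(A and B) = P(A) P(B), for two binary events; it is not the proof given in the paper, and the function name and the two joint distributions are assumptions made for illustration.

```python
from fractions import Fraction


def is_independent(joint):
    """Check P(A and B) == P(A) * P(B), where joint[(a, b)] = P(A=a, B=b) for a, b in {0, 1}."""
    p_a = joint[(1, 0)] + joint[(1, 1)]
    p_b = joint[(0, 1)] + joint[(1, 1)]
    return joint[(1, 1)] == p_a * p_b


half, quarter, zero = Fraction(1, 2), Fraction(1, 4), Fraction(0)

# Deterministic relation: the effect B occurs exactly when the cause A occurs.
deterministic = {(0, 0): half, (0, 1): zero, (1, 0): zero, (1, 1): half}

# Two unrelated fair coin flips.
unrelated = {(0, 0): quarter, (0, 1): quarter, (1, 0): quarter, (1, 1): quarter}

print(is_independent(deterministic))  # False: P(A and B) = 1/2, but P(A) P(B) = 1/4
print(is_independent(unrelated))      # True: P(A and B) = 1/4 = P(A) P(B)
```

In the deterministic case P(A and B) equals P(A), so the product rule can only hold when P(A) is 0 or 1.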

[3] viXra:1601.0070 [pdf] submitted on 2016-01-07 16:41:10

Attraction Structure and the Speed of Convergence

Authors: J. Tiago de Oliveira
Comments: 37 Pages.

Chapter 3 of Statistical Analysis of Extremes.
Category: Statistics

[2] viXra:1601.0069 [pdf] submitted on 2016-01-07 16:42:58

A Quick Exploration of Extreme Data

Authors: J. Tiago de Oliveira
Comments: 11 Pages.

Chapter 4 of Statistical Analysis of Extremes.
Category: Statistics

[1] viXra:1601.0032 [pdf] submitted on 2016-01-05 10:37:48

Importance of Circular Data in Sports Science – a Review

Authors: M. Srinivas, S. Sambasiva Rao
Comments: 7 Pages. This paper has been published in Indian Journal of Physical Education and Allied Sciences, ISSN: 2395-6895, Vol.1, No.5, pp.37-44.

The statistical analysis of angular data is typically encountered in biological and geological studies, among several other areas of research. Circular data are the simplest case of directional data, a category in which the single response is not scalar but angular or directional. A statistical analysis pertaining to two-dimensional directional data is generally referred to as “Circular Statistics”. In this paper, an attempt is made to review various fundamental concepts of circular statistics and to discuss its applicability in sports science.
Category: Statistics
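
A short example shows why angular responses, as described in the abstract above, need their own statistics. The sketch below computes the circular mean direction and the mean resultant length by averaging unit vectors, which are standard definitions in circular statistics. The function names and the sample angles are assumptions, and the code is not taken from the reviewed paper.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction in degrees, in [0, 360], obtained by averaging unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


def mean_resultant_length(angles_deg):
    """Concentration measure in [0, 1]; values near 1 mean the angles are tightly clustered."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.hypot(s, c) / len(angles_deg)


# Hypothetical directions just either side of north.
angles = [350.0, 10.0]

arithmetic_mean = sum(angles) / len(angles)  # 180.0, which points south
circular_mean = circular_mean_deg(angles)    # close to 0 degrees (equivalently 360), i.e. north

print(f"arithmetic mean: {arithmetic_mean:.1f} degrees")
print(f"circular mean: {circular_mean:.1f} degrees")
print(f"mean resultant length: {mean_resultant_length(angles):.4f}")
```

The arithmetic mean of 350 and 10 degrees points in the opposite direction to both observations, while the circular mean recovers the direction they share.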