Statistics

Previous months:
2010 - 1003(10) - 1004(7) - 1005(4) - 1006(1) - 1007(2) - 1008(4) - 1010(1) - 1011(1)
2011 - 1105(2) - 1107(1) - 1111(1) - 1112(1)
2012 - 1203(1) - 1204(2) - 1205(1) - 1208(1) - 1210(1) - 1211(6) - 1212(1)
2013 - 1301(2) - 1304(3) - 1306(2) - 1307(1) - 1310(2)
2014 - 1402(1) - 1403(3) - 1404(2)

Recent Submissions

Any replacements are listed further down

[65] viXra:1404.0124 [pdf] submitted on 2014-04-14 20:12:53

The Koester Equation

Authors: Stefan Koester
Comments: 6 Pages.

The Koester Equation, and all of its processes, quantify the "loss in progress" experienced in a data set when it undergoes an abnormality, such as a missing day in testing. This loss in progress can also be viewed as a number determining by how much that data set is skewed by an abnormality. For example, if a person were to take three of the same tests for three days in a row, an obvious positive curve in their results would be apparent. If, on the fourth day, a break was taken and no testing occurred, the results after would not be the same as if the person had just continued. This is usually known as the loss in progress, and can now be quantified using The Koester Equation.
Category: Statistics

[64] viXra:1404.0082 [pdf] submitted on 2014-04-10 20:59:23

Introduction to Neutrosophic Statistics

Authors: Florentin Smarandache
Comments: 123 Pages.

Neutrosophic Statistics means statistical analysis of population or sample that has indeterminate (imprecise, ambiguous, vague, incomplete, unknown) data. For example, the population or sample size might not be exactly determinate because of some individuals that partially belong to the population or sample, and partially they do not belong, or individuals whose appurtenance is completely unknown. Also, there are population or sample individuals whose data could be indeterminate. In this book, we develop the 1995 notion of neutrosophic statistics. We present various practical examples. It is possible to define the neutrosophic statistics in many ways, because there are various types of indeterminacies, depending on the problem to solve.
Category: Statistics

[63] viXra:1403.0975 [pdf] submitted on 2014-03-31 11:13:23

The Efficient Use of Supplementary Information in Finite Population Sampling

Authors: editors Rajesh Singh, Florentin Smarandache
Comments: 71 Pages.

The purpose of writing this book is to suggest some improved estimators using auxiliary information in sampling schemes like simple random sampling, systematic sampling and stratified random sampling. This volume is a collection of five papers, written by nine co-authors (listed in the order of the papers): Rajesh Singh, Mukesh Kumar, Manoj Kr. Chaudhary, Cem Kadilar, Prayas Sharma, Florentin Smarandache, Anil Prajapati, Hemant Verma, and Viplav Kr. Singh. In the first paper, a dual to ratio-cum-product estimator is suggested and its properties are studied. In the second paper, an exponential ratio-product type estimator in stratified random sampling is proposed and its properties are studied under second-order approximation. In the third paper, some estimators are proposed in two-phase sampling and their properties are studied in the presence of non-response. In the fourth paper, a family of median-based estimators is proposed in simple random sampling. In the fifth paper, some difference-type estimators are suggested in simple random sampling and stratified random sampling and their properties are studied in the presence of measurement error.
Category: Statistics

[62] viXra:1403.0948 [pdf] submitted on 2014-03-27 12:42:36

Why the Dimensionless Mathematical Ratio pi Occurs in the Gauss Distribution Law

Authors: Nigel B. Cook
Comments: 1 Page.

The occurrence of pi in formulae apparently unrelated to geometry was used by Eugene Wigner in his 1960 paper "The Unreasonable Effectiveness of Mathematics in the Natural Sciences". Wigner's example is the Gaussian/normal distribution law, which is an example of obfuscation. Laplace (1782), Gauss (1809), Maxwell (1860) and Fisher (1915) wrote the normal exponential distribution with the square root of pi in the normalization outside the integral. But Stigler in 1982 rewrote the equation with pi in the exponent, making the formula look less mysterious because the exponent is then the area of a circle (in other words, Poisson's exponential distribution, adapted to circular areas, with areas expressed in dimensionless form), as becomes clear if you think of the use of the normal distribution to model CEP error probabilities for missiles landing around a target point. (Please see paper for equations.)
Category: Statistics

[61] viXra:1403.0075 [pdf] submitted on 2014-03-11 11:30:25

A Test of Financial Time-Series Data to Discriminate Among Lognormal, Gaussian and Square-Root Random Walks

Authors: Yuri Heymann
Comments: 6 Pages.

In the present study, Monte Carlo simulations show how a simple test applied to financial time-series data can discriminate among the lognormal random walk used in the Black-Scholes-Merton model, the Gaussian random walk used in the Ornstein-Uhlenbeck stochastic process, and the square-root random walk used in the Cox, Ingersoll and Ross process. Alpha-level hypothesis testing is provided. As a conclusion, this test appears to be helpful for selecting the best stochastic processes for pricing contingent claims and risk management.
Category: Statistics

[60] viXra:1402.0127 [pdf] submitted on 2014-02-19 09:38:22

Life Tables for the Old-age and the Disability Pensioners' in 2008 How Long Will They Receive Their Pensions in Hungary?

Authors: Maria Hablicsekne Richter
Comments: 13 Pages. Comments are welcome

One of the key issues in our lives is: how long will we live? Another is: how long will we receive or enjoy our pensions? In this analysis I focus on the mortality of beneficiaries in receipt of old-age pensions and disability pensions in Hungary. My main objective is to demonstrate that the mortality of beneficiaries receiving different types of benefits may be significantly different from the mortality of the population. On the basis of the life tables presented, I show the graduated probability of death corresponding to different ages, benefits and genders, and also the expected number of future years at the given ages. Considering all these, I compare the mortality of beneficiaries receiving different types of benefits with the mortality of the population.
Category: Statistics

[59] viXra:1310.0183 [pdf] submitted on 2013-10-21 12:06:10

Deterministic Imputation in Multienvironment Trials

Authors: Sergio Arciniegas-Alarcón, Marisol García-Peña, Wojtek Janusz Krzanowski, Carlos Tadeu dos Santos Dias
Comments: 17 Pages.

This paper proposes five new imputation methods for unbalanced experiments with genotype-by-environment interaction. The methods use cross-validation by eigenvector, based on an iterative scheme with the singular value decomposition (SVD) of a matrix. To test the methods, we performed a simulation study using three complete matrices of real data, obtained from interaction trials of peas, cotton, and beans, and introducing lack of balance by randomly deleting in turn 10%, 20%, and 40% of the values in each matrix. The quality of the imputations was evaluated with the additive main effects and multiplicative interaction model (AMMI), using the root mean squared predictive difference (RMSPD) between the genotypes and environmental parameters of the original data set and the set completed by imputation. The proposed methodology does not make any distributional or structural assumptions and does not have any restrictions regarding the pattern or mechanism of missing values.
Category: Statistics

[58] viXra:1310.0024 [pdf] submitted on 2013-10-05 03:35:05

Theoretical Ecology in Mathematics

Authors: Nehul Yadav
Comments: 8 Pages. none

This research focuses primarily on the statistics and the famous mathematical models used in ecology and evolution. I chose a unique topic in applied mathematics as I aspire to become a mathematics researcher. I hope you like this research.
Category: Statistics

[57] viXra:1307.0123 [pdf] submitted on 2013-07-23 18:56:39

On Improvement in Estimating Population Parameter(s) Using Auxiliary Information

Authors: editors Rajesh Singh, Florentin Smarandache
Comments: 64 Pages.

The purpose of writing this book is to suggest some improved estimators using auxiliary information in sampling schemes like simple random sampling and systematic sampling. This volume is a collection of five papers, written by eight coauthors (listed in the order of the papers): Manoj K. Chaudhary, Sachin Malik, Rajesh Singh, Florentin Smarandache, Hemant Verma, Prayas Sharma, Olufadi Yunusa, and Viplav Kumar Singh, from India, Nigeria, and USA. The following problems are discussed in the book. In chapter one, an estimator in systematic sampling using auxiliary information is studied in the presence of non-response. In the second chapter, some improved estimators are suggested using auxiliary information. In the third chapter, some improved ratio-type estimators are suggested and their properties are studied under the second order of approximation. In chapters four and five, some estimators are proposed for estimating unknown population parameter(s) and their properties are studied. This book will be helpful for researchers and students who are working in the field of finite population estimation.
Category: Statistics

[56] viXra:1306.0064 [pdf] submitted on 2013-06-10 23:24:51

A General Family of Dual to Ratio-Cum-Product Estimator in Sample Surveys

Authors: Rajesh Singh, Mukesh Kumar, Pankaj Chauhan, Nirmala Sawan, Florentin Smarandache
Comments: 8 Pages.

This paper presents a family of dual to ratio-cum-product estimators for the finite population mean. Under the simple random sampling without replacement (SRSWOR) scheme, expressions of the bias and mean-squared error (MSE) up to the first order of approximation are derived. We show that the proposed family is more efficient than the usual unbiased estimator, the ratio estimator, the product estimator, and the estimators of Singh (1967), Srivenkataramana (1980), Bandyopadhyaya (1980) and Singh et al. (2005). An empirical study is carried out to illustrate the performance of the constructed estimator over others.
Category: Statistics

[55] viXra:1306.0021 [pdf] submitted on 2013-06-05 07:25:19

One More Formula for the Variance

Authors: Sabiou Inoua
Comments: 2 Pages.

This short paper establishes one more formula for the variance. Consider a random variable X whose possible values are x1, …, xn with probabilities p1, …, pn of occurring, respectively. Pick two of these possible values successively (each xi having the probability pi of being chosen). Compute the difference between the two chosen values. Square the difference. Claim: you are expected to get (twice) the variance of X. This formula makes the variance appear an even more natural measure of dispersion than usually thought.


Category: Statistics
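
The identity claimed in entry [55] above can be checked numerically. The following is a minimal sketch, not the author's code: it assumes the two values are drawn independently from the same distribution, and the function names and the example distribution are hypothetical.

```python
from fractions import Fraction
from itertools import product


def variance(values, probs):
    """Variance of a discrete random variable, E[(X - E[X])^2]."""
    mean = sum(p * x for x, p in zip(values, probs))
    return sum(p * (x - mean) ** 2 for x, p in zip(values, probs))


def expected_squared_difference(values, probs):
    """E[(X1 - X2)^2] for two independent draws from the same distribution."""
    pairs = list(zip(values, probs))
    return sum(
        pi * pj * (xi - xj) ** 2
        for (xi, pi), (xj, pj) in product(pairs, repeat=2)
    )


# Hypothetical example distribution (exact arithmetic via Fraction).
values = [1, 2, 4]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

print(variance(values, probs))                     # variance of X
print(expected_squared_difference(values, probs))  # twice the variance
```

Exact fractions are used so that the two quantities can be compared with equality rather than with a floating-point tolerance.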

[54] viXra:1304.0143 [pdf] submitted on 2013-04-25 11:24:37

A Note on Montmort's Probability Expressions in the Problem of Rational Division of Stakes

Authors: Zhang Huiming
Comments: 6 Pages. In Chinese

In this paper, using three kinds of ideas from probability theory, we prove by the method of mathematical analysis the equivalence of three kinds of probability expressions in the problem of rational division of stakes. In addition, the different ideas of probability theory lead to the same identity. Letting one of the probability expressions be a function, we find that the B-function is closely related to the derivative of the probability expression function. Using the Beta distribution function, we prove that the probability expression function in the problem of rational division is equal to the distribution function of the Beta distribution.
Category: Statistics

[53] viXra:1304.0055 [pdf] submitted on 2013-04-11 17:24:10

Efficient Statistical Significance Approximation for Local Association Analysis of High-Throughput Time Series Data

Authors: Li Charlie Xia
Comments: 56 Pages.

Local association analysis, such as local similarity analysis and local shape analysis, of biological time series data helps elucidate the varying dynamics of biological systems. However, their applications to large scale high-throughput data are limited by slow permutation procedures for statistical significance evaluation. We developed a theoretical approach to approximate the statistical significance of local similarity and local shape analysis based on the approximate tail distribution of the maximum partial sum of independent identically distributed (i.i.d.) and Markovian random variables. Simulations show that the derived formula approximates the tail distribution reasonably well (starting at time points > 10 with no delay and > 20 with delay) and provides p-values comparable to those from permutations. The new approach enables efficient calculation of statistical significance for pairwise local association analysis, making possible all-to-all association studies otherwise prohibitive. As a demonstration, local association analysis of human microbiome time series shows that core OTUs are highly synergetic and some of the associations are body-site specific across samples. The new approach is implemented in our eLSA package, which now provides pipelines for faster local similarity and shape analysis of time series data. The tool is freely available from eLSA's website: http://meta.usc.edu/softs/lsa.
Category: Statistics

[52] viXra:1304.0054 [pdf] submitted on 2013-04-11 17:27:00

Developing Statistical and Algorithmic Methods for Shotgun Metagenomics and Time Series Analysis

Authors: Li Charlie Xia
Comments: 127 Pages.

Recent developments in experimental molecular techniques, such as microarrays and next-generation sequencing technologies, have led molecular biology into a high-throughput era with emergent omics research areas, including metagenomics and transcriptomics. The massive omics datasets that experimental laboratories have generated and continue to generate pose new challenges to computational biologists to develop fast and accurate quantitative analysis tools. We have developed two statistical and algorithmic methods, GRAMMy and eLSA, for metagenomics and microbial community time series analysis. GRAMMy provides a unified probabilistic framework for shotgun metagenomics, in which the maximum likelihood method is employed to accurately compute Genome Relative Abundance of microbial communities using the Mixture Model theory (GRAMMy). We extended the Local Similarity Analysis technique (eLSA) to time series data with replicates, capturing statistically significant local and potentially time-delayed associations. Both methods are validated through simulation studies, and their capability to reveal new biology is also demonstrated through applications to real datasets. We implemented GRAMMy and eLSA as C++ extensions to Python, with both superior computational efficiency and easy-to-integrate programming interfaces. The GRAMMy and eLSA methods will be increasingly useful tools as new omics research accelerates its pace. http://meta.usc.edu/softs/lsa.
Category: Statistics

[51] viXra:1301.0113 [pdf] submitted on 2013-01-18 18:27:46

Data Imputation in Trials with Genotype by Environment Interaction: an Application on Cotton Data (In Portuguese)

Authors: Sergio Arciniegas-Alarcón, Carlos Tadeu dos Santos Dias
Comments: 14 Pages. In Portuguese

A common problem in multienvironment trials is missing genotype-environment combinations. Recently, Bergamo proposed a distribution-free multiple imputation method for the interaction matrix. The purpose of this paper is to evaluate the new development and compare it with methodologies that have been successful in genotype-environment trials with missing data, such as alternating least squares (ALS) and robust estimates, using the Additive Main effects and Multiplicative Interaction (AMMI) models. A simulation study was carried out based on real data, deleting values at random in different percentages, imputing the observations and comparing the methodologies through three criteria: the square root of the mean predictive difference, the Procrustes statistic and Spearman's rank correlation coefficient. It was concluded that multiple imputation is not better than imputation based on an additive model without interaction, and that the best results for the variance are obtained with robust sub-models. All the methods considered in this study show a high correlation between the true and the imputed missing values.
Category: Statistics

[50] viXra:1301.0031 [pdf] submitted on 2013-01-06 05:54:47

On the Convergence of the Metropolis-Hastings Markov Chains

Authors: Dimiter Tsvetkov, Lyubomir Hristov, Ralitsa Angelova-Slavova
Comments: 14 Pages.

In this paper we consider Metropolis-Hastings Markov chains whose target and proposal distributions are absolutely continuous with respect to Lebesgue measure. We show that under some very general conditions the sequence of the powers of the conjugate transition operator has a strong limit in a properly defined Hilbert space, described for example in Stroock (2005). Then we propose conditions under which the sequence of the successive densities of such a chain converges to the target density in total variation distance for any choice of the initial density. In particular, we prove that the positiveness of the target and the proposal densities is enough for the chain to converge.
Category: Statistics

[49] viXra:1212.0008 [pdf] submitted on 2012-12-02 07:12:32

Matrix Transformation and Transform the Generalized Wave Equation into the Maxwell Wave Equation

Authors: Xianzhao Zhong
Comments: 10 Pages.

For the free electromagnetic field, there are two kinds of wave equation: one is the Maxwell wave equation, the other is the generalized wave equation. In this paper, using a matrix transformation, the author transforms the general quadratic form into a diagonal matrix. Both forms of the wave equation can then be obtained: one is the Maxwell wave equation, the other is the second form of the wave equation. In the latter half of the paper, the author establishes two other vibrator differential equations.
Category: Statistics

[48] viXra:1211.0132 [pdf] submitted on 2012-11-22 08:37:01

Data Imputation in Trials with Genotype×environment Interaction (In Portuguese)

Authors: Sergio Arciniegas-Alarcón, Marisol García-Peña, Carlos Tadeu dos Santos Dias
Comments: 7 Pages. Paper in Portuguese

The aim of this work was the study of prediction errors associated with four imputation methods applied to solve the problem of unbalance in experiments with genotype×environment (G×E) interaction. A simulation study was carried out based on four complete matrices of real data obtained in trials of interaction G×E of pea, cotton, beans and eucalyptus, respectively. The simulation of unbalance was done with random withdrawal of 10, 20 and 40% in each matrix. The prediction errors were found using cross-validation and were tested in classic intervals of 95% for missing data. For data imputation, algorithms were considered using models of additive effects without interaction and model estimates of additive effects with multiplicative interaction based on robust submodels. In general, the best prediction errors were obtained after imputation through an additive model without interaction.
Category: Statistics

[47] viXra:1211.0131 [pdf] submitted on 2012-11-22 08:15:53

Ammi Analysis with Imputed Data in Genotype X Environment Interaction Experiments in Cotton (In Portuguese)

Authors: Sergio Arciniegas-Alarcón, Carlos Tadeu dos Santos Dias
Comments: 7 Pages. Paper in Portuguese

The objective of this work was to evaluate the convenience of defining the number of multiplicative components of additive main effect and multiplicative interaction models (AMMI) in genotype x environment interaction experiments in cotton with imputed or unbalanced data. A simulation study was carried out based on a matrix of real seed-cotton productivity data obtained in trials with genotype x environment interaction carried out with 15 genotypes at 27 locations in Brazil. The simulation was made with random withdrawals of 10, 20 and 30% of the data. The optimal number of multiplicative components for the AMMI model was determined using the Cornelius test and the likelihood ratio test on the matrix completed by imputation. A correction based on the missing data in the Cornelius procedure was proposed for testing the hypothesis when the analysis is made from averages and the repetitions are not available. For data imputation, the methods considered used robust submodels, alternating least squares and multiple imputation. For analysis of unbalanced experiments, it is advisable to choose the number of multiplicative components of the AMMI model only from the observed information and to make the classical estimation of parameters based on the matrices completed by imputation.
Category: Statistics

[46] viXra:1211.0129 [pdf] submitted on 2012-11-21 13:18:47

Duality in Robust Dynamic Programming

Authors: Shyam S Chandramouli
Comments: 10 Pages.

Many decision making problems that arise in Finance, Economics, Inventory etc. can be formulated as Markov Decision Problems (MDPs) and solved using Dynamic Programming techniques. Further, the need to mitigate the statistical errors in estimating the underlying transition matrix, or to exercise optimal control under an adversarial setup, led to the study of robust formulations of the same problems in Ghaoui and Nilim and in Iyengar. In this work, we study the computational methodologies to develop and validate feasible control policies for the Robust Dynamic Programming Problem. In terms of developing control policies, the current work can be seen as generalizing the existing literature on Approximate Dynamic Programming (ADP) to its robust counterpart. The work also generalizes the Information Relaxation and Dual approach of Brown, Smith and Sun to robust multi period problems. While discussing this framework we approach it both from a discrete control perspective and also as a set of conditional continuous measures as in Ghaoui and Nilim and in Iyengar. We show numerical experiments on applications like ... In a nutshell, we expand the gamut of problems that the dual approach can handle in terms of developing tight bounds on the value function.
Category: Statistics

[45] viXra:1211.0127 [pdf] submitted on 2012-11-21 10:29:40

A Convex Optimization Approach to Multiple Stopping

Authors: Shyam S Chandramouli
Comments: 22 Pages.

In this current work, we generalize the recent Pathwise Optimization approach of Desai et al. (2010) to Multiple stopping problems. The approach also minimizes the dual bound as in Desai et al. (2010) to find the best approximation architecture for the Multiple stopping problem. Though we establish the convexity of the dual operator in this setting as well, we cannot directly take advantage of this property because of the computational issues that arise due to the combinatorial nature of the problem. Hence, we deviate from the pure martingale dual approach to the marginal dual approach of Meinshausen and Hambly (2004) and solve each such optimal stopping problem in the framework of Desai et al. (2010). Though this Pathwise Optimization approach, as generalized to the Multiple stopping problem, is computationally intensive, we highlight that it can produce superior dual and primal bounds in certain settings.
Category: Statistics

[44] viXra:1211.0113 [pdf] submitted on 2012-11-19 13:56:24

Maximum Likelihood Estimation of the Negative Binomial Distribution

Authors: Stephen Crowley
Comments: 2 Pages.

Maximum likelihood estimation of the negative binomial distribution via numerical methods is discussed.
Category: Statistics

[43] viXra:1211.0094 [pdf] submitted on 2012-11-16 15:47:51

Point Process Models for Multivariate High-Frequency Irregularly Spaced Data

Authors: Stephen Crowley
Comments: 6 Pages.

Definitions from the theory of point processes are recalled. Models of intensity function parameterization and maximum likelihood estimation from data are explored. Closed-form log-likelihood expressions are given for the Hawkes process, Autoregressive Conditional Duration (ACD), and Log-ACD models. The Autoregressive Conditional Intensity model is also discussed.
Category: Statistics

[42] viXra:1210.0065 [pdf] submitted on 2012-10-12 11:13:21

An Alternative Methodology for Imputing Missing Data in Trials with Genotype-by-Environment Interaction

Authors: Sergio Arciniegas-Alarcón, Marisol García-Peña, Carlos Tadeu dos Santos Dias, Wojtek Janusz Krzanowski
Comments: 14 Pages.

A common problem in multi-environment trials arises when some genotype-by-environment combinations are missing. The aim of this paper is to propose a new deterministic imputation algorithm using a modification of the Gabriel cross-validation method. The method involves the singular value decomposition (SVD) of a matrix and was tested using three alternative component choices of the SVD in simulations based on two complete sets of real data, with values deleted randomly at different rates. The quality of the imputations was evaluated using the correlations and the mean square deviations between these estimates and the true observed values. The proposed methodology does not make any distributional or structural assumptions and does not have any restrictions regarding the pattern or mechanism of the missing data.
Category: Statistics

[41] viXra:1208.0053 [pdf] submitted on 2012-08-12 10:38:58

On an Application of Bayesian Estimation

Authors: Kiyoharu Tanaka, Evgeniy Grechnikov
Comments: 5 Pages.

This paper explains the Bayesian version of estimation as a method for calculating the credibility premium or credibility number of claims for short-term insurance contracts using two ingredients: past data on the risk itself and collateral data from other sources considered to be relevant. The Poisson/gamma model to estimate the claim frequency for a portfolio of policies and the normal/normal model to estimate the pure premium are explained and applied.
Category: Statistics
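
The Poisson/gamma model mentioned in entry [41] above can be illustrated with the standard conjugate update. This is a minimal sketch under textbook assumptions, not the paper's own code: claim counts are taken to be Poisson with an unknown rate, the prior on the rate is Gamma with shape alpha and rate beta, and the function names and example figures are hypothetical.

```python
from fractions import Fraction


def posterior_mean_claim_rate(claims, alpha, beta):
    """Posterior mean of the Poisson rate under a Gamma(alpha, beta) prior.

    With n observed periods and total claims S, the posterior is
    Gamma(alpha + S, beta + n), whose mean is (alpha + S) / (beta + n).
    """
    n = len(claims)
    total = sum(claims)
    return Fraction(alpha + total, beta + n)


def credibility_estimate(claims, alpha, beta):
    """Same quantity written as Z * sample mean + (1 - Z) * prior mean."""
    n = len(claims)
    z = Fraction(n, n + beta)                # credibility factor
    sample_mean = Fraction(sum(claims), n)   # past data on the risk itself
    prior_mean = Fraction(alpha, beta)       # collateral information
    return z * sample_mean + (1 - z) * prior_mean


# Hypothetical data: four periods of claim counts, prior mean 3/2.
claims = [2, 3, 1, 4]
print(posterior_mean_claim_rate(claims, alpha=3, beta=2))
print(credibility_estimate(claims, alpha=3, beta=2))
```

Both functions return the same value, which shows how the Bayesian estimate weights the risk's own experience against the prior mean through the credibility factor Z = n / (n + beta).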

[40] viXra:1205.0104 [pdf] submitted on 2012-05-28 01:49:47

Data Mining Career Batting Performances in Baseball

Authors: David D. Tung
Comments: 30 Pages.

In this paper, we use statistical data mining techniques to analyze a multivariate data set of career batting performances in Major League Baseball. Principal components analysis (PCA) is used to transform the high-dimensional data to its lower-dimensional principal components, which retain a high percentage of the sample variation, hence reducing the dimensionality of the data. From PCA, we determine a few important key factors of classical and sabermetric batting statistics, and the most important of these is a new measure, which we call Offensive Player Grade (OPG), that efficiently summarizes a player’s offensive performance on a numerical scale. The determination of these lower-dimensional principal components allows for accessible visualization of the data, and for segmentation of players into groups using clustering, which is done here using the K-means clustering algorithm. We provide illuminating visual displays from our statistical data mining procedures, and we also furnish a player listing of the top 100 OPG scores which should be of interest to those that follow baseball.
Category: Statistics

[39] viXra:1205.0093 [pdf] submitted on 2012-05-24 02:48:37

Mathematical Analysis of the Problems faced by the People With Disabilities (PWDs) (With Specific Reference to Tamil Nadu in India)

Authors: W. B. Vasantha Kandasamy, Florentin Smarandache, A. Praveen Prakash
Comments: 165 Pages.

The authors in this book have analyzed the socio-economic and psychological problems faced by People with Disabilities (PWDs) and their families. The study was made by collecting data from 2,15,811 lakhs people, using a fuzzy linguistic questionnaire or, for those who are not literate, interviews. This data was collected through five Non Government Organizations (NGOs) from northern Tamil Nadu.
Category: Statistics

[38] viXra:1204.0100 [pdf] submitted on 2012-04-28 18:43:37

Investigating “overclocking” an Android Mobile Phone with a Designed Experiment

Authors: Glen Gilchrist
Comments: 10 Pages.

Obtaining additional computational speed from a central processing unit by driving a computer system at clock frequencies higher than the default settings has long been used as a method to inexpensively “boost” the performance of a computer. With the emergence of so-called smart-phones and the openness of the Android operating system, such tweaks have recently been applied to mobile handsets. This paper investigates the performance gains to an off-the-shelf handset with the custom Skatie Rom (C3C0, 2012) by means of a statistically designed experiment. A Taguchi Orthogonal Array was used to investigate 5 factors, each at 3 levels, on the performance as measured with the Aurora Softworks Quadrant application. Unsurprisingly, the core CPU speed had the largest effect on overall performance, but we demonstrate that the CPU Governor (lagfree) and the I/O Scheduler (noop) were also significant at p=0.000, whilst the size of the SD Card Cache is significant to p=0.065.
Category: Statistics

[37] viXra:1204.0077 [pdf] submitted on 2012-04-18 09:37:00

Multivariate Ratio Estimation With Known Population Proportion Of Two Auxiliary Characters For Finite Population

Authors: Rajesh Singh, Sachin Malik, A. A. Adewara, Florentin Smarandache
Comments: 8 Pages.

In the present study, we propose estimators based on the geometric and harmonic means for estimating the population mean using information on two auxiliary attributes in simple random sampling. We have shown that, when we have multi-auxiliary attributes, estimators based on the geometric mean and harmonic mean are less biased than the Olkin (1958), Naik and Gupta (1996) and Singh (1967) type estimators under certain conditions. However, the MSEs of the Olkin (1958) estimator and the geometric and harmonic estimators are the same up to the first order of approximation.
Category: Statistics

[36] viXra:1203.0081 [pdf] submitted on 2012-03-21 19:54:13

A Critique of Gy’s Sampling Theory

Authors: D.S. Dihalu, B. Geelhoed
Comments: 7 Pages.

One of the most used theories for the sampling of materials for physical, chemical or biological testing is the theory developed by Pierre Gy. After a number of scientific publications, including several in the French language (e.g. Gy, 1953, 1964, 1975), he made –in 1979– his entire new theory available to the worldwide sampling community in a book (Gy, 1979) written in English. This book contains a complete description of Gy’s sampling theory. Later, Gy has made several refinements, but the essential character of the theory has always remained the same as the theory described in his 1979 book. The impact of this book (and the entire theory of Gy) has been significant; even nowadays this book is regarded as the number one source of sampling-related information for engineers and process operators. Even though the practical impact and the scientific value of this work are unquestionably strong, several critical points of discussion need to be mentioned here, because the development of new technologies, recent experimental results and novel insights show that parts of Gy’s theory need to be updated or revised.
Category: Statistics

[35] viXra:1112.0092 [pdf] submitted on 2011-12-31 06:41:31

Gauss Distribution-a Complete Proof Only Ita

Authors: Leonardo Rubino
Comments: 13 Pages.

In this paper (for the moment in Italian only, sorry) you can find a rare proof of the Gauss Distribution Law, as almost all available books (in the opinion of the writer) just state it, without giving any proof. You can also find a description of the 3-sigma rule, widely used in technology. Lastly, a peculiarity of quantum mechanics is highlighted: requiring the equality sign in the Heisenberg uncertainty relations leads to a Gaussian wave function, which reduces the quantum uncertainty to its minimum value.
Category: Statistics
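
As a small illustration of the 3-sigma rule mentioned in entry [35], the sketch below computes the probability that a normally distributed variable falls within k standard deviations of its mean, using the standard identity P(|X - mu| < k sigma) = erf(k / sqrt(2)). The function name `normal_coverage` is chosen here for illustration and does not come from the paper.

```python
import math


def normal_coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


# Coverage for the 1-, 2- and 3-sigma bands.
for k in (1, 2, 3):
    print(f"P(|X - mu| < {k} sigma) = {normal_coverage(k):.4f}")
```

For k = 3 the coverage is about 0.9973, which is the figure usually quoted for the 3-sigma rule.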

[34] viXra:1111.0073 [pdf] submitted on 21 Nov 2011

Kullback-Leibler Simplex

Authors: Popon Kangpenkae
Comments: 12 pages

This technical reference presents the functional structure and the algorithmic implementation of the KL (Kullback-Leibler) simplex. It details the simplex approximation and fusion. The KL simplex is a fundamental, robust, adaptive informatics agent for computational research in economics, finance, game and mechanism. From this perspective the study provides comprehensive results to facilitate future work in such areas.
Category: Statistics

[33] viXra:1107.0049 [pdf] submitted on 24 Jul 2011

Studies in Sampling Techniques and Time Series Analysis

Authors: Rajesh Singh, Florentin Smarandache
Comments: 72 pages

This book has been designed for students and researchers who are working in the field of time series analysis and estimation in finite populations. There are papers by Rajesh Singh, Florentin Smarandache, Shweta Maurya, Ashish K. Singh, Manoj Kr. Chaudhary, V. K. Singh, Mukesh Kumar and Sachin Malik. The first chapter deals with the problem of time series and the remaining four chapters deal with problems of estimation in finite populations.
Category: Statistics

[32] viXra:1105.0038 [pdf] submitted on 25 May 2011

The Numerical Generalised Least-Squares Estimator of an Unknown Constant Mean of Random Field

Authors: T Suslo
Comments: 9 pages.

We constrain on computer the best linear unbiased generalized statistics of random field for the best linear unbiased generalized statistics of an unknown constant mean of random field and derive the numerical generalized least-squares estimator of an unknown constant mean of random field. We derive the third constraint of spatial statistics and show that the classic generalized least-squares estimator of an unknown constant mean of the field is only an asymptotic disjunction of the numerical one.
Category: Statistics

[31] viXra:1105.0026 [pdf] submitted on 16 May 2011

Asymptotic Moments of Near Neighbor Distances for the Gaussian Distribution

Authors: Elia Liitiäinen
Comments: 37 pages, Submitted to a journal

We study the moments $E[d_{1,k}^{\alpha}]$ of the k-th nearest neighbor distance for independent identically distributed points in $\mathbb{R}^n$. In the earlier literature, the case $\alpha > n$ has been analyzed by assuming a bounded support for the underlying density. The boundedness assumption is removed by assuming the multivariate Gaussian distribution. In this case, the nearest neighbor distances show very different behavior in comparison to earlier results. In the unbounded case, it is shown that $E[d_{1,k}^{\alpha}]$ is asymptotically proportional to $M^{-1} \log^{n-1-\alpha/2} M$ instead of $M^{-\alpha/n}$ as in the previous literature.
Category: Statistics

[30] viXra:1011.0070 [pdf] submitted on 29 Nov 2010

Determinants of Population Growth in Rajasthan: An Analysis

Authors: V.V. Singh, Alka Mittal, Neetish Sharma, Florentin Smarandache
Comments: 12 pages

Rajasthan is the biggest State of India and is currently in the second phase of demographic transition, moving towards the third phase at a very slow pace. However, the state's population will continue to grow for some time. Rajasthan's performance in the social and economic sectors has been poor in the past. The poor performance is the outcome of poverty, illiteracy and poor development, which co-exist and reinforce each other. There are many demographic and socio-economic factors responsible for population growth. This paper attempts to identify the demographic and socio-economic variables which are responsible for population growth in Rajasthan with the help of multivariate analysis.
Category: Statistics

[29] viXra:1010.0054 [pdf] submitted on 20 Mar 2010

A Class Of Separate-Type Estimators For Population Mean In Stratified Sampling Using Known Parameters Under Non-Response

Authors: Manoj K. Chaudhary, Rajesh Singh, Mukesh Kumar, Rakesh K. Shukla, Florentin Smarandache
Comments: 11 pages

The objective of the present paper is to propose a family of separate-type estimators of population mean in stratified random sampling in presence of nonresponse based on the family of estimators proposed by Khoshnevisan et al. (2007). Under simple random sampling without replacement (SRSWOR) the expressions of bias and mean square error (MSE) up to the first order of approximation are derived. The comparative study of the family with respect to the usual estimator has been discussed. The expressions for optimum sample sizes of the strata in respect to cost of the survey have also been derived. An empirical study is carried out to show the properties of the estimators.
Category: Statistics

[28] viXra:1008.0044 [pdf] submitted on 16 Aug 2010

Degrees of Freedom: A Correction to Chi Square For Physical Hypotheses

Authors: John Michael Williams
Comments: 47 Pages.

In common practice, degrees of freedom (df) may be corrected for the number of theoretical free parameters as though parameters were the same as data categories. However, a free physical parameter generally is not equivalent to a data category in terms of goodness of the fit. Here we use synthetic, nonrandom data to show the effect of choice of categorization and df on goodness of fit. We then explain the origin of the df problem and show how to avoid it in a three-step process: First, the theoretical curve is fit to the data to remove its variance, leaving what, under the null hypothesis, should be structureless residuals. Second, the residuals are fit by a set of orthogonal polynomials up to the degree, should it occur, at which significant variance was removed. Third, the number of nonsignificant polynomial terms in the original + orthogonal set become the df in a standard chi square test. This process reduces a general df problem to one of polynomial df and allows goodness of a fit to be determined by data categorization and significance level alone. An example is given of an evaluation of physical data on neutrino oscillation.
Category: Statistics

[27] viXra:1008.0034 [pdf] submitted on 11 Aug 2010

Rural Migration A Significant Cause Of Urbanization: A District Level Review Of Census Data For Rajasthan

Authors: Jayant Singh, Hansraj Yadav, Florentin Smarandache
Comments: 8 pages

Migration plays an important role in the urbanization of a state. In general, the greater the migration, the higher the urbanization rate; although this may not hold in every situation, migration is generally seen to account for a fairly large share of urbanization. A district-level analysis for Rajasthan state is attempted to comprehend urbanization due to migration and their interlinkages and association.
Category: Statistics

[26] viXra:1008.0033 [pdf] submitted on 11 Aug 2010

Urbanization Due To Migration: A District Level Analysis Of Migrants From Different Distances For The Rajasthan State

Authors: Jayant Singh, Hansraj Yadav, Florentin Smarandache
Comments: 10 pages

People migrate to different distances and their migration is governed by different reasons. Distance of place of migration plays an important role in the migration process, and an analysis based on the remoteness of the origin and destination will reveal the push and pull factors in a more explicit way. However, a common phenomenon is that people do migrate to a longer distance with a more focused objective and their propensity to settle in urban areas is always higher than in small-distance migration.
Category: Statistics

[25] viXra:1008.0020 [pdf] submitted on 7 Aug 2010

A Family of Estimators for Estimating Population Mean in Stratified Sampling under Non-Response

Authors: Manoj K. Chaudhary, Rajesh Singh, Rakesh K. Shukla, Mukesh Kumar, Florentin Smarandache
Comments: 8 pages

Khoshnevisan et al. (2007) proposed a general family of estimators for population mean using known value of some population parameters in simple random sampling. The objective of this paper is to propose a family of combined-type estimators in stratified random sampling adapting the family of estimators proposed by Khoshnevisan et al. (2007) under non-response. The properties of proposed family have been discussed. We have also obtained the expressions for optimum sample sizes of the strata in respect to cost of the survey. Results are also supported by numerical analysis.
Category: Statistics

[24] viXra:1007.0034 [pdf] submitted on 23 Jul 2010

On the Gini Mean Difference Arc-Lengths Test for Circular Data

Authors: David D. Tung, S. Rao Jammalamadaka
Comments: 15 pages.

In this paper, we propose a new test of uniformity on the circle based on the Gini mean difference of the sample arc-lengths, i.e. the gaps between successive observations on the circumference of the circle. These sample arc-lengths are analogous to sample spacings, which are the gaps between successive observations on the real line. Such a Gini mean difference test is analogous to Rao's spacings test, which has been used to test the uniformity of circular data. We obtain both the exact and asymptotic distributions of the Gini mean difference arc-lengths test, under the null hypothesis of circular uniformity. We also provide a table of upper percentile values of the exact distribution for small to moderate sample sizes. Some examples of circular data analysis are also considered. It is also seen that the Gini mean difference arc-lengths test is more asymptotically efficient than Rao's test in the sense of Pitman asymptotic relative efficiency.
Category: Statistics
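
To make the quantities in entry [24] concrete, here is a minimal sketch that computes the sample arc-lengths for angles on the circle and their Gini mean difference. It uses the textbook definition (the mean of |x_i - x_j| over all pairs i ≠ j); the normalisation of the paper's actual test statistic may differ, and the function names are illustrative assumptions rather than anything taken from the paper.

```python
import math
from itertools import combinations


def arc_lengths(angles):
    """Gaps between successive observations on the circle (radians).

    The last gap wraps around from the largest angle back to the smallest,
    so the gaps always sum to 2*pi. Needs at least two observations.
    """
    pts = sorted(a % (2 * math.pi) for a in angles)
    gaps = [pts[i + 1] - pts[i] for i in range(len(pts) - 1)]
    gaps.append(2 * math.pi - (pts[-1] - pts[0]))
    return gaps


def gini_mean_difference(values):
    """Mean absolute difference over all pairs i != j (textbook definition)."""
    n = len(values)
    total = sum(abs(a - b) for a, b in combinations(values, 2))
    return 2.0 * total / (n * (n - 1))


# Perfectly even spacing gives identical arc-lengths, hence a statistic of zero.
even = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
print(gini_mean_difference(arc_lengths(even)))

# Clustered observations give very unequal arc-lengths and a larger value.
clustered = [0.0, 0.1, 0.2, 0.3]
print(gini_mean_difference(arc_lengths(clustered)))
```

Under this definition, unusually unequal arc-lengths (a large statistic) point away from uniformity; critical values would have to come from the exact or asymptotic distributions derived in the paper.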

[23] viXra:1007.0016 [pdf] submitted on 13 Mar 2010

Studies in Statistical Inference, Sampling Techniques and Demography

Authors: Rajesh Singh, Jayant Singh, Florentin Smarandache
Comments: 64 pages

This volume is a collection of five papers. Two chapters deal with problems in statistical inference, two with inference in finite populations, and one with a demographic problem. The ideas included here will be useful for researchers working in these fields.
Category: Statistics

[22] viXra:1006.0046 [pdf] submitted on 18 Jun 2010

U-Statistics Based on Spacings

Authors: David D. Tung, S. Rao Jammalamadaka
Comments: 23 pages.

In this paper, we investigate the asymptotic theory for U-statistics based on sample spacings, i.e. the gaps between successive observations. The usual asymptotic theory for U-statistics does not apply here because spacings are dependent variables. However, under the null hypothesis, the uniform spacings can be expressed as conditionally independent Exponential random variables. We exploit this idea to derive the relevant asymptotic theory both under the null hypothesis and under a sequence of close alternatives. The generalized Gini mean difference of the sample spacings is a prime example of a U-statistic of this type. We show that such a Gini spacings test is analogous to Rao's spacings test. We find the asymptotically locally most powerful test in this class, and it has the same efficacy as the Greenwood statistic.
Category: Statistics

[21] viXra:1005.0068 [pdf] submitted on 11 Mar 2010

Randomness and Optimal Estimation in Data Sampling

Authors: M. Khoshnevisan, S. Saxena, H. P. Singh, S. Singh, Florentin Smarandache
Comments: 63 pages.

The purpose of this book is to postulate some theories and test them numerically. Estimation is often a difficult task and it has wide application in social sciences and financial markets. In order to obtain the optimum efficiency for some classes of estimators, we have divided this book into three specialized sections.
Category: Statistics

[20] viXra:1005.0048 [pdf] submitted on 11 Mar 2010

Estimation of Mean in Presence of Non Response Using Exponential Estimator

Authors: Rajesh Singh, Mukesh Kumar, Manoj K. Chaudhary, Florentin Smarandache
Comments: 11 pages

This paper considers the problem of estimating the population mean using information on auxiliary variable in presence of non response. Exponential ratio and exponential product type estimators have been suggested and their properties are studied. An empirical study is carried out to support the theoretical results.
Category: Statistics

[19] viXra:1005.0020 [pdf] submitted on 8 May 2010

Confidence Intervals for the Pythagorean Formula in Baseball

Authors: David D. Tung
Comments: 27 Pages.

In this paper, we will investigate the problem of obtaining confidence intervals for a baseball team's Pythagorean expectation, i.e. their expected winning percentage and expected games won. We study this problem from two different perspectives. First, in the framework of regression models, we obtain confidence intervals for prediction, i.e. more formally, prediction intervals for a new observation, on the basis of historical binomial data for Major League Baseball teams from the 1901 through 2009 seasons, and apply this to the 2009 MLB regular season. We also obtain a Scheffé-type simultaneous prediction band and use it to tabulate predicted winning percentages and their prediction intervals, corresponding to a range of values for log(RS/RA). Second, parametric bootstrap simulation is introduced as a data-driven, computer-intensive approach to numerically computing confidence intervals for a team's expected winning percentage. Under the assumption that runs scored per game and runs allowed per game are random variables following independent Weibull distributions, we numerically calculate confidence intervals for the Pythagorean expectation via parametric bootstrap simulation on the basis of each team's runs scored per game and runs allowed per game from the 2009 MLB regular season. The interval estimates, from either framework, allow us to infer with better certainty as to which teams are performing above or below expectations. It is seen that the bootstrap confidence intervals appear to be better at detecting which teams are performing above or below expectations than the prediction intervals obtained in the regression framework.
Category: Statistics
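
For readers unfamiliar with the Pythagorean expectation in entry [19], the sketch below shows the point estimate only, in its commonly quoted form RS^γ / (RS^γ + RA^γ). The default exponent of 2 is the classical choice and is an assumption here; the paper may estimate a different exponent, and its prediction and bootstrap intervals are not reproduced. The run totals in the example are made up.

```python
def pythagorean_expectation(runs_scored: float, runs_allowed: float,
                            exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored (RS) and runs allowed (RA)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team: 800 runs scored, 700 allowed over a 162-game season.
win_pct = pythagorean_expectation(800, 700)
expected_wins = 162 * win_pct
print(f"expected winning percentage: {win_pct:.3f}")
print(f"expected games won: {expected_wins:.1f}")
```

A team whose actual winning percentage sits well above or below this value is the kind of case the interval estimates in the paper are meant to flag.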

[18] viXra:1005.0003 [pdf] submitted on 10 Mar 2010

N-Algebraic Structures and S-N-Algebraic Structures

Authors: W. B. Vasantha Kandasamy, Florentin Smarandache
Comments: 209 pages

In this book, for the first time we introduce the notions of N-groups, N-semigroups, N-loops and N-groupoids. We also define a mixed N-algebraic structure. We expect the reader to be well versed in group theory and have at least basic knowledge about Smarandache groupoids, Smarandache loops, Smarandache semigroups and bialgebraic structures and Smarandache bialgebraic structures.
Category: Statistics

[17] viXra:1004.0076 [pdf] submitted on 8 Mar 2010

Improved Exponential Estimator for Population Variance Using Two Auxiliary Variables

Authors: Rajesh Singh, Pankaj Chauhan, Nirmala Sawan, Florentin Smarandache
Comments: 8 pages

In this paper exponential ratio and exponential product type estimators using two auxiliary variables are proposed for estimating the unknown population variance $S_y^2$. The problem is extended to the case of two-phase sampling. Theoretical results are supported by an empirical study.
Category: Statistics

[16] viXra:1004.0064 [pdf] submitted on 8 Mar 2010

Improvement in Estimating Population Mean using Two Auxiliary Variables in Two-Phase Sampling

Authors: Rajesh Singh, Pankaj Chauhan, Nirmala Sawan, Florentin Smarandache
Comments: 11 pages

This study proposes an improved chain-ratio type estimator for estimating population mean using some known values of population parameter(s) of the second auxiliary character. The proposed estimators have been compared with the two-phase ratio estimator and some other chain type estimators. The performances of the proposed estimators have been supported by a numerical illustration.
Category: Statistics

[15] viXra:1004.0063 [pdf] submitted on 8 Mar 2010

Optimum Statistical Test Procedure

Authors: Rajesh Singh, Jayant Singh, Florentin Smarandache
Comments: 16 pages

Optimum Statistical Test Procedure
Category: Statistics

[14] viXra:1004.0062 [pdf] submitted on 8 Mar 2010

Ratio-Product Type Exponential Estimator For Estimating Finite Population Mean Using Information On Auxiliary Attribute

Authors: Rajesh Singh, Pankaj Chauhan, Nirmala Sawan, Florentin Smarandache
Comments: 15 pages

In practice, information regarding the population proportion possessing a certain attribute is easily available; see Jhajj et al. (2006). For estimating the population mean Y of the study variable y, following Bahl and Tuteja (1991), a ratio-product type exponential estimator has been proposed by using the known information of population proportion possessing an attribute (highly correlated with y) in simple random sampling. The expressions for the bias and the mean-squared error (MSE) of the estimator and its minimum value have been obtained. The proposed estimator improves on the mean per unit estimator, the ratio and product type exponential estimators, and the Naik and Gupta (1996) estimators. The results have also been extended to the case of two phase sampling. The results obtained have been illustrated numerically by taking some empirical populations considered in the literature.
Category: Statistics

[13] viXra:1004.0061 [pdf] submitted on 8 Mar 2010

Almost Unbiased Exponential Estimator for the Finite Population Mean

Authors: Rajesh Singh, Pankaj Chauhan, Nirmala Sawan, Florentin Smarandache
Comments: 12 pages

In this paper we have proposed an almost unbiased ratio and product type exponential estimator for the finite population mean Y-bar. It has been shown that the Bahl and Tuteja (1991) ratio and product type exponential estimators are particular members of the proposed estimator. An empirical study is carried out to demonstrate the superiority of the proposed estimator.
Category: Statistics
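
Several entries in this listing, including [13], build on the ratio and product type exponential estimators of Bahl and Tuteja (1991). As a rough orientation, the sketch below computes the classical ratio estimator alongside those two exponential forms from a sample mean of the study variable, a sample mean of the auxiliary variable and the known population mean of the auxiliary variable. It does not implement the almost unbiased estimator proposed in the paper; the function and argument names are illustrative assumptions.

```python
import math


def ratio_estimator(y_bar: float, x_bar: float, X_bar: float) -> float:
    """Classical ratio estimator: y_bar * (X_bar / x_bar)."""
    return y_bar * X_bar / x_bar


def exp_ratio_estimator(y_bar: float, x_bar: float, X_bar: float) -> float:
    """Exponential ratio-type form: y_bar * exp((X_bar - x_bar) / (X_bar + x_bar))."""
    return y_bar * math.exp((X_bar - x_bar) / (X_bar + x_bar))


def exp_product_estimator(y_bar: float, x_bar: float, X_bar: float) -> float:
    """Exponential product-type form: y_bar * exp((x_bar - X_bar) / (x_bar + X_bar))."""
    return y_bar * math.exp((x_bar - X_bar) / (x_bar + X_bar))


# Made-up numbers: sample means y_bar = 10 and x_bar = 4, known population mean X_bar = 5.
y_bar, x_bar, X_bar = 10.0, 4.0, 5.0
print(ratio_estimator(y_bar, x_bar, X_bar))
print(exp_ratio_estimator(y_bar, x_bar, X_bar))
print(exp_product_estimator(y_bar, x_bar, X_bar))
```

When the sample mean of the auxiliary variable equals its population mean, all three reduce to the plain sample mean; the exponential forms adjust more gently than the classical ratio when the two differ.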

[12] viXra:1004.0056 [pdf] submitted on 8 Mar 2010

Almost Unbiased Ratio and Product Type Estimator of Finite Population Variance Using the Knowledge of Kurtosis of an Auxiliary Variable in Sample Surveys

Authors: Rajesh Singh, Pankaj Chauhan, Nirmala Sawan, Florentin Smarandache
Comments: 11 pages

It is well recognized that the use of auxiliary information in sample survey design results in efficient estimators of population parameters under some realistic conditions. Among many, the ratio, product and regression methods of estimation are good examples in this context. Using the knowledge of kurtosis of an auxiliary variable, Upadhyaya and Singh (1999) have suggested an estimator for population variance. In this paper, following the approach of Singh and Singh (1993), we have suggested almost unbiased ratio and product-type estimators for population variance.
Category: Statistics

[11] viXra:1004.0054 [pdf] submitted on 8 Mar 2010

Alternatives To Pearson's and Spearman's Correlation Coefficients

Authors: Florentin Smarandache
Comments: 9 pages

This article presents several alternatives to Pearson's correlation coefficient and many examples. In the samples where the rank in a discrete variable counts more than the variable values, the mixture of Pearson's and Spearman's gives a better result.
Category: Statistics

[10] viXra:1003.0183 [pdf] submitted on 6 Mar 2010

A Note On Testing Of Hypothesis

Authors: Rajesh Singh, Jayant Singh, Florentin Smarandache
Comments: 5 pages

In this paper the problem of hypothesis testing is discussed when the samples have been drawn from a normal distribution. The study of hypothesis testing is also extended to the Bayesian set-up.
Category: Statistics

[9] viXra:1003.0172 [pdf] submitted on 6 Mar 2010

A General Family of Estimators for Estimating Population Mean Using Known Value of Some Population Parameter(s)

Authors: M. Khoshnevisan, Rajesh Singh, Pankaj Chauhan, Nirmala Sawan, Florentin Smarandache
Comments: 11 pages

A general family of estimators for estimating the population mean of the variable under study, which make use of known value of certain population parameter(s), is proposed. Under the Simple Random Sampling Without Replacement (SRSWOR) scheme, the expressions of bias and mean-squared error (MSE) up to first order of approximation are derived. Some well known estimators are shown to be particular members of this family. An empirical study is carried out to illustrate the performance of the constructed estimator over others.
Category: Statistics

[8] viXra:1003.0137 [pdf] submitted on 6 Mar 2010

Empirical Study in Finite Correlation Coefficient in Two Phase Estimation

Authors: M. Khoshnevisan, F. Kaymram, Housila P. Singh, Rajesh Singh, Florentin Smarandache
Comments: 10 pages

This paper proposes a class of estimators for population correlation coefficient when information about the population mean and population variance of one of the variables is not available but information about these parameters of another variable (auxiliary) is available, in two phase sampling and analyzes its properties. Optimum estimator in the class is identified with its variance formula. The estimators of the class involve unknown constants whose optimum values depend on unknown population parameters. Following Singh (1982) and Srivastava and Jhajj (1983), it has been shown that when these population parameters are replaced by their consistent estimates the resulting class of estimators has the same asymptotic variance as that of optimum estimator. An empirical study is carried out to demonstrate the performance of the constructed estimators.
Category: Statistics

[7] viXra:1003.0136 [pdf] submitted on 6 Mar 2010

Econometric Analysis on Efficiency of Estimator

Authors: M. Khoshnevisan, F. Kaymram, Housila P. Singh, Rajesh Singh, Florentin Smarandache
Comments: 11 pages

This paper investigates the efficiency of an alternative to ratio estimator under the super population model with uncorrelated errors and a gamma-distributed auxiliary variable. Comparisons with usual ratio and unbiased estimators are also made.
Category: Statistics

[6] viXra:1003.0130 [pdf] submitted on 6 Mar 2010

A Family of Estimators of Population Mean Using Multiauxiliary Information in Presence of Measurement Errors

Authors: Jack Allen, Housila P. Singh, Florentin Smarandache
Comments: 16 pages

This paper proposes a family of estimators of population mean using information on several auxiliary variables and analyzes its properties in the presence of measurement errors.
Category: Statistics

[5] viXra:1003.0128 [pdf] submitted on 6 Mar 2010

Estimation of Weibull Shape Parameter by Shrinkage Towards an Interval Under Failure Censored Sampling

Authors: Housila P. Singh, Sharad Saxena, Jack Allen, Sarjinder Singh, Florentin Smarandache
Comments: 20 pages

This paper proposes a class of shrinkage estimators for the shape parameter β in failure censored samples from the two-parameter Weibull distribution when some a priori or guessed interval containing the parameter β is available in addition to sample information, and analyses their properties. Some estimators are generated from the proposed class and compared with the minimum mean squared error (MMSE) estimator. Numerical computations in terms of percent relative efficiency and absolute relative bias indicate that certain of these estimators substantially improve on the MMSE estimator in some guessed interval of the parameter space of β, especially for censored samples with small sizes. Subsequently, a modified class of shrinkage estimators is proposed with its properties.
Category: Statistics

[4] viXra:1003.0113 [pdf] submitted on 6 Mar 2010

Improvement in Estimating the Population Mean Using Exponential Estimator in Simple Random Sampling

Authors: Rajesh Singh, Pankaj Chauhan, Nirmala Sawan, Florentin Smarandache
Comments: 6 pages

This study proposes some exponential ratio-type estimators for estimating the population mean of the variable under study ... (see paper for full abstract)
Category: Statistics

[3] viXra:1003.0109 [pdf] submitted on 6 Mar 2010

A General Class of Estimators of Population Median Using Two Auxiliary Variables in Double Sampling

Authors: Jack Allen, Housila P. Singh, Sarjinder Singh, Florentin Smarandache
Comments: 21 pages

In this paper we have suggested two classes of estimators for population median $M_Y$ of the study character Y using information on two auxiliary characters X and Z in double sampling. It has been shown that the suggested classes of estimators are more efficient than the one suggested by Singh et al. (2001). Estimators based on estimated optimum values have also been considered with their properties. The optimum values of the first phase and second phase sample sizes are also obtained for the fixed cost of survey.
Category: Statistics

[2] viXra:1003.0092 [pdf] submitted on 6 Mar 2010

Auxiliary Information and a Priori Values in Construction of Improved Estimators

Authors: Rajesh Singh, Pankaj Chauhan, Nirmala Sawan, Florentin Smarandache
Comments: 75 pages

This volume is a collection of six papers on the use of auxiliary information and a priori values in construction of improved estimators. The work included here will be of immense application for researchers and students who employ auxiliary information in any form.
Category: Statistics

[1] viXra:1003.0091 [pdf] submitted on 6 Mar 2010

Ratio Estimators in Simple Random Sampling Using Information on Auxiliary Attribute

Authors: Rajesh Singh, Pankaj Chauhan, Nirmala Sawan, Florentin Smarandache
Comments: 7 pages

Some ratio estimators for estimating the population mean of the variable under study, which make use of information regarding the population proportion possessing a certain attribute, are proposed. Under the simple random sampling without replacement (SRSWOR) scheme, the expressions of bias and mean-squared error (MSE) up to the first order of approximation are derived. The results obtained have been illustrated numerically by taking some empirical populations considered in the literature.
Category: Statistics

Recent Replacements

[17] viXra:1301.0031 [pdf] replaced on 2013-03-06 20:13:47

On the Convergence of the Metropolis-Hastings Markov Chains

Authors: Dimiter Tsvetkov, Lyubomir Hristov, Ralitsa Angelova-Slavova
Comments: 14 Pages.

In this paper we consider Markov chains associated with the Metropolis-Hastings algorithm. We propose conditions under which the sequence of the successive densities of such a chain converges to the target density according to the total variation distance for any choice of the initial density. In particular we prove that the positiveness of the target and the proposal densities is enough for the chain to converge.
Category: Statistics
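
As background for the Metropolis-Hastings entries in this list, the following is a minimal random-walk Metropolis-Hastings sampler. It is a generic textbook sketch, not the authors' construction: the Gaussian proposal, the step size, the seed and the standard normal target are all assumptions made for illustration. Both the target and the proposal density here are positive everywhere, which is the kind of setting the abstract's convergence condition refers to.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # The proposal is symmetric, so the acceptance probability reduces
        # to min(1, target(proposal) / target(x)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def log_standard_normal(x):
    """Log-density of N(0, 1) up to an additive constant."""
    return -0.5 * x * x


samples = metropolis_hastings(log_standard_normal, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean: {mean:.3f}, sample variance: {var:.3f}")
```

The sample mean and variance should land near 0 and 1 respectively; the target only needs to be known up to a normalising constant because only ratios of target values enter the acceptance step.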

[16] viXra:1301.0031 [pdf] replaced on 2013-03-01 09:56:38

On the Convergence of the Metropolis-Hastings Markov Chains

Authors: Dimiter Tsvetkov, Lyubomir Hristov, Ralitsa Angelova-Slavova
Comments: 15 Pages.

In this paper we consider Markov chains associated with the Metropolis-Hastings algorithm. We propose conditions under which the sequence of the successive densities of such a chain converges to the target density according to the total variation distance for any choice of the initial density. In particular we prove that the positiveness of the target and the proposal densities is enough for the chain to converge.
Category: Statistics

[15] viXra:1301.0031 [pdf] replaced on 2013-02-04 04:29:28

On the Convergence of the Metropolis-Hastings Markov Chains

Authors: Dimiter Tsvetkov, Lyubomir Hristov, Ralitsa Angelova-Slavova
Comments: 14 Pages.

In this paper we consider Markov chains associated with the Metropolis-Hastings algorithm. We show that under some very general conditions the sequence of the powers of the conjugate transition operator has a strong limit in a properly defined Hilbert space described for example in Stroock (2005). Then we propose conditions under which the sequence of the successive densities of such a chain converges to the target density according to the total variation distance for any choice of the initial density. In particular we prove that the positiveness of the target and the proposal densities is enough for the chain to converge.
Category: Statistics

[14] viXra:1211.0094 [pdf] replaced on 2013-01-30 12:59:09

Point Process Models for Multivariate High-Frequency Irregularly Spaced Data

Authors: Stephen Crowley
Comments: 41 Pages.

Definitions from the theory of point processes are recalled. Models of intensity function parametrization and maximum likelihood estimation from data are explored. Closed-form log-likelihood expressions are given for the (exponential) Hawkes (univariate and multivariate) process, Autoregressive Conditional Duration (ACD), with both exponential and Weibull distributed errors, and a hybrid model combining the ACD and the exponential Hawkes models. Formulas are also derived, however without the elegant recursions of the exponential kernels, for kernels of the Weibull and Gamma type and comparison of the Weibull fit vs exponential kernel fits via QQ and probability plots are provided. The additional complexity of the Hawkes-Weibull or the ACD-Hawkes appears to not be worth the tradeoff. Diurnal, or daily, adjustment of the deterministic predictable part of the intensity variation via piecewise polynomial splines is discussed. Data from the symbol SPY on three different electronic markets is used to estimate model parameters and generate illustrative plots. The parameters were estimated without diurnal adjustments; a repeat of the analysis with adjustments is due in a future version of this article. The connection of the Hawkes process to quantum theory is briefly mentioned. Prediction of the next point of a Hawkes process is briefly discussed and a closed-form expression in terms of the Lambert W function for the standard exponential kernel with P=1 is calculated.
Category: Statistics
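
The closed-form log-likelihood and the recursion for exponential kernels mentioned in the abstract above can be sketched for the simplest case: a univariate Hawkes process with a single exponential term. The parametrisation lambda(t) = mu + alpha * sum over earlier events of exp(-beta * (t - t_i)) is a common convention and is assumed here; the paper's own notation and scaling may differ, and the event times in the example are made up.

```python
import math


def hawkes_exp_loglik(times, T, mu, alpha, beta):
    """Log-likelihood of sorted event times in [0, T] for the intensity
    lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i)).
    """
    loglik = -mu * T  # baseline part of the integrated intensity
    A = 0.0           # A_i = sum_{j < i} exp(-beta * (t_i - t_j))
    prev = None
    for t in times:
        if prev is not None:
            # Recursion: A_i = exp(-beta * (t_i - t_{i-1})) * (1 + A_{i-1})
            A = math.exp(-beta * (t - prev)) * (1.0 + A)
        loglik += math.log(mu + alpha * A)
        # Contribution of this event's kernel to the integrated intensity.
        loglik -= (alpha / beta) * (1.0 - math.exp(-beta * (T - t)))
        prev = t
    return loglik


# Made-up event times on [0, 4].
events = [0.5, 1.2, 3.0]
print(hawkes_exp_loglik(events, T=4.0, mu=0.8, alpha=0.5, beta=1.5))
```

The recursion keeps the cost linear in the number of events instead of quadratic. With alpha = 0 the expression collapses to the homogeneous Poisson log-likelihood n * log(mu) - mu * T, which makes a convenient sanity check.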

[13] viXra:1211.0094 [pdf] replaced on 2013-01-12 16:33:02

Point Process Models for Multivariate High-Frequency Irregularly Spaced Data

Authors: Stephen Crowley
Comments: 34 Pages.

Definitions from the theory of point processes are recalled. Models of intensity function parametrization and maximum likelihood estimation from data are explored. Closed-form log-likelihood expressions are given for the (exponential) Hawkes (univariate and multivariate) process, Autoregressive Conditional Duration (ACD), with both exponential and Weibull distributed errors, and a hybrid model combining the ACD and the exponential Hawkes models. Formulas are also derived, however without the elegant recursions of the exponential kernels, for kernels of the Weibull and Gamma type and comparison of the Weibull fit vs exponential kernel fits via QQ and probability plots are provided. The additional complexity of the Hawkes-Weibull or the ACD-Hawkes appears to not be worth the tradeoff. Diurnal, or daily, adjustment of the deterministic predictable part of the intensity variation via piecewise polynomial splines is discussed. Data from the symbol SPY on three different electronic markets is used to estimate model parameters and generate illustrative plots. The connection of the Hawkes process to quantum theory is briefly mentioned.
Category: Statistics

[12] viXra:1211.0094 [pdf] replaced on 2012-12-31 16:05:15

Point Process Models for Multivariate High-Frequency Irregularly Spaced Data

Authors: Stephen Crowley
Comments: 23 Pages.

Definitions from the theory of point processes are recalled. Models of intensity function parametrization and maximum likelihood estimation from data are explored. Closed-form log-likelihood expressions are given for the Hawkes (univariate and multivariate) process, Autoregressive Conditional Duration (ACD), with both exponential and Weibull distributed errors, and a hybrid model combining the ACD and the Hawkes models. Diurnal, or daily, adjustment of the deterministic predictable part of the intensity variation via piecewise polynomial splines is discussed. Data from the symbol SPY on three different electronic markets is used to estimate model parameters and generate illustrative plots. The parameters were estimated without diurnal adjustments; a repeat of the analysis with adjustments is due in a future version of this article. The connection of the Hawkes process to quantum theory is briefly mentioned. The Hawkes process with a Weibull kernel is also briefly mentioned and will be explored more in the future.
Category: Statistics

[11] viXra:1211.0094 [pdf] replaced on 2012-12-12 10:24:21

Point Process Models for Multivariate High-Frequency Irregularly Spaced Data

Authors: Stephen Crowley
Comments: 19 Pages.

Definitions from the theory of point processes are recalled. Models of intensity function parameterization and maximum likelihood estimation from data are explored. Closed-form log-likelihood expressions are given for the Hawkes (univariate and multivariate) process, the Autoregressive Conditional Duration (ACD) model, and a hybrid model combining the ACD and the Hawkes models. Diurnal, or daily, adjustment of the deterministic predictable part of the intensity variation via piecewise polynomial splines is discussed. Data from the symbol SPY on three different electronic markets is used to estimate model parameters and generate illustrative plots. The parameters were estimated without diurnal adjustments; a repeat of the analysis with adjustments is due in a future version of this article. The connection of the Hawkes process to quantum theory is briefly mentioned.
Category: Statistics

[10] viXra:1211.0094 [pdf] replaced on 2012-11-29 12:25:23

Point Process Models for Multivariate High-Frequency Irregularly Spaced Data

Authors: Stephen Crowley
Comments: 16 Pages.

Definitions from the theory of point processes are recalled. Models of intensity function parametrization and maximum likelihood estimation from data are explored. Closed-form log-likelihood expressions are given for the Hawkes (univariate and multivariate) process, the Autoregressive Conditional Duration (ACD) model, and a hybrid model combining the ACD and the Hawkes models. Data from the symbol SPY on three different electronic markets is used to estimate model parameters and generate illustrative plots.
Category: Statistics

[9] viXra:1211.0094 [pdf] replaced on 2012-11-22 14:48:59

Point Process Models for Multivariate High-Frequency Irregularly Spaced Data

Authors: Stephen Crowley
Comments: 13 Pages.

Definitions from the theory of point processes are recalled. Models of intensity function parameterization and maximum likelihood estimation from data are explored. Closed-form log-likelihood expressions are given for the Hawkes (unidimensional and multidimensional) process, Autoregressive Conditional Duration (ACD), and Log-ACD models. The Autoregressive Conditional Intensity model is also discussed. Data from the symbol SPY on the Nasdaq stock market on Oct 22nd, 2012 is used to estimate model parameters and generate illustrative plots.
Category: Statistics

[8] viXra:1211.0094 [pdf] replaced on 2012-11-19 18:25:11

Point Process Models for Multivariate High-Frequency Irregularly Spaced Data

Authors: Stephen Crowley
Comments: 8 Pages.

Definitions from the theory of point processes are recalled. Models of intensity function parameterization and maximum likelihood estimation from data are explored. Closed-form log-likelihood expressions are given for the Hawkes process, Autoregressive Conditional Duration (ACD), and Log-ACD models. The Autoregressive Conditional Intensity model is also discussed. Data from the symbol SPY on the Nasdaq stock market on Oct 22nd, 2012 is used to estimate model parameters and generate illustrative plots.
Category: Statistics

[7] viXra:1111.0073 [pdf] replaced on 2012-06-15 13:36:51

Kullback-Leibler Simplex

Authors: Popon Kangpenkae
Comments: 12 Pages.

This technical reference presents the functional structure and the algorithmic implementation of KL (Kullback-Leibler) simplex. It details the simplex approximation and fusion. The KL simplex is a fundamental, robust, and adaptive informatics agent for computational research in economics, finance, game and mechanism. From this perspective the study provides comprehensive results to facilitate future work in such areas.
Category: Statistics

[6] viXra:1111.0073 [pdf] replaced on 2011-11-25 09:15:23

Kullback-Leibler Simplex

Authors: Popon Kangpenkae
Comments: 12 Pages.

This technical reference presents the functional structure and the algorithmic implementation of KL (Kullback-Leibler) simplex. It details the simplex approximation and fusion. The KL simplex is a fundamental, robust, and adaptive informatics agent for computational research in economics, finance, game and mechanism. From this perspective the study provides comprehensive results to facilitate future work in such areas.
Category: Statistics

[5] viXra:1111.0073 [pdf] replaced on 25 Nov 2011

Kullback-Leibler Simplex

Authors: Popon Kangpenkae
Comments: 12 pages

This technical reference presents the functional structure and the algorithmic implementation of KL (Kullback-Leibler) simplex. It details the simplex approximation and fusion. The KL simplex is a fundamental, robust, and adaptive informatics agent for computational research in economics, finance, game and mechanism. From this perspective the study provides comprehensive results to facilitate future work in such areas.
Category: Statistics

[4] viXra:1007.0034 [pdf] replaced on 2012-01-11 19:14:18

On the Gini Mean Difference Test for Circular Data

Authors: David D. Tung, S. Rao Jammalamadaka
Comments: 14 Pages.

In this paper, we propose a new test of uniformity on the circle based on the Gini mean difference of the sample arc-lengths. These sample arc-lengths, which are the gaps between successive observations on the circumference of the circle, are analogous to sample spacings on the real line. The Gini mean difference, which compares these arc-lengths between themselves, is analogous to Rao's spacings statistic, which has been used to test the uniformity of circular data. We obtain both the exact and asymptotic distributions of the Gini mean difference arc-lengths test, under the null hypothesis of circular uniformity. We also provide a table of upper percentile values of the exact distribution for small to moderate sample sizes. Illustrative examples in circular data analysis are also given. It is shown that a generalized Gini mean difference test has better asymptotic efficiency than the corresponding generalized Rao's test in the sense of Pitman asymptotic relative efficiency.
Category: Statistics
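
To make the statistic described in the abstract above concrete, here is a minimal sketch that computes the sample arc-lengths of circular data and their Gini mean difference. The function names and the normalization (the average of |D_i - D_j| over all unordered pairs, with arc-lengths measured in radians) are assumptions for this example; the paper's test statistic may be scaled differently.

```python
import math
from itertools import combinations


def circular_arc_lengths(angles):
    """Gaps between successive observations on the circle, in radians.

    `angles` are in radians; they are reduced modulo 2*pi and sorted first.
    """
    two_pi = 2.0 * math.pi
    ordered = sorted(a % two_pi for a in angles)
    arcs = [b - a for a, b in zip(ordered, ordered[1:])]
    # Closing arc from the last observation back around to the first.
    arcs.append(two_pi - ordered[-1] + ordered[0])
    return arcs


def gini_mean_difference(values):
    """Average absolute difference over all pairs (needs at least 2 values)."""
    n = len(values)
    total = sum(abs(x - y) for x, y in combinations(values, 2))
    return 2.0 * total / (n * (n - 1))


# Equally spaced points give identical arcs, so the statistic is zero;
# clustered points give unequal arcs and a larger value.
even = gini_mean_difference(circular_arc_lengths([0.0, math.pi / 2, math.pi, 3 * math.pi / 2]))
clustered = gini_mean_difference(circular_arc_lengths([0.0, 0.1, 0.2, 0.3]))
```

Large values of the statistic indicate arc-lengths that are far from equal, which is evidence against circular uniformity.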

[3] viXra:1007.0034 [pdf] replaced on 16 Aug 2010

On the Gini Mean Difference Arc-Lengths Test for Circular Data

Authors: David D. Tung, S. Rao Jammalamadaka
Comments: 14 pages.

In this paper, we propose a new test of uniformity on the circle based on the Gini mean difference of the sample arc-lengths. These sample arc-lengths, which are the gaps between successive observations on the circumference of the circle, are analogous to sample spacings on the real line. The Gini mean difference, which compares these arc-lengths between themselves, is analogous to Rao's spacings statistic, which has been used to test the uniformity of circular data. We obtain both the exact and asymptotic distributions of the Gini mean difference arc-lengths test, under the null hypothesis of circular uniformity. We also provide a table of upper percentile values of the exact distribution for small to moderate sample sizes. Illustrative examples in circular data analysis are also given. It is shown that a generalized Gini mean difference test has better asymptotic efficiency than the corresponding generalized Rao's test in the sense of Pitman asymptotic relative efficiency.
Category: Statistics

[2] viXra:1006.0046 [pdf] replaced on 2012-01-11 19:16:15

U-Statistics Based on Spacings

Authors: David D. Tung, S. Rao Jammalamadaka
Comments: 23 Pages.

In this paper, we investigate the asymptotic theory for U-statistics based on sample spacings, i.e. the gaps between successive observations. The usual asymptotic theory for U-statistics does not apply here because spacings are dependent variables. However, under the null hypothesis, the uniform spacings can be expressed as conditionally independent Exponential random variables. We exploit this idea to derive the relevant asymptotic theory both under the null hypothesis and under a sequence of close alternatives. The generalized Gini mean difference of the sample spacings is a prime example of a U-statistic of this type. We show that such a Gini spacings test is analogous to Rao's spacings test. We find the asymptotically locally most powerful test in this class, and it has the same efficacy as the Greenwood statistic.
Category: Statistics
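
As a small illustration of the objects named in the abstract above, the sketch below computes the spacings of a sample on the unit interval, a generic two-argument U-statistic of those spacings, and the Greenwood statistic (the sum of squared spacings). The function names, the use of the unit interval with endpoints 0 and 1, and the choice of kernel |x - y| are assumptions for this example, not definitions taken from the paper.

```python
from itertools import combinations


def uniform_spacings(sample):
    """Gaps between successive order statistics of a sample in [0, 1].

    The endpoints 0 and 1 are included, so n observations give n + 1 spacings.
    """
    points = [0.0] + sorted(sample) + [1.0]
    return [b - a for a, b in zip(points, points[1:])]


def u_statistic(values, kernel):
    """Average of a symmetric two-argument kernel over all unordered pairs."""
    pairs = list(combinations(values, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)


def greenwood_statistic(spacings):
    """Sum of squared spacings."""
    return sum(d * d for d in spacings)


spacings = uniform_spacings([0.25, 0.5, 0.75])
gini = u_statistic(spacings, lambda x, y: abs(x - y))  # Gini mean difference
greenwood = greenwood_statistic(spacings)
```

Because the spacings sum to one, they are dependent, which is why the abstract notes that the usual asymptotic theory for U-statistics does not apply directly.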

[1] viXra:1006.0046 [pdf] replaced on 16 Aug 2010

U-Statistics Based on Spacings

Authors: David D. Tung, S. Rao Jammalamadaka
Comments: 23 pages.

In this paper, we investigate the asymptotic theory for U-statistics based on sample spacings, i.e. the gaps between successive observations. The usual asymptotic theory for U-statistics does not apply here because spacings are dependent variables. However, under the null hypothesis, the uniform spacings can be expressed as conditionally independent Exponential random variables. We exploit this idea to derive the relevant asymptotic theory both under the null hypothesis and under a sequence of close alternatives. The generalized Gini mean difference of the sample spacings is a prime example of a U-statistic of this type. We show that such a Gini spacings test is analogous to Rao's spacings test. We find the asymptotically locally most powerful test in this class, and it has the same efficacy as the Greenwood statistic.
Category: Statistics