Combining Genetic Similarities Among Known Relatives that Connect to a Pair of Unknown Relatives

Authors: Stephen P. Smith, Cambrian Lopez, Nicole Lam
Smith, Lopez and Lam described how to combine genetic similarities, measured in centimorgans (cM), among declared relatives in an outside pedigree, and how to concentrate those cM values into a single cM measurement for an envoy who represents the outside pedigree. An unknown relative is presumed to be a descendant of the envoy, but has cM values with relatives in the outside pedigree. That prior effort was a univariate analysis, with only one unknown relative who has matches with others in the outside pedigree. The present paper presents a bivariate analysis, in which two sisters have matches with others in the outside pedigree. The cM values are now paired: every DNA-tested member of the pedigree has two cM values, one for each sister. The bivariate analysis makes more efficient use of information than two univariate analyses, one for each sister in turn. This advantage comes with an increase in model complexity, because the model treats three mutually exclusive categories of genes found in the sisters: genes in the first sister but not in the second; genes common to both sisters; and genes in the second sister but not in the first. The model is applied to the inheritance of the cM values in the pedigree. Even though the number of random effects is increased by a factor of three, the number of fixed effects that actually spend two degrees of freedom is unchanged from the univariate analysis. This is on top of the doubling of the number of observations for the bivariate analysis compared to one univariate analysis.
Combining Genetic Similarities Among Known Relatives that Connect to an Unknown Relative

Authors: Stephen P. Smith, Cambrian Lopez, Nicole Lam
Various DNA testing companies promise their customers a collection of genetic matches to facilitate finding family members. The matches are measured in centimorgans (cM), where the higher the cM value, the closer the relationship to a customer (R). Unless the relationship is close, such as parent-offspring or 1st cousins, a single cM value is not very informative if the goal is to locate family. This paper describes a statistical method that combines a collection of cM values from a cluster of unknown relatives of R, where the relationships among the cluster members themselves are known, being for example 2nd and 3rd cousins. A presumed envoy is attached to the cluster, where R is a descendant of the envoy, and the various cM values are combined to provide an overall cM value between R and the envoy. The envoy’s cM comes with a statistical error to judge significance. Unlike a single cM value for a typical unknown relative, the envoy’s cM can be quite large and indicative of a real genetic path to R that was previously undiscovered. This paper describes the method for two sisters, where the path from the envoy led to their lost father, a father who was later discovered.
