This document explains the basic functionality of the Jacquard package which provides functions for the estimation of the nine condensed Jacquard coefficients (Jacquard (1974)) for biallelic genetic marker data using a constrained least squares approach (Graffelman, Weir, and Goudet (2024)).
Outline:
We illustrate the estimation of the Jacquard coefficients using
(0,1,2) coded genotype data available in the data set
SimulatedPedigree
, of which we show a small part
data(SimulatedPedigree)
SimulatedPedigree[1:5,1:10]
#> Family ID Father Mother Sex SNP00001 SNP00002 SNP00003 SNP00004 SNP00005
#> ID001 1 1 0 0 1 0 0 0 0 1
#> ID002 1 2 0 0 1 1 2 0 0 0
#> ID003 1 3 0 0 1 0 0 0 0 0
#> ID004 1 4 0 0 1 0 1 0 0 0
#> ID005 1 5 0 0 1 0 1 1 0 0
This matrix contains 111 individuals, spanning seven generations, in its rows and pedigree information plus genotype data of 20,000 SNPs in its columns. We first separate the pedigree information from the genotype information.
We next determine the nine joint genotype counts for all pairs of
individuals, storing these counts in a list object with nine lower
triangular matrices, using function JointGenotypeCounts
.
For efficiency, we here load the precalculated joint counts.
#GTC <- JointGenotypeCounts(Xgen)
data(GTC)
names(GTC)
#> [1] "f0000" "f1111" "f1101" "f0111" "f0101" "f1100" "f0011" "f0100" "f0001"
E.g., the counts of the major homozygote pairs for the first five individuals are
GTC[[1]][1:5,1:5]
#> ID001 ID002 ID003 ID004 ID005
#> ID001 16466 0 0 0 0
#> ID002 13987 16507 0 0 0
#> ID003 13982 13965 16440 0 0
#> ID004 13926 13951 13903 16449 0
#> ID005 14000 14063 13977 13979 16556
We create a vector with the minor allele frequency of each SNP.
mafvec <- mafvector(Xgen)
mafvec[1:5]
#> SNP00001 SNP00002 SNP00003 SNP00004 SNP00005
#> 0.004504505 0.063063063 0.045045045 0.000000000 0.013513514
We proceed to estimate the Jacquard coefficients by constrained least
squares with function Jacquard.cls
. For the sake of
illustration, we take the first three founders of the pedigree, and
subset their joint genotype counts
ii <- 1:3
Xped[ii,]
#> Family ID Father Mother Sex
#> ID001 1 1 0 0 1
#> ID002 1 2 0 0 1
#> ID003 1 3 0 0 1
GTCsubset <- list(length = 9)
for (k in 1:9) {
GTCsubset[[k]] <- matrix(numeric(3^2), ncol = 3)
GTCsubset[[k]] <- GTC[[k]][ii,ii]
}
We set random initial values for the Jacquard coefficients
And estimate the pairwise Jacquard coefficients of the three pairs.
output <- Jacquard.cls(GTCsubset,mafvec=mafvec,
eps=1e-06,
delta.init=delta.init)
#> 3 pairs 20000 SNPs
#> Processing 1 out of 3
#> Processing 2 out of 3
#> Processing 3 out of 3
Delta.cls <- output$delta
Convergence of the solver can be checked by looking at the field
convergence
, where 0 indicates proper convergence.
Particular estimates of Jacquard coefficients can be extracted from
the Delta.cls
list object, which is a list of nine
matrices. E.g., \(\Delta_9\) of the
first pair of individuals (1,2) can be extracted by
All nine estimates of the Jacquard coefficients can be obtained with
function DeltaPair
DeltaPair(Delta.cls,1,2)
#> J1 J2 J3 J4 J5 J6 J7 J8
#> 0.0000000 0.0000000 0.0000000 0.0000001 0.0000000 0.0000001 0.0000000 0.0000000
#> J9
#> 0.9999998
A pairwise list of the nine Jacquard coefficients of each pair can be
obtained with the function PairwiseList
.
PairwiseList(Delta.cls)
#> J1 J2 J3 J4 J5 J6 J7 J8 J9
#> ID001-ID002 0 0 0 0 0 0 0 0 1
#> ID001-ID003 0 0 0 0 0 0 0 0 1
#> ID002-ID003 0 0 0 0 0 0 0 0 1
The full set of all pairwise Jacquard coefficients can be calculated
with the instructions below. This result is also available in the
precalculated data object DeltaSimulatedPedigree
.
#DeltaSimulatedPedigree <- Jacquard.cls(GTC,mafvec=mafvec,
# eps=1e-06,
# delta.init=delta.init)$delta
data(DeltaSimulatedPedigree)
Boxplots of the estimated Jacquard coefficients can be obtained with
function BoxplotDelta
. Separate boxplots are shown for the
diagonals of \(\Delta_1\) and \(\Delta_7\).
There are many estimators for coancestry and inbreeding. Function
CalculateThetaMom
calculates moment estimators for the five
identifiable relatedness parameters defined by Csűrös (2014).
E.g., moment estimators of the kinship coefficients of the first five individuals are given by
KS.mom[[1]][1:5,1:5]
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0.4311849 -0.1443478 -0.1436215 -0.1557858 -0.1494313
#> [2,] -0.1443478 0.4337267 -0.1438031 -0.1526993 -0.1312757
#> [3,] -0.1436215 -0.1438031 0.4270091 -0.1645005 -0.1521547
#> [4,] -0.1557858 -0.1526993 -0.1645005 0.4059486 -0.1546964
#> [5,] -0.1494313 -0.1312757 -0.1521547 -0.1546964 0.4279169