A utility script to calculate the number of variant and invariant sites by locus
Source: R/utils.n.var.invariant.r
Calculate the number of variant and invariant sites by locus and add them as
columns in loc.metrics. This can be useful for further filtering, for example
where only loci with secondaries are wanted for phylogenetic analyses.
Details
Invariant sites are the nucleotide positions that are not polymorphic. When the
locus metadata supplied by DArT includes the sequence of the allele
(TrimmedSequence), this function uses it to estimate the number of sites that
were sequenced in each tag (read), and then subtracts the number of polymorphic
sites. The length of the trimmed sequence (lenTrimSeq) and the number of
variant (n.variant) and invariant (n.invariant) sites are then added to the
table in gl@other$loc.metrics.
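The arithmetic described above can be sketched in base R on a toy table. This is only an illustration under stated assumptions, not the function's actual implementation: the CloneID column, the toy sequences, and the idea that rows sharing a CloneID are SNPs on the same tag are all assumptions made for the example.

```r
# Toy locus metadata (made-up values). CloneID is an assumed tag identifier:
# rows sharing a CloneID are treated as SNPs found on the same sequenced tag.
loc.metrics <- data.frame(
  CloneID = c("100", "100", "101", "102"),
  TrimmedSequence = c("ACGTACGTACGTACGT",
                      "ACGTACGTACGTACGT",
                      "ACGTACGTAC",
                      "ACGTACGTACGTACGTACGTACGT"),
  stringsAsFactors = FALSE
)

# Number of sites sequenced in each tag
loc.metrics$lenTrimSeq <- nchar(loc.metrics$TrimmedSequence)

# Number of polymorphic sites per tag (one row per SNP, counted by CloneID)
n.by.tag <- table(loc.metrics$CloneID)
loc.metrics$n.variant <- as.integer(n.by.tag[loc.metrics$CloneID])

# Invariant sites = sequenced sites minus polymorphic sites
loc.metrics$n.invariant <- loc.metrics$lenTrimSeq - loc.metrics$n.variant

print(loc.metrics[, c("CloneID", "lenTrimSeq", "n.variant", "n.invariant")])
```

In this toy table the tag with two SNPs and a 16-base trimmed sequence ends up with 14 invariant sites.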
NOTE: This function correctly estimates the number of variant and invariant
sites only when it is executed on genlight objects before secondaries are
removed.
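A small base R sketch may help show why the order matters. It assumes, for illustration only, that variants are counted as the number of SNP rows sharing a tag identifier (a hypothetical CloneID column); the helper count_variants is likewise made up for this example.

```r
# Hypothetical table with one row per SNP; tag "100" carries two SNPs.
snps <- data.frame(CloneID = c("100", "100", "101"),
                   stringsAsFactors = FALSE)

# Illustrative helper: number of SNP rows per tag, repeated for each row
count_variants <- function(d) {
  as.integer(table(d$CloneID)[d$CloneID])
}

# Counting before secondaries are removed sees both SNPs on tag "100"
before <- count_variants(snps)

# Removing secondaries first (keeping one SNP per tag) hides the second SNP
no.secondaries <- snps[!duplicated(snps$CloneID), , drop = FALSE]
after <- count_variants(no.secondaries)

print(before)
print(after)
```

Under this assumption, every tag appears to carry a single variant once secondaries are gone, so n.variant is underestimated and n.invariant overestimated.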
Author
Carlo Pacioni (Post to https://groups.google.com/d/forum/dartr)
Examples
require("dartR.data")
out <- utils.n.var.invariant(platypus.gl)
#> Starting utils.n.var.invariant
#> Processing genlight object with SNP data
#> Warning: data include loci that are scored NA across all individuals.
#> Consider filtering using gl <- gl.filter.allna(gl)
#> Calculating n invariant sites
#> Completed: utils.n.var.invariant
#>