Reports private alleles (and fixed alleles) per pair of populations
This function reports private alleles in one population compared with a second population, for all populations taken pairwise. It also reports a count of fixed allelic differences and the mean absolute allele frequency differences (AFD) between pairs of populations.
gl.report.pa( x, x2 = NULL, method = "pairwise", loc_names = FALSE, plot.out = TRUE, font_plot = 14, map.interactive = FALSE, palette_discrete = discrete_palette, save2tmp = FALSE, verbose = NULL )
x: Name of the genlight object containing the SNP data [required].
x2: If two separate genlight objects are to be compared, the second can be provided here; both objects must have the same number of SNPs [default NULL].
method: Method to calculate private alleles: 'pairwise' comparison, or 'one2rest' to compare each population against the rest [default 'pairwise'].
loc_names: Whether names of loci with private alleles and fixed differences should be reported. If TRUE, locus names are returned in a list [default FALSE].
plot.out: Specify if a Sankey plot is to be produced [default TRUE].
font_plot: Numeric font size in pixels for the node text labels [default 14].
map.interactive: Specify whether an interactive map showing private alleles between populations is to be produced [default FALSE].
palette_discrete: A discrete palette for the color of populations, or a list with as many colors as there are populations in the dataset [default discrete_palette].
save2tmp: If TRUE, saves any ggplots and listings to the session temporary directory (tempdir) [default FALSE].
verbose: Verbosity: 0, silent, fatal errors only; 1, flag function begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2 or as specified using gl.set.verbosity].
A data.frame with one row per pair of populations. Each row gives the number of individuals in each population (N1, N2), the number of loci with fixed differences (fixed; identical in both directions), the number of private alleles in pop1 relative to pop2 (priv1) and vice versa (priv2), and the mean absolute allele frequency difference across loci (AFD). If loc_names = TRUE, the names of loci with private alleles and fixed differences are returned in a list in addition to the data.frame.
Note that the number of private alleles between two populations is not a symmetric dissimilarity measure.
If no x2 is provided, the function uses the pop(gl) hierarchy to determine pairs of populations; otherwise it runs a single comparison between x and x2.
Hint: in case you want to run comparisons between individuals (assuming individual names are unique), you can simply redefine your population names with your individual names, as below:
pop(gl) <- indNames(gl)
Definition of fixed and private alleles
The table below shows the possible cases of allele frequencies between two populations (0 = homozygote for Allele 1, x = both Alleles are present, 1 = homozygote for Allele 2).
p: cases where there is a private allele in pop1 compared to pop2 (but not vice versa)
f: cases where there is a fixed allele in pop1 (and pop2, as those cases are symmetric)
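The two definitions can be sketched in base R on per-locus allele frequencies. This is only an illustration under stated assumptions, not the code used inside gl.report.pa: the names `classify_loci`, `q_pop1` and `q_pop2` are hypothetical, the frequencies are made-up values for the second allele at biallelic SNPs, and missing data are assumed absent. Under the definition used here, a fixed difference also counts as a private allele in each direction.

```r
# Hypothetical frequencies of allele 2 at six biallelic loci (no missing data)
q_pop1 <- c(0.0, 0.3, 1.0, 0.5, 0.0, 1.0)
q_pop2 <- c(0.0, 0.0, 0.0, 1.0, 0.4, 1.0)

classify_loci <- function(q1, q2) {
  stopifnot(length(q1) == length(q2))
  # An allele is private to pop1 if pop1 carries it and pop2 lacks it:
  # either allele 2 (q1 > 0, q2 == 0) or allele 1 (q1 < 1, q2 == 1)
  priv1 <- (q1 > 0 & q2 == 0) | (q1 < 1 & q2 == 1)
  # Same test with the roles of the populations swapped
  priv2 <- (q2 > 0 & q1 == 0) | (q2 < 1 & q1 == 1)
  # Fixed difference: the populations are fixed for opposite alleles
  fixed <- (q1 == 0 & q2 == 1) | (q1 == 1 & q2 == 0)
  data.frame(q1, q2, priv1, priv2, fixed)
}

res <- classify_loci(q_pop1, q_pop2)
counts <- c(fixed = sum(res$fixed),
            priv1 = sum(res$priv1),
            priv2 = sum(res$priv2))
print(counts)
```

With these values the two private-allele counts differ (three loci for pop1, two for pop2), while the fixed-difference count is the same in both directions.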
The absolute allele frequency difference (AFD) reported by this function is a simple differentiation metric with intuitive properties that provides an alternative to FST. For details about its properties and how it is calculated see Berner (2019).
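As a rough illustration of the metric, Berner (2019) defines AFD at a locus as half the sum of absolute allele frequency differences over all alleles; for a biallelic SNP this reduces to the absolute difference in the frequency of either allele. The sketch below averages that quantity across loci. The function name `afd_biallelic` and the input vectors are hypothetical, and how gl.report.pa itself treats missing data is not covered here.

```r
# Mean absolute allele frequency difference across biallelic loci.
# q1, q2: frequencies of the same allele in two populations, one value per locus.
afd_biallelic <- function(q1, q2) {
  stopifnot(length(q1) == length(q2))
  # With two alleles, 0.5 * (|q1 - q2| + |(1 - q1) - (1 - q2)|) = |q1 - q2|
  per_locus <- abs(q1 - q2)
  # Loci with a missing frequency in either population are skipped (assumption)
  mean(per_locus, na.rm = TRUE)
}

# Hypothetical frequencies at four loci
q_pop1 <- c(0.10, 0.50, 1.00, 0.25)
q_pop2 <- c(0.20, 0.50, 0.00, 0.75)

afd <- afd_biallelic(q_pop1, q_pop2)
print(afd)  # approximately 0.4
```

A value of 0 means identical allele frequencies at every locus, and a value of 1 means a fixed difference at every locus.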
The function also reports an estimate of the lower bound of the number of undetected private alleles, based on the Good-Turing frequency formula. The formula was originally developed for cryptography; in an ecological context it estimates the true frequencies of rare species in a single assemblage from an incomplete sample of individuals. The approach is described in Chao et al. (2017), and this function uses their equation 2c. The estimates are reported in the output table as Chao1 and Chao2.
In this function a Sankey diagram is used to visualize patterns of private alleles between populations. The diagram displays flows (private alleles) between nodes (populations). Links are drawn as arcs whose width is proportional to the size of the flow (number of private alleles).
If save2tmp = TRUE, the resultant plot(s) and tabulation(s) are saved to the session's temporary directory.
Berner, D. (2019). Allele frequency difference AFD – an intuitive alternative to FST for quantifying genetic population differentiation. Genes, 10(4), 308.
Chao, A., et al. (2017). Deciphering the enigma of undetected species, phylogenetic, and functional diversity based on Good-Turing theory. Ecology, 98(11), 2914-2929.
Custodian: Bernd Gruber -- Post to https://groups.google.com/d/forum/dartr
out <- gl.report.pa(platypus.gl)
#> Starting gl.report.pa
#> Processing genlight object with SNP data
#> Warning: data include loci that are scored NA across all individuals.
#> Consider filtering using gl <- gl.filter.allna(gl)
#>   p1 p2         pop1         pop2 N1 N2 fixed priv1 priv2 Chao1 Chao2 totalpriv
#> 1  1  2 SEVERN_ABOVE SEVERN_BELOW 23 17     0    93    59    25     2       152
#> 2  1  3 SEVERN_ABOVE  TENTERFIELD 23 41     0    49   137    24    25       186
#> 3  2  3 SEVERN_BELOW  TENTERFIELD 17 41     0    37   159     3    25       196
#>     AFD
#> 1 0.063
#> 2 0.069
#> 3 0.074
#> Table of private alleles and fixed differences returned
#> Completed: gl.report.pa
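The returned table can be inspected with ordinary data.frame operations. The sketch below rebuilds a few columns by hand from the printed output of the example, so that it runs without dartR installed; the object name `pa` is hypothetical. Assuming `out` is a plain data.frame (the case when loc_names = FALSE), the same lines should work on `out` directly.

```r
# Columns copied by hand from the printed example output
pa <- data.frame(
  pop1      = c("SEVERN_ABOVE", "SEVERN_ABOVE", "SEVERN_BELOW"),
  pop2      = c("SEVERN_BELOW", "TENTERFIELD", "TENTERFIELD"),
  priv1     = c(93, 49, 37),
  priv2     = c(59, 137, 159),
  totalpriv = c(152, 186, 196),
  AFD       = c(0.063, 0.069, 0.074)
)

# In this example, totalpriv is the sum of the two directional counts
sums_match <- all(pa$priv1 + pa$priv2 == pa$totalpriv)

# The directional counts differ within each pair (the measure is not symmetric)
asymmetric <- pa$priv1 != pa$priv2

# Pair of populations with the most private alleles overall
top <- pa[which.max(pa$totalpriv), c("pop1", "pop2", "totalpriv")]
print(top)
```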