Visualize patterns of linkage disequilibrium and identify haplotypes
This function plots a linkage disequilibrium (LD) heatmap, where the colour shading indicates the strength of LD. Chromosome positions (Mbp) are shown on the horizontal axis, and haplotypes appear as triangles delimited by dark yellow vertical lines. Numbers identifying each haplotype are shown in the upper part of the plot.
The heatmap also shows heterozygosity for each SNP.
The function identifies haplotypes as runs of contiguous SNPs that are in linkage disequilibrium, using ld_threshold_haplo as the LD threshold and keeping only runs containing more than min_snps SNPs.
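The rule described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helper `call_blocks` and the toy LD values are hypothetical, and it assumes that only the LD between adjacent SNPs is compared with the threshold and that "more than" means strictly greater than `min_snps`.

```r
# Hypothetical helper, not part of dartR: call haplotype blocks from a
# vector of LD values between adjacent SNPs (value i is SNP i vs SNP i + 1).
call_blocks <- function(adjacent_ld, ld_threshold_haplo = 0.5, min_snps = 10) {
  # TRUE where a SNP and its right-hand neighbour reach the LD threshold
  linked <- adjacent_ld >= ld_threshold_haplo
  runs <- rle(linked)
  run_end <- cumsum(runs$lengths)          # last adjacent pair in each run
  run_start <- run_end - runs$lengths + 1  # first adjacent pair in each run
  # A run of k linked pairs spans k + 1 SNPs
  n_snps <- runs$lengths + 1
  # Assumption: a block must contain strictly more than min_snps SNPs
  keep <- runs$values & n_snps > min_snps
  data.frame(
    first_snp = run_start[keep],
    last_snp  = run_end[keep] + 1,
    n_snps    = n_snps[keep]
  )
}

# Toy data: 7 SNPs, so 6 adjacent-pair LD values
adjacent_ld <- c(0.9, 0.8, 0.7, 0.1, 0.6, 0.9)
blocks <- call_blocks(adjacent_ld, ld_threshold_haplo = 0.5, min_snps = 2)
print(blocks)
```

With these toy values the low LD between the fourth and fifth SNPs splits the region into two blocks. Missing LD values are not handled in this sketch.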
gl.ld.haplotype( x, pop_name = NULL, chrom_name = NULL, ld_max_pairwise = 1e+07, maf = 0.05, ld_stat = "R.squared", ind.limit = 10, min_snps = 10, ld_threshold_haplo = 0.5, coordinates = NULL, color_haplo = "viridis", color_het = "deeppink", plot.out = TRUE, save2tmp = FALSE, verbose = NULL )
x: Name of the genlight object containing the SNP data [required].
pop_name: Name of the population to analyse. If NULL, all populations are analysed [default NULL].
chrom_name: Name of the chromosome to analyse. If NULL, all chromosomes are analysed [default NULL].
ld_max_pairwise: Maximum distance in base pairs at which LD should be calculated [default 10000000].
maf: Minor allele frequency (by population) threshold used to filter out loci. A value > 1 is interpreted as a minor allele count (MAC), i.e. the minimum number of times an allele needs to be observed [default 0.05].
ld_stat: The LD measure to be calculated: "LLR", "OR", "Q", "Covar", "D.prime", "R.squared" or "R". See ld (package snpStats) for details [default "R.squared"].
ind.limit: Minimum number of individuals a population must contain for it to be taken into account when reporting loci in LD [default 10].
min_snps: Minimum number of SNPs a haplotype must contain to be called [default 10].
ld_threshold_haplo: Minimum LD between adjacent SNPs to call a haplotype [default 0.5].
coordinates: A vector of two elements with the start and end coordinates, in base pairs, to which the analysis is restricted, e.g. c(1,1000000) [default NULL].
color_haplo: Color palette for the haplotype plot. See details [default "viridis"].
color_het: Color for heterozygosity [default "deeppink"].
plot.out: Specify whether the heatmap plot is to be produced [default TRUE].
save2tmp: If TRUE, saves any ggplots and listings to the session temporary directory (tempdir) [default FALSE].
verbose: Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, unless specified using gl.set.verbosity].
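The dual reading of the maf argument (a frequency when at most 1, a count when above 1) can be sketched as follows. The helper `min_allele_count` is hypothetical and is not how dartR implements the filter; it also assumes diploid individuals, so a population of n_ind individuals carries 2 * n_ind allele copies.

```r
# Hypothetical helper, not dartR code: express the `maf` argument as a
# minimum number of observed copies of the minor allele.
min_allele_count <- function(maf, n_ind) {
  if (maf > 1) {
    # Values above 1 are read directly as a minor allele count (MAC)
    maf
  } else {
    # Otherwise a frequency: scale by the 2 * n_ind allele copies
    # (assumes diploid individuals)
    maf * 2 * n_ind
  }
}

freq_based <- min_allele_count(0.05, n_ind = 20)  # frequency of 40 copies
count_based <- min_allele_count(3, n_ind = 20)    # read as a MAC
print(c(freq_based, count_based))
```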
Each SNP's position should be stored in the genlight accessor "@position" and its chromosome name in the accessor "@chromosome" (see examples). The function then calculates LD within each chromosome.
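To illustrate what restricting LD to within-chromosome pairs and to a maximum distance (ld_max_pairwise) means, here is a small base-R sketch. The helper `pairs_within`, the SNP names and the positions are made up for illustration, and the sketch only enumerates the candidate pairs; it does not compute LD or reproduce the package's internals.

```r
# Toy SNP map; names and positions are made up for illustration
snps <- data.frame(
  id = paste0("snp", 1:5),
  chromosome = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  position = c(100, 5000, 900000, 200, 300)
)

# Hypothetical helper: list SNP pairs that sit on the same chromosome and
# are no further apart than ld_max_pairwise base pairs
pairs_within <- function(snps, ld_max_pairwise) {
  by_chrom <- split(snps, snps$chromosome)
  res <- lapply(by_chrom, function(d) {
    if (nrow(d) < 2) return(NULL)       # a single SNP forms no pair
    idx <- t(combn(nrow(d), 2))         # all pairs of row indices
    dist_bp <- abs(d$position[idx[, 2]] - d$position[idx[, 1]])
    keep <- dist_bp <= ld_max_pairwise
    data.frame(
      chromosome = d$chromosome[idx[keep, 1]],
      snp_a = d$id[idx[keep, 1]],
      snp_b = d$id[idx[keep, 2]],
      dist_bp = dist_bp[keep]
    )
  })
  out <- do.call(rbind, res)
  rownames(out) <- NULL
  out
}

pairs <- pairs_within(snps, ld_max_pairwise = 10000)
print(pairs)
```

Pairs spanning two chromosomes are never formed because the map is split by chromosome first, and the distant third SNP on chr1 drops out because it lies more than 10000 bp from the other two.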
The output of the function includes a table with the haplotypes that were identified and their location.
Colors of the heatmap (color_haplo) are based on the function scale_fill_viridis from package viridis. Available color palette options are "magma", "inferno", "plasma", "viridis", "cividis", "rocket", "mako" and "turbo".
Custodian: Luis Mijangos -- Post to https://groups.google.com/d/forum/dartr
require("dartR.data")
x <- platypus.gl
x <- gl.filter.callrate(x, threshold = 1)
#> Starting gl.filter.callrate
#> Processing genlight object with SNP data
#> Warning: data include loci that are scored NA across all individuals.
#> Consider filtering using gl <- gl.filter.allna(gl)
#> Warning: Data may include monomorphic loci in call rate
#> calculations for filtering
#> Recalculating Call Rate
#> Removing loci based on Call Rate, threshold = 1
#>
#> Completed: gl.filter.callrate
#>
x <- gl.keep.pop(x, pop.list = "TENTERFIELD")
#> Starting gl.keep.pop
#> Processing genlight object with SNP data
#> Checking for presence of nominated populations
#> Retaining only populations TENTERFIELD
#> Warning: Resultant dataset may contain monomorphic loci
#> Locus metrics not recalculated
#> Completed: gl.keep.pop
#>
x$chromosome <- as.factor(x$other$loc.metrics$Chrom_Platypus_Chrom_NCBIv1)
x$position <- x$other$loc.metrics$ChromPos_Platypus_Chrom_NCBIv1
ld_res <- gl.ld.haplotype(x, chrom_name = "NC_041728.1_chromosome_1",
                          ld_max_pairwise = 10000000)
#> Starting gl.ld.haplotype
#> Processing genlight object with SNP data
#> Calculating pairwise LD in population TENTERFIELD
#> Analysing chromosome NC_041728.1_chromosome_1
#> The maximum distance at which LD should be calculated
#> (ld_max_pairwise) is too short for chromosome NC_041728.1_chromosome_1 . Setting this distance to 33814744 bp
#>
#> Regions defined for each Polygons
#> No haplotypes were identified for chromosome NC_041728.1_chromosome_1
#>
#> NULL
#>  population chromosome haplotype start
#>  end start_ld_plot end_ld_plot midpoint
#>  midpoint_ld_plot labels
#> <0 rows> (or 0-length row.names)
#> Completed: gl.ld.haplotype
#>