This function reads SNP genotypes from a csv file, combines them with individual and locus metrics, and creates a genlight object.
Usage
gl.read.csv(
filename,
transpose = FALSE,
ind.metafile = NULL,
loc.metafile = NULL,
verbose = NULL
)
Arguments
- filename
Name of the csv file containing the SNP genotypes [required].
- transpose
If TRUE, rows are loci and columns are individuals [default FALSE].
- ind.metafile
Name of the csv file containing the metrics for individuals [optional].
- loc.metafile
Name of the csv file containing the metrics for loci [optional].
- verbose
Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2 or as specified using gl.set.verbosity].
Details
The SNP data need to be in one of two forms. SNPs can be coded 0 for homozygous reference, 1 for heterozygous, 2 for homozygous alternate, and NA for missing values; or they can be coded as allele pairs such as A/A, A/C, C/T or G/A, with -/- for missing data. In the second form, the reference allele is taken to be the most frequent allele, as used by DArT. Other formats will throw an error.
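The relationship between the two codings can be illustrated with a minimal sketch. The helper recode_locus below is hypothetical and is not part of dartR; it only assumes what the paragraph above states (the most frequent allele is the reference, -/- is missing), and it breaks ties between equally frequent alleles alphabetically, which is an arbitrary choice made for this example.

```r
# Hypothetical helper, not part of dartR: recode one locus of "A/C"-style
# genotypes as counts of the non-reference allele (0, 1, 2 or NA).
recode_locus <- function(genotypes) {
  # "-/-" marks a missing call
  genotypes[genotypes == "-/-"] <- NA
  alleles <- strsplit(genotypes, "/", fixed = TRUE)
  # Tally alleles over the non-missing calls; the most frequent one is
  # treated as the reference (ties resolved alphabetically by table())
  counts <- table(unlist(alleles[!is.na(genotypes)]))
  ref <- names(counts)[which.max(counts)]
  # Count the non-reference alleles carried by each individual
  vapply(alleles, function(a) {
    if (anyNA(a)) NA_integer_ else sum(a != ref)
  }, integer(1))
}

recode_locus(c("A/A", "A/C", "C/C", "A/A", "-/-"))
#> [1]  0  1  2  0 NA
```

Here A occurs five times and C three times among the called genotypes, so A is treated as the reference allele.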
The SNP data need to have individuals as rows and loci as columns, with both rows and columns labeled. If the orientation is reversed, with individuals as columns and loci as rows, then set transpose=TRUE.
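The two orientations can be pictured with a toy matrix in base R. This is only an illustration of the layout that transpose=TRUE is meant to handle, not a description of how gl.read.csv works internally; the matrix and its labels are made up for the example.

```r
# Toy genotype matrix with loci as rows and individuals as columns,
# i.e. the orientation that requires transpose=TRUE
snps <- matrix(c(0, 1, 2,
                 NA, 1, 0),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("loc1", "loc2"), c("indA", "indB", "indC")))

# Transposing gives the default orientation:
# individuals as rows, loci as columns
snps_default <- t(snps)
rownames(snps_default)
colnames(snps_default)
```

After transposition the row labels are the individual names and the column labels are the locus names, which is the layout expected when transpose is left at FALSE.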
The individual metrics need to be in a csv file, with headings, with a mandatory id column corresponding exactly to the individual identity labels provided with the SNP data and in the same order.
The locus metadata needs to be in a csv file with headings, with a mandatory column headed AlleleID corresponding exactly to the locus identity labels provided with the SNP data and in the same order.
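Because both metadata files must match the SNP labels exactly and in the same order, it can help to check this before calling gl.read.csv. The following is a minimal sketch in base R; labels_match is a hypothetical helper, not a dartR function, and the data frame is made up for the example.

```r
# Hypothetical pre-check, not part of dartR: do the metadata labels match
# the SNP labels exactly and in the same order?
labels_match <- function(snp_labels, meta_labels) {
  identical(as.character(snp_labels), as.character(meta_labels))
}

snp_ids <- c("T158", "T306", "T305")
# Toy individual metadata with the same ids in a different order
meta <- data.frame(id = c("T158", "T305", "T306"),
                   pop = c("a", "b", "c"))

labels_match(snp_ids, meta$id)

# When both sets contain the same labels, the metadata rows can be
# reordered to follow the SNP data
meta_sorted <- meta[match(snp_ids, meta$id), ]
labels_match(snp_ids, meta_sorted$id)
```

The first check fails because the order differs; after reordering with match, the second check passes. The same check applies to the AlleleID column of the locus metadata.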
Note that the locus metadata will be complemented by calculable statistics corresponding to those that would be provided by Diversity Arrays Technology (e.g. CallRate).
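As an illustration of what a calculable locus statistic looks like, CallRate can be thought of as the proportion of individuals with a non-missing genotype at each locus. The sketch below computes that proportion in base R on a made-up matrix; it is an assumption-based illustration of the statistic and not the code dartR uses.

```r
# Toy genotype matrix: individuals as rows, loci as columns, NA = missing
geno <- matrix(c(0, 1, NA,
                 2, NA, NA,
                 1, 1, 0,
                 0, 2, 1),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("ind", 1:4), paste0("loc", 1:3)))

# Proportion of non-missing calls per locus
call_rate <- colMeans(!is.na(geno))
call_rate
#> loc1 loc2 loc3 
#> 1.00 0.75 0.50
```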
Author
Custodian: Luis Mijangos – Post to https://groups.google.com/d/forum/dartr
Examples
csv_file <- system.file('extdata','platy_test.csv', package='dartR')
ind_metadata <- system.file('extdata','platy_ind.csv', package='dartR')
gl <- gl.read.csv(filename = csv_file, ind.metafile = ind_metadata)
#> Starting gl.read.csv
#> Warning: Locus metafile not provided, locus metrics will be
#> calculated where this is possible
#> Input data should be a csv file with individuals as rows, loci as columns
#> 6 loci, confirming first 5: loci1 loci2 loci3 loci4 loci5
#> 13 individuals, confirming first 5: T158 T306 T305 T148 T149
#> If these are reversed, re-run the script with transpose=TRUE
#> Character data detected, assume genotypes are of the
#> form C/C, A/T, C/G, -/- etc
#> Data confirmed as biallelic
#> SNP coding converted to 0, 1, 2 and NA
#> Starting gl.compliance.check
#> Processing genlight object with SNP data
#> The slot loc.all, which stores allele name for each locus, is empty.
#> Creating a dummy variable (A/C) to insert in this slot.
#> Checking coding of SNPs
#> SNP data scored NA, 0, 1 or 2 confirmed
#> Checking locus metrics and flags
#> Recalculating locus metrics
#> Checking for monomorphic loci
#> Dataset contains monomorphic loci
#> Checking for loci with all missing data
#> No loci with all missing data detected
#> Checking whether individual names are unique.
#> Checking for individual metrics
#> Warning: Creating a slot for individual metrics
#> Checking for population assignments
#> Population assignments confirmed
#> Spelling of coordinates checked and changed if necessary to
#> lat/lon
#> Completed: gl.compliance.check
#> Added or updated array(NA, nLoc(x)) to the other$ind.metrics slot.
#> Added or updated AvgPIC to the other$ind.metrics slot.
#> Added or updated OneRatioRef to the other$ind.metrics slot.
#> Added or updated OneRatioSnp to the other$ind.metrics slot.
#> Added or updated PICRef to the other$ind.metrics slot.
#> Added or updated PICSnp to the other$ind.metrics slot.
#> Added or updated CallRate to the other$ind.metrics slot.
#> Added or updated FreqHomRef to the other$ind.metrics slot.
#> Added or updated FreqHomSnp to the other$ind.metrics slot.
#> Added or updated FreqHets to the other$ind.metrics slot.
#> Added or updated monomorphs to the other$ind.metrics slot.
#> Added or updated maf to the other$ind.metrics slot.
#> Added or updated OneRatio to the other$ind.metrics slot.
#> Added or updated PIC to the other$ind.metrics slot.
#> Added id to the other$ind.metrics slot.
#> Added pop to the other$ind.metrics slot.
#> Added lat to the other$ind.metrics slot.
#> Added long to the other$ind.metrics slot.
#> Added group to the other$ind.metrics slot.
#> Added age to the other$ind.metrics slot.
#> Starting gl.compliance.check
#> Processing genlight object with SNP data
#> Checking coding of SNPs
#> SNP data scored NA, 0, 1 or 2 confirmed
#> Checking locus metrics and flags
#> Recalculating locus metrics
#> Checking for monomorphic loci
#> Dataset contains monomorphic loci
#> Checking for loci with all missing data
#> No loci with all missing data detected
#> Checking whether individual names are unique.
#> Checking for individual metrics
#> Individual metrics confirmed
#> Checking for population assignments
#> Population assignments confirmed
#> Spelling of coordinates checked and changed if necessary to
#> lat/lon
#> Completed: gl.compliance.check
#> Completed: gl.read.csv