Imports presence/absence data from SilicoDArT to genlight {adegenet} format (ploidy=1)
Source: R/gl.read.silicodart.r
gl.read.silicodart.Rd
DArT provides the data as a matrix with entities (individual animals) across the top and attributes (presence/absence of each sequenced fragment) down the side, in a format that is unique to DArT. This function reads the data into {adegenet} genlight format for consistency with other programming activity. The script may require modification, as DArT changes its data formats from time to time.
Usage
gl.read.silicodart(
filename,
ind.metafile = NULL,
nas = "-",
topskip = NULL,
lastmetric = "Reproducibility",
probar = TRUE,
verbose = NULL
)
Arguments
- filename
Name of csv file containing the SilicoDArT data [required].
- ind.metafile
Name of csv file containing metadata assigned to each entity (individual) [default NULL].
- nas
Missing data character [default '-'].
- topskip
Number of rows to skip before the header row (containing the specimen identities) [optional].
- lastmetric
Name of the last non-genetic (locus metadata) column. Check that this is correct for your file, otherwise the number of individuals will not match. The column can also be specified by its number [default "Reproducibility"].
- probar
Show progress bar [default TRUE].
- verbose
Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, or as set by gl.set.verbose()].
Value
An object of class genlight with ploidy set to 1, containing the presence/absence data, and locus and individual metadata.
Details
gl.read.silicodart() opens the data file (comma-delimited csv) and skips the first n = topskip lines. The script assumes that the next line contains the entity labels (specimen ids), followed immediately by the presence/absence data for the first locus.
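The skipping step can be pictured with a minimal base R sketch. It is not the package's implementation: the inline file contents, the assumption that preamble rows start with "*", and the helper guess_topskip() are all hypothetical and used only for illustration.

```r
# Tiny stand-in for a SilicoDArT file (invented data): two preamble rows,
# then the header row with the specimen ids, then one row per locus.
# Assumption: preamble rows are flagged with "*" in the first field.
csv_lines <- c(
  "*,*,*,popA,popA,popB",
  "*,*,*,plate1,plate1,plate1",
  "CloneID,PIC,Reproducibility,ind1,ind2,ind3",
  "100001,0.45,1.00,1,0,-",
  "100002,0.30,0.98,0,1,1"
)

# Hypothetical helper: count the rows that precede the header row.
guess_topskip <- function(lines) {
  first_field <- sub(",.*$", "", lines)
  is_preamble <- first_field == "*"
  if (all(is_preamble)) length(lines) else which(!is_preamble)[1] - 1
}

topskip <- guess_topskip(csv_lines)

# Skip the preamble; the next row becomes the header. "-" is read as missing.
dat <- read.csv(text = csv_lines, skip = topskip, na.strings = "-",
                check.names = FALSE, stringsAsFactors = FALSE)
```

If the guessed value does not land on the row holding the specimen ids, passing topskip explicitly is the safer option.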
It reads the presence/absence data into a matrix of 1s and 0s, and inputs the locus metadata and specimen metadata. The locus metadata comprises a series of columns of values for each locus, including the essential column CloneID and the desirable variables Reproducibility and PIC. Refer to the documentation provided by DArT for an explanation of these columns.
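As a rough illustration of how the lastmetric column separates locus metadata from the presence/absence calls, the following base R sketch splits a small invented table. The data and the helper resolve_lastmetric() are hypothetical; the function's actual internals may differ.

```r
# Invented data with the preamble already removed: three locus metadata
# columns followed by one column per individual.
csv_lines <- c(
  "CloneID,PIC,Reproducibility,ind1,ind2,ind3",
  "100001,0.45,1.00,1,0,-",
  "100002,0.30,0.98,0,1,1"
)
dat <- read.csv(text = csv_lines, na.strings = "-", check.names = FALSE)

# Hypothetical helper: accept lastmetric as a column name or a column number.
resolve_lastmetric <- function(dat, lastmetric) {
  if (is.numeric(lastmetric)) return(as.integer(lastmetric))
  idx <- match(lastmetric, names(dat))
  if (is.na(idx)) stop("Column '", lastmetric, "' not found in the header row")
  idx
}

last <- resolve_lastmetric(dat, "Reproducibility")

# Columns up to and including lastmetric are locus metadata.
loc_metrics <- dat[, seq_len(last), drop = FALSE]

# The remaining columns are individuals. Transpose so that individuals are
# rows and loci are columns, and force locus names to be unique.
pa <- t(as.matrix(dat[, (last + 1):ncol(dat), drop = FALSE]))
colnames(pa) <- make.unique(as.character(loc_metrics$CloneID))
```

If lastmetric points at the wrong column, metadata columns end up counted as individuals (or the reverse), which is why the individual count is worth checking after import.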
The specimen metadata provides the opportunity to reassign specimens to populations, and to add other data relevant to the specimen. The key variables are id (specimen identity, which must be unique and must match the ids in the SilicoDArT file, in the same order), pop (population assignment), lat (latitude, optional) and lon (longitude, optional). id, pop, lat and lon are the column headers in the csv file. Other optional columns can be added.
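Before importing, it can help to check a metadata file against these requirements. The sketch below uses base R only; the specimen ids, coordinates and the extra sex column are invented for illustration.

```r
# Specimen ids as they appear in the header row of the SilicoDArT file
# (invented for this example).
silico_ids <- c("ind1", "ind2", "ind3")

# A small metadata file with the key columns plus one optional extra column.
meta_lines <- c(
  "id,pop,lat,lon,sex",
  "ind1,popA,-35.28,149.13,female",
  "ind2,popA,-35.30,149.10,male",
  "ind3,popB,-33.87,151.21,female"
)
ind_metrics <- read.csv(text = meta_lines, stringsAsFactors = FALSE)

# The id and pop columns must be present; lat and lon are optional.
missing_cols <- setdiff(c("id", "pop"), names(ind_metrics))

# Each id must be unique and in the same order as in the data file.
ids_unique <- !anyDuplicated(ind_metrics$id)
ids_match <- identical(as.character(ind_metrics$id), silico_ids)
```

A non-empty missing_cols, or FALSE for either check, points to a metadata file that should be corrected before it is passed as ind.metafile.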
The data matrix, locus names (forced to be unique), locus metadata, specimen names and specimen metadata are combined into a genlight object. Refer to the documentation for {adegenet} for further details.
Author
Custodian: Bernd Gruber – Post to https://groups.google.com/d/forum/dartr
Examples
silicodartfile <- system.file('extdata','testset_SilicoDArT.csv', package='dartR')
metadata <- system.file('extdata','testset_metadata_silicodart.csv', package='dartR')
testset.gs <- gl.read.silicodart(filename = silicodartfile, ind.metafile = metadata)
#> Starting gl.read.silicodart
#> Reading data from file: /home/runner/work/_temp/Library/dartR/extdata/testset_SilicoDArT.csv
#> This may take some time, please wait!
#> Topskip not provided. Guessing topskip...
#> Set topskip to 5 . Proceeding ...
#> Added the following locus metrics:
#> CloneID AlleleSequence TrimmedSequence Chrom_Turtle_v4 ChromPos_Turtle_v4 AlnCnt_Turtle_v4 AlnEvalue_Turtle_v4 CallRate OneRatio PIC AvgReadDepth StDevReadDepth Qpmr Reproducibility .
#> Recognised: 218 individuals and 255 Sequence Tags using /home/runner/work/_temp/Library/dartR/extdata/testset_SilicoDArT.csv
#> Starting conversion to a genlight object ....
#> Please note conversion of bigger data sets will take some time!
#> Once finished, we recommend you save the object using gl.save(object, file="object.rdata")
#> Adding individual metadata: /home/runner/work/_temp/Library/dartR/extdata/testset_metadata_silicodart.csv .
#> Ids for individual metadata (at least a subset of) are matching!
#> Found 218 matching ids out of 218 ids provided in the ind.metadata file. Subsetting loci now!.
#> Added pop factor.
#> Added latlon data.
#> Added id to the other$ind.metrics slot.
#> Added pop to the other$ind.metrics slot.
#> Added lat to the other$ind.metrics slot.
#> Added lon to the other$ind.metrics slot.
#> Added sex to the other$ind.metrics slot.
#> Added maturity to the other$ind.metrics slot.
#> Genlight object created to hold Tag P/A data
#> Completed: gl.read.silicodart
#>