DArT provides the data as a matrix with entities (individual animals) across the top and attributes (presence/absence of sequenced fragments) down the side, in a format that is unique to DArT. This function reads the data into adegenet (genlight) format for consistency with other programming activity. The script may require modification because DArT changes its data formats from time to time.

Usage

gl.read.silicodart(
  filename,
  ind.metafile = NULL,
  nas = "-",
  topskip = NULL,
  lastmetric = "Reproducibility",
  probar = TRUE,
  verbose = NULL
)

Arguments

filename

Name of csv file containing the SilicoDArT data [required].

ind.metafile

Name of csv file containing metadata assigned to each entity (individual) [default NULL].

nas

Missing data character [default '-'].

topskip

Number of rows to skip before the header row (containing the specimen identities) [optional].

lastmetric

Name of the last non-genetic column. Check that this is correct for your file; otherwise the number of individuals will not match. The last column can also be given as a column number [default "Reproducibility"].

probar

Show progress bar [default TRUE].

verbose

Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, or as set by gl.set.verbose()].
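
The role of lastmetric described above can be illustrated with a minimal base R sketch. It is not the package's internal code; the header and the individual names (ind_A, ind_B, ind_C) are invented for illustration. It only shows the idea that every column to the right of the last locus metric is treated as an individual.

```r
# Toy SilicoDArT-like header: locus metrics first, then one column per individual.
# Column and individual names are hypothetical.
header <- c("CloneID", "CallRate", "PIC", "Reproducibility",
            "ind_A", "ind_B", "ind_C")

lastmetric <- "Reproducibility"

# Position of the last non-genetic column; accept either a name or a number.
last_col <- if (is.numeric(lastmetric)) lastmetric else match(lastmetric, header)
if (is.na(last_col)) stop("lastmetric '", lastmetric, "' not found in header")

# Every column to the right is counted as an individual.
individuals <- header[(last_col + 1):length(header)]
n_ind <- length(individuals)
print(individuals)
```

If lastmetric pointed at PIC instead, Reproducibility would be counted as an individual, which is why the individual count is worth checking after reading.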

Value

An object of class genlight with ploidy set to 1, containing the presence/absence data, and locus and individual metadata.

Details

gl.read.silicodart() opens the data file (comma-delimited csv) and skips the first n = topskip lines. The script assumes that the next line contains the entity labels (specimen ids), followed immediately by the presence/absence data for the first locus.

It reads the presence/absence data into a matrix of 1s and 0s, and reads in the locus metadata and specimen metadata. The locus metadata comprises a series of columns of values for each locus, including the essential column CloneID and the desirable variables Reproducibility and PIC. Refer to the documentation provided by DArT for an explanation of these columns.
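
The reading steps described above can be sketched in base R on a tiny in-memory file. This is an illustration under stated assumptions, not the function's actual implementation: the file contents, the value of topskip and the individual names are all invented, and only read.csv() from base R is used.

```r
# Toy file contents: two leading rows to skip, the header row, then one row per locus.
# All values are invented for illustration.
toy_lines <- c(
  "*,*,*,run1,run1,run1",
  "*,*,*,plate1,plate1,plate1",
  "CloneID,PIC,Reproducibility,ind_A,ind_B,ind_C",
  "100001,0.44,1.00,1,0,-",
  "100002,0.32,0.98,0,0,1",
  "100003,0.50,1.00,-,1,1"
)

topskip <- 2   # rows before the header row
nas <- "-"     # missing data character

raw <- read.csv(text = paste(toy_lines, collapse = "\n"),
                skip = topskip, na.strings = nas,
                check.names = FALSE, stringsAsFactors = FALSE)

# Split locus metadata (left) from presence/absence data (right).
last_col <- match("Reproducibility", names(raw))
loc_metrics <- raw[, seq_len(last_col), drop = FALSE]
pa <- as.matrix(raw[, (last_col + 1):ncol(raw), drop = FALSE])
rownames(pa) <- raw$CloneID

# Transpose so that individuals are rows and loci are columns.
pa <- t(pa)
print(pa)
```

The transpose reflects the layout of the source file (loci down the side, individuals across the top); the resulting matrix holds 1, 0 or NA for each individual and locus.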

The specimen metadata provides the opportunity to reassign specimens to populations and to add other data relevant to each specimen. The key variables are id (specimen identity, which must be unique and must match the ids in the SilicoDArT file, in the same order), pop (population assignment), lat (latitude, optional) and lon (longitude, optional). id, pop, lat and lon are the column headers in the csv file. Other optional columns can be added.
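
As a hedged illustration of the metadata layout described above, the following base R sketch builds a small metadata table, checks the id column against a set of specimen ids, and writes it to a temporary csv file. The ids, populations, coordinates and the extra sex column are invented; the checks mirror the requirements stated in the text rather than the function's own validation.

```r
# Specimen ids as they would appear in a (toy) SilicoDArT header, in file order.
silico_ids <- c("ind_A", "ind_B", "ind_C")

# Individual metadata with the four recognised headers plus one optional column.
ind_meta <- data.frame(
  id  = c("ind_A", "ind_B", "ind_C"),
  pop = c("north", "north", "south"),
  lat = c(-35.28, -35.31, -36.10),
  lon = c(149.13, 149.20, 148.95),
  sex = c("F", "M", "F"),
  stringsAsFactors = FALSE
)

# Checks that follow the description: unique ids, same ids in the same order.
ids_unique <- anyDuplicated(ind_meta$id) == 0
ids_match <- identical(ind_meta$id, silico_ids)
has_required <- all(c("id", "pop") %in% names(ind_meta))

# Write the table as a csv file and read it back to confirm the headers survive.
meta_file <- tempfile(fileext = ".csv")
write.csv(ind_meta, meta_file, row.names = FALSE)
meta_back <- read.csv(meta_file, stringsAsFactors = FALSE)
print(names(meta_back))
```

A file written this way could then be supplied through the ind.metafile argument.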

The data matrix, locus names (forced to be unique), locus metadata, specimen names and specimen metadata are combined into a genlight object. Refer to the documentation for {adegenet} for further details.

Author

Custodian: Bernd Gruber -- Post to https://groups.google.com/d/forum/dartr

Examples

silicodartfile <- system.file('extdata','testset_SilicoDArT.csv', package='dartR')
metadata <- system.file('extdata', 'testset_metadata_silicodart.csv', package='dartR')
testset.gs <- gl.read.silicodart(filename = silicodartfile, ind.metafile = metadata)
#> Starting gl.read.silicodart 
#>   Reading data from file: /home/runner/work/_temp/Library/dartR/extdata/testset_SilicoDArT.csv 
#>     This may take some time, please wait!
#>   Topskip not provided. Guessing topskip...
#>   Set topskip to  5 . Proceeding ...
#>     Added the following locus metrics:
#> CloneID AlleleSequence TrimmedSequence Chrom_Turtle_v4 ChromPos_Turtle_v4 AlnCnt_Turtle_v4 AlnEvalue_Turtle_v4 CallRate OneRatio PIC AvgReadDepth StDevReadDepth Qpmr Reproducibility .
#>   Recognised: 218 individuals and 255 Sequence Tags using /home/runner/work/_temp/Library/dartR/extdata/testset_SilicoDArT.csv 
#>   Starting conversion to a genlight object ....
#>     Please note conversion of bigger data sets will take some time!
#>     Once finished, we recommend you save the object using gl.save(object, file="object.rdata")
#>   Adding individual metadata: /home/runner/work/_temp/Library/dartR/extdata/testset_metadata_silicodart.csv .
#>   Ids for individual metadata (at least a subset of) are matching!
#>   Found  218 matching ids out of 218 ids provided in the ind.metadata file. Subsetting loci now!.
#>      Added pop factor.
#>     Added latlon data.
#>     Added  id  to the other$ind.metrics slot.
#>      Added  pop  to the other$ind.metrics slot.
#>      Added  lat  to the other$ind.metrics slot.
#>      Added  lon  to the other$ind.metrics slot.
#>      Added  sex  to the other$ind.metrics slot.
#>      Added  maturity  to the other$ind.metrics slot.
#>   Genlight object created to hold Tag P/A data
#> Completed: gl.read.silicodart 
#>