Tuesday, August 9, 2016
Georgia Ballroom (Sheraton Hotel Atlanta)
Genotyping-by-sequencing (GBS) is a low-cost approach for characterizing genomes that can be used to assign individuals to populations and to identify the genetic basis of phenotypic traits. Although a GBS workflow (i.e., UNEAK) can be used to genotype species that lack a reference genome, GBS was initially designed for use with a set of assembled reference sequences. Taro (Colocasia esculenta) lacks a reference genome, yet we have genomic data available from reduced-representation sequencing: GBS of Hawaiian taro cultivars; GBS of two parents and their 92 progeny; restriction site-associated DNA sequencing (RAD-seq) of the two parents; and publicly available taro transcriptome sequences (RNA-seq). In this study, we investigate methods to improve GBS results by creating pseudo-reference genomes for GBS read mapping. These pseudo-references were constructed using 1) only GBS sequences, 2) GBS and RAD-seq data, or 3) RNA-seq data alone. With any of these approaches, we found at least a two-fold increase in the number of single nucleotide polymorphisms (SNPs) relative to the UNEAK GBS pipeline. We also found associations between the genotypes of Hawaiian taro cultivars and their Hawaiian nomenclature, which often describes morphological features. We expect that using a pseudo-reference genome will improve all downstream analyses, including linkage map generation and trait associations. This approach may be valuable to specialty crop researchers who want to revisit existing GBS data for species lacking a reference genome, and it may yield SNP sites that are better distributed across the genome.
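As a rough illustration of what a GBS-only pseudo-reference could look like, the sketch below collapses identical reads into unique tags and writes the well-supported tags out as FASTA records that a read mapper could then use as a reference. This is not the procedure used in the study, which the abstract does not specify; the function names (`build_pseudo_reference`, `to_fasta`), the 64-base tag length, and the minimum read count are all assumptions chosen for the example.

```python
from collections import Counter


def build_pseudo_reference(reads, min_count=2, tag_length=64):
    """Collapse reads into unique tags and return (name, sequence) records.

    Hypothetical helper: the tag length and count threshold are
    illustrative assumptions, not values taken from the study.
    """
    counts = Counter()
    for read in reads:
        read = read.strip().upper()
        # Skip reads that are too short or contain ambiguous bases.
        if len(read) < tag_length or "N" in read:
            continue
        # Trim every read to a common length so identical loci collapse.
        counts[read[:tag_length]] += 1

    # Keep only tags seen often enough to be unlikely sequencing errors.
    kept = [(tag, n) for tag, n in counts.items() if n >= min_count]
    # Sort by descending count, then by sequence, for a stable order.
    kept.sort(key=lambda item: (-item[1], item[0]))

    return [(f"tag{i:06d}_n{n}", tag) for i, (tag, n) in enumerate(kept, start=1)]


def to_fasta(records):
    """Format (name, sequence) records as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

In practice, one would read the sequences from demultiplexed FASTQ files, likely cluster near-identical tags rather than only exact duplicates, and then index the resulting FASTA with a read aligner before mapping the GBS reads back to it.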