Library Prep and Sequencing
There are three basic options for CCGP projects to generate whole genome resequencing libraries:
1. Extract DNA and generate your own libraries (click here for more details)
2. UC core facility extracts DNA and/or generates libraries for you (click here for more details)
3. CCGP Mini-Core extracts DNA and/or generates libraries for you (click here for more details)
Sequencing must be performed at one of the University of California core labs. The following are familiar with the scope and goals of CCGP projects, but other UC sequencing cores are also acceptable:
The DNA Technologies Core at UC Davis (link to site)
QB3 Genomics Core at UC Berkeley (link to site)
If you are planning to submit material to QB3, please add your information to this spreadsheet and give the facilities manager, Carrianne Miller, two weeks' notice by email before sending.
CCGP Mini-Core
If samples were submitted to the Mini-Core for DNA extraction and/or library prep, the libraries will be pooled and sent for sequencing on your behalf.
Library Prep and Sequencing: General Guidelines
Number of Samples per Species
Please try to adhere to your funded sampling plan/Excel spreadsheet
Sampling schemes should maximize geographic coverage over the California distribution of the target taxa
Each sampling scheme should include no more than 150 samples per species/genus
Because we are focused on geographic breadth, population sampling is discouraged
Sequencing Coverage
Whole genome resequencing should aim for 10x sequencing depth after adjusting for duplicate reads and organelle presence, with a likely outcome of 8-12x after data filtering
Duplicate reads may account for 10-20% of reads, depending on the library prep type
Organelles may account for up to 2% of DNA yield, depending on species and tissue type
One S4 NovaSeq lane (150 bp paired-end reads) yields 600-750 gigabases (Gb), or 2-2.5 billion read pairs. These are the specifications provided by Illumina; your sequencing core may achieve higher yields (often up to 800-900 Gb)
Consult with a bioinformatician to confirm that you are targeting an appropriate level of coverage
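The coverage guidance above can be turned into a rough planning calculation. The following is a minimal sketch, not a CCGP tool: the function names are hypothetical, the 1 Gb genome size is an illustrative assumption, and the model simply treats duplicate reads and organelle reads as independent losses. Replace the inputs with values for your own species and library prep.

```python
import math


def raw_depth_needed(target_depth, duplicate_rate, organelle_fraction):
    """Raw depth to sequence so the usable nuclear depth equals target_depth.

    Assumes (simplification) that duplicates and organelle reads are
    independent losses, so the usable fraction is their product.
    """
    usable_fraction = (1.0 - duplicate_rate) * (1.0 - organelle_fraction)
    return target_depth / usable_fraction


def samples_per_lane(lane_yield_gb, genome_size_gb, target_depth,
                     duplicate_rate, organelle_fraction):
    """Whole number of samples that fit in one lane at the target depth."""
    raw_depth = raw_depth_needed(target_depth, duplicate_rate, organelle_fraction)
    gb_per_sample = raw_depth * genome_size_gb  # gigabases required per sample
    return math.floor(lane_yield_gb / gb_per_sample)


if __name__ == "__main__":
    # Illustrative inputs only: a hypothetical 1 Gb genome, 15% duplicates,
    # 2% organelle reads, and the low end of the quoted S4 lane yield.
    depth = raw_depth_needed(10, 0.15, 0.02)
    n = samples_per_lane(600, 1.0, 10, 0.15, 0.02)
    print(f"raw depth needed: {depth:.1f}x; samples per 600 Gb lane: {n}")
```

With these assumed inputs, a 10x target requires about 12x of raw sequencing per sample, so a 600 Gb lane holds 49 samples of a 1 Gb genome. Using the conservative end of the lane yield range leaves headroom if the core delivers more.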
Data Sharing
Data sharing and submission to CCGP
All resequencing data generated as part of a CCGP award must be shared with CCGP and also be made publicly available.
To share WGS data, you must send the data download links provided by the UC core sequencing lab to data wrangler Ryan Pontius. These links expire 1 month after data delivery, so please send them promptly. Failure to do so will make it difficult, and potentially more costly, to share the data with the bioinformatics team. Please contact Ryan Pontius if you have any questions.
Sharing this data will not interfere with the PIs' ability to access or use it.
Submitting your data to CCGP
Data generated by the CCGP Mini-Core will automatically be captured.
For projects that do not utilize the Mini-Core, please see our WGS data ingest page.
WGS Data Processing
The CCGP Bioinformatics team is responsible for WGS data intake, processing, and delivery. The team will organize the raw sequencing reads and associated metadata, perform QC, and map the reads to the reference genome to call variants. The resulting variant call format (VCF) file will then be distributed to the project PIs for their own use, as well as to the CCGP Landscape Genomics team for downstream meta-analyses.