RESOURCES

Whole Genome Resequencing

Library Prep and Sequencing

There are three basic options for CCGP projects to generate whole genome resequencing libraries:

1. Extract DNA and generate your own libraries (click here for more details)

2. UC core facility extracts DNA and/or generates libraries for you (click here for more details)

3. CCGP Mini-Core extracts DNA and/or generates libraries for you (click here for more details)

Sequencing must be performed at one of the University of California core labs. The following cores are familiar with the scope and goals of CCGP projects, but other UC sequencing cores are also acceptable:

The DNA Technologies Core at UC Davis (link to site)
QB3 Genomics Core at UC Berkeley (link to site)
- If you are planning to submit material to QB3, please add your information to this spreadsheet and email the facilities manager, Carrianne Miller, at least two weeks before sending.
CCGP Mini-Core
- If samples were submitted for DNA extraction and/or library prep, the libraries will be pooled and sent for sequencing on your behalf.

Library Prep and Sequencing:

General Guidelines

Number of Samples per Species

Please adhere as closely as possible to your funded sampling plan/Excel spreadsheet.
Sampling schemes should maximize geographic coverage across the California distribution of the target taxa.
Each scheme should include ≤150 samples per species/genus.
Because we are focused on geographic breadth, population sampling is discouraged.

Sequencing Coverage

Whole genome resequencing should aim for 10x sequencing depth after adjusting for duplicate reads and organelle presence; the likely outcome after data filtering is between 8x and 12x.
- Duplicate reads may range from 10% to 20%, depending on the library prep type.
- Organelles may account for up to 2% of DNA yield, depending on species and tissue type.
One S4 NovaSeq lane (150 bp paired-end reads) yields 600-750 gigabases (Gb), or 2-2.5 billion read pairs. These are the specifications provided by Illumina; your sequencing core may achieve higher numbers (often up to 800-900 Gb).
Consult with a bioinformatician to confirm that you are targeting an appropriate level of coverage
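
As a rough planning aid, the coverage guidance above can be turned into a back-of-the-envelope calculation. The sketch below is illustrative only and is not a CCGP tool: the function names, the 1 Gb example genome size, and the choice to treat duplicate and organelle losses as independent multiplicative factors are all assumptions. The default rates are taken from the ranges quoted above (15% duplicates as a midpoint of 10-20%, 2% organelle as the upper bound).

```python
import math


def raw_gb_per_sample(genome_size_gb, target_depth=10.0,
                      duplicate_rate=0.15, organelle_fraction=0.02):
    """Raw sequence (in Gb) needed per sample to reach the target depth.

    Assumption: duplicate reads and organelle reads are removed
    independently, so the usable fraction is the product of what
    survives each filter.
    """
    usable_fraction = (1.0 - duplicate_rate) * (1.0 - organelle_fraction)
    return genome_size_gb * target_depth / usable_fraction


def samples_per_lane(lane_yield_gb, genome_size_gb, **kwargs):
    """Whole number of samples that fit in one lane at the target depth."""
    per_sample = raw_gb_per_sample(genome_size_gb, **kwargs)
    return math.floor(lane_yield_gb / per_sample)


if __name__ == "__main__":
    genome_size_gb = 1.0  # hypothetical genome size, for illustration only
    print(f"Raw Gb per sample: {raw_gb_per_sample(genome_size_gb):.2f}")
    for lane_yield_gb in (600, 750):  # Illumina spec range for one S4 lane
        n = samples_per_lane(lane_yield_gb, genome_size_gb)
        print(f"{lane_yield_gb} Gb lane: {n} samples")
```

Under these assumptions, a hypothetical 1 Gb genome needs about 12 Gb of raw sequence per sample, so one lane at the Illumina specification holds roughly 49 to 62 samples. Real yields, duplicate rates, and pooling evenness vary, so numbers like these are a starting point for the conversation with a bioinformatician rather than a substitute for it.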

Data Sharing

Data sharing and submission to CCGP

All resequencing data generated as part of a CCGP award must be shared with CCGP and also be made publicly available.
To share WGS data, send the data download links provided by the UC core sequencing lab to data wrangler Ryan Pontius. These links expire one month after data delivery, so please send them promptly. If the links expire before they are shared, transferring the data to the bioinformatics team becomes more difficult and may incur an additional cost. Please contact Ryan Pontius if you have any questions.
Sharing this data will not interfere with the PIs' ability to access or use it.

Submitting your data to CCGP

Data generated by the CCGP Mini-Core will automatically be captured.
For projects that do not utilize the Mini-Core, please see our WGS data ingest page.

WGS Data Processing

The CCGP Bioinformatics team is responsible for WGS data intake, processing, and delivery. The team will organize the raw sequencing reads and associated metadata, perform QC, and map the reads to the reference genome to call variants. The resulting variant call format (VCF) file will then be distributed to the project PIs for their own use and to the CCGP Landscape Genomics team for downstream meta-analyses.

Click here to see the poster that the Bioinformatics team presented on their variant calling pipeline!