In this tutorial we carry out de novo assembly of a microbial genome.
This tutorial runs on the GVL Galaxy Tutorial Server. All the tools you need are already installed on the server.
The required datasets are in the server's Shared Libraries and are also available via URL.
The data for this tutorial comes from a whole-genome sequencing experiment on a multi-drug-resistant strain of the bacterium Staphylococcus aureus. The DNA was sequenced on an Illumina GAII sequencing machine. The dataset consists of about 4 million 75 base-pair paired-end reads, stored as two FASTQ files, one for each end of a DNA fragment. The data was downloaded from the NCBI Sequence Read Archive (SRA, formerly the Short Read Archive) (http://www.ncbi.nlm.nih.gov/sra/). The specific sample is a public dataset published in April 2012 with SRA accession number ERR048396.
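Before assembling, it can help to check roughly how deeply the genome has been sequenced. The sketch below is a back-of-the-envelope estimate, not part of the Galaxy workflow. It uses the read count and read length quoted above and assumes a genome size of about 2.8 Mbp, a typical figure for Staphylococcus aureus that is not stated in this tutorial. It also assumes "4 million" counts individual reads; if it counts read pairs, the estimate doubles. The function name `estimate_coverage` is hypothetical.

```python
def estimate_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Return expected average depth of coverage: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


# Values taken from the dataset description in this tutorial.
NUM_READS = 4_000_000   # "about 4 million" reads (assumed to be individual reads, not pairs)
READ_LENGTH = 75        # base pairs per read

# Assumption: approximate S. aureus genome size; not given in the tutorial.
ASSUMED_GENOME_SIZE = 2_800_000

coverage = estimate_coverage(NUM_READS, READ_LENGTH, ASSUMED_GENOME_SIZE)
print(f"Estimated coverage: {coverage:.0f}x")
```

Under these assumptions the estimate comes out at roughly 100x, which is generally considered ample depth for assembling a bacterial genome from short reads.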