De novo Assembly of a Bacterial Genome from Nanopore Data


Project

Testing_case


Bioinformatics workflow


Nanopore sequencing data QC Report

QC detail

Summary

Mean read length 17,986
Mean read quality 13
Median read length 13,805
Median read quality 13
Number of reads 50,153
Read length N50 20,435
Total bases 902,042,284
>Q5_reads 50,153
>Q5_bases 902.0Mb
>Q7_reads 50,153
>Q7_bases 902.0Mb
>Q10_reads 50,153
>Q10_bases 902.0Mb
>Q12_reads 50,153
>Q12_bases 902.0Mb
>Q15_reads 1,704
>Q15_bases 26.3Mb
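The summary figures can be cross-checked with simple arithmetic. The sketch below uses only the values reported above (the variable names are illustrative and not part of the pipeline). It recomputes the mean read length from the total bases and the read count, and the share of reads above Q15.

```python
# Values copied from the QC summary above; names are illustrative only.
total_bases = 902_042_284
number_of_reads = 50_153
q15_reads = 1_704

# Mean read length is total bases divided by the number of reads.
mean_read_length = total_bases / number_of_reads

# Fraction of reads reported above the Q15 cutoff.
q15_fraction = q15_reads / number_of_reads

print(f"Mean read length: {mean_read_length:,.0f}")
print(f"Reads >Q15: {q15_fraction:.1%}")
```

The recomputed mean rounds to the reported 17,986, and only about 3.4% of reads exceed Q15, which is consistent with a mean read quality of 13.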

More detail can be found in the NanoPlot_QC/ folder.

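Figure: LengthvsQualityScatterPlot_dot (read length vs. read quality scatter plot)

Figure: LogTransformed_HistogramReadlength (histogram of log-transformed read lengths)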


Assembly Results

The assembled contigs are in the FASTA file dfast/genome.fna.

Statistics

assembly result detail
# contigs (>= 0 bp) 3
# contigs (>= 1000 bp) 3
# contigs (>= 5000 bp) 3
# contigs (>= 10000 bp) 3
# contigs (>= 25000 bp) 3
# contigs (>= 50000 bp) 3
Total length (>= 0 bp) 5,683,314
Total length (>= 1000 bp) 5,683,314
Total length (>= 5000 bp) 5,683,314
Total length (>= 10000 bp) 5,683,314
Total length (>= 25000 bp) 5,683,314
Total length (>= 50000 bp) 5,683,314
# contigs 3
Largest contig 5,251,427
Total length 5,683,314
GC (%) 35
N50 5,251,427
N75 5,251,427
L50 1
L75 1
# N’s per 100 kbp 0
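N50 and L50 are identical to the largest contig and 1 here because a single contig covers more than half (and more than three quarters) of the assembly. The sketch below illustrates this using the usual definition of Nx/Lx. It is not the pipeline's own code, the function name `nx_lx` is hypothetical, and the contig lengths are taken from the per-contig summary later in this report.

```python
def nx_lx(lengths, fraction):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the shortest contig in the smallest set of
    longest contigs that together cover `fraction` of the assembly;
    Lx is the number of contigs in that set.
    """
    target = sum(lengths) * fraction
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return length, count
    raise ValueError("lengths must be non-empty")


# Contig lengths from the per-contig summary table of this report.
contig_lengths = [5_251_427, 349_397, 82_490]

n50, l50 = nx_lx(contig_lengths, 0.50)
n75, l75 = nx_lx(contig_lengths, 0.75)
print(n50, l50, n75, l75)
```

With these three lengths, the largest contig alone exceeds 75% of the 5,683,314 bp total, so N50 and N75 both equal its length and L50 and L75 both equal 1.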

GC content

Assembly completeness

Complete_BUSCOs 94.3%
Fragmented_BUSCOs 4.8%
Missing_BUSCOs 0.9%

ORF/ncRNA Prediction and Annotation

Detailed annotation results are in dfast/.

Total Sequence Length (bp) 5,683,314
Number of Sequences 3
Longest Sequence (bp) 5,251,427
N50 (bp) 5,251,427
Gap Ratio (%) 0
GC content (%) 35
Number of CDSs 6,354
Average Protein Length 242
Coding Ratio (%) 81
Number of rRNAs 42
Number of tRNAs 106
Number of CRISPRs 0

Assembly summary for each contig

The assembled contigs are in the FASTA file dfast/genome.fna.

Contig_name Genome_Size GC_content CDS tRNA rRNA CRISPRs
sequence1 5,251,427 35.60 5,834 106 42 0
sequence2 349,397 32.67 390 0 0 0
sequence3 82,490 32.22 130 0 0 0
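The per-contig rows can be checked against the genome-wide annotation totals. The sketch below is illustrative only: the tuple layout and variable names are assumptions, and the numbers are copied from the table above. It sums the contig sizes and feature counts and computes a length-weighted GC content.

```python
# Rows copied from the per-contig table above:
# (name, size in bp, GC %, CDS, tRNA, rRNA)
contigs = [
    ("sequence1", 5_251_427, 35.60, 5_834, 106, 42),
    ("sequence2", 349_397, 32.67, 390, 0, 0),
    ("sequence3", 82_490, 32.22, 130, 0, 0),
]

total_length = sum(size for _, size, *_ in contigs)
total_cds = sum(row[3] for row in contigs)
total_trna = sum(row[4] for row in contigs)
total_rrna = sum(row[5] for row in contigs)

# Weight each contig's GC percentage by its length to get the
# assembly-wide GC content.
weighted_gc = sum(size * gc for _, size, gc, *_ in contigs) / total_length

print(total_length, total_cds, total_trna, total_rrna, round(weighted_gc, 2))
```

The sums reproduce the reported totals (5,683,314 bp, 6,354 CDSs, 106 tRNAs, 42 rRNAs), and the length-weighted GC content rounds to the reported 35%.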

Software