CAP3
- (rev. 4)
- javis
Introduction #
CAP3 is a program for de novo assembly of DNA sequences. Of the two main approaches to de novo assembly, DBG (De Bruijn graph assembly) and OLC (Overlap Layout Consensus assembly), it is one of the best-known OLC programs worldwide, alongside MIRA and PHAST. In particular, the update to the third version added removal of poor regions at the 5' and 3' ends of reads, so a more effective assembly can be expected.
Process #
- Removal of poor end regions of reads
- Computation of overlaps between reads
- Removal of false overlaps
- Construction of contigs
- Construction of multiple sequence alignments and generation of consensus sequences
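The steps above can be illustrated with a toy overlap-layout sketch in Python. This is not CAP3's algorithm: it assumes exact suffix/prefix matches, ignores reverse-complement reads, skips false-overlap removal, and replaces the multiple alignment and consensus step with plain string merging. All function names and the default thresholds are hypothetical and chosen only for illustration.

```python
def clip_poor_ends(seq, quals, cutoff=12):
    """Trim bases whose quality is below `cutoff` from both ends (toy step 1)."""
    start = 0
    while start < len(seq) and quals[start] < cutoff:
        start += 1
    end = len(seq)
    while end > start and quals[end - 1] < cutoff:
        end -= 1
    return seq[start:end]


def suffix_prefix_overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that exactly equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = suffix_prefix_overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no remaining overlaps; leftover sequences stay separate
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]
    return contigs


reads = ["ATGGCGT", "GCGTACC", "TACCTTA"]
print(greedy_assemble(reads))
```

A real assembler tolerates mismatches and gaps in overlaps and uses base quality values when building the consensus, which is where most of the options listed on this page come into play.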
Usage #
cap3 [reads].fasta
CAP3 is quite simple to use: pass it a file in FASTA format.
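For scripted runs, the same call can be wrapped in Python. The sketch below rests on several assumptions: the executable is installed on PATH under the lowercase name `cap3`, contigs and singlets are written next to the input as `<input>.cap.contigs` and `<input>.cap.singlets`, and the `.cap.log` file name for the captured standard output is purely hypothetical. The helper names are illustrative, not part of CAP3.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(reads_path, executable="cap3"):
    """Return the argument list for a default CAP3 run on one FASTA file."""
    return [executable, str(reads_path)]


def expected_outputs(reads_path):
    """Paths CAP3 is assumed to write next to the input (default prefix 'cap')."""
    base = str(reads_path)
    return {
        "contigs": Path(base + ".cap.contigs"),
        "singlets": Path(base + ".cap.singlets"),
    }


def run_cap3(reads_path, executable="cap3"):
    """Run CAP3 and return the expected contig and singlet paths."""
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable} not found on PATH")
    # The assembly report goes to standard output, so it is captured
    # in a log file (hypothetical name) next to the input.
    log_path = Path(str(reads_path) + ".cap.log")
    with open(log_path, "w") as log:
        subprocess.run(build_command(reads_path, executable), stdout=log, check=True)
    return expected_outputs(reads_path)
```

Calling `run_cap3("reads.fasta")` would then return the paths to check for assembled contigs and unassembled reads, provided the installation follows the naming assumed here.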
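Options #
Each entry gives the allowed range of the value N where one applies, followed by the default in parentheses.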
- -a N specify band expansion size N > 10 (20)
- -b N specify base quality cutoff for differences N > 15 (20)
- -c N specify base quality cutoff for clipping N > 5 (12)
- -d N specify max qscore sum at differences N > 100 (200)
- -e N specify extra number of differences N > 10 (20)
- -f N specify max gap length in any overlap N > 10 (300)
- -g N specify gap penalty factor N > 0 (6)
- -h N specify max overhang percent length N > 5 (20)
- -i N specify segment pair score cutoff N > 20 (40)
- -j N specify chain score cutoff N > 30 (80)
- -k N specify end clipping flag N >= 0 (1)
- -m N specify match score factor N > 0 (2)
- -n N specify mismatch score factor N < 0 (-5)
- -o N specify overlap length cutoff > 15 (40)
- -p N specify overlap percent identity cutoff N > 65 (90)
- -r N specify reverse orientation value N >= 0 (1)
- -s N specify overlap similarity score cutoff N > 250 (900)
- -t N specify max number of word occurrences N > 30 (500)
- -u N specify min number of constraints for correction N > 0 (4)
- -v N specify min number of constraints for linking N > 0 (2)
- -w N specify file name for clipping information (none)
- -x N specify prefix string for output file names (cap)
- -y N specify clipping range N > 5 (100)
- -z N specify min no. of good reads at clip pos N > 0 (2)
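As a hedged illustration of how these options might be passed from a script, the sketch below checks a few values against the lower bounds in the list above before building the command line. Only `-o`, `-p` and `-s` are included, with bounds and defaults copied from this page. The helper name `cap3_args`, the lowercase executable name `cap3`, and the placement of options after the reads file are assumptions.

```python
# flag: (exclusive lower bound, default), copied from the option list above
OPTION_SPECS = {
    "-o": (15, 40),    # overlap length cutoff
    "-p": (65, 90),    # overlap percent identity cutoff
    "-s": (250, 900),  # overlap similarity score cutoff
}


def cap3_args(reads, options, executable="cap3"):
    """Build a CAP3 argument list, rejecting values at or below the listed bound."""
    args = [executable, reads]
    for flag, value in options.items():
        if flag not in OPTION_SPECS:
            raise KeyError(f"option {flag} is not covered by this sketch")
        lower, _default = OPTION_SPECS[flag]
        if value <= lower:
            raise ValueError(f"{flag} must be greater than {lower}, got {value}")
        args += [flag, str(value)]
    return args


# Stricter overlaps than the defaults: at least 50 bases at 95% identity.
print(cap3_args("reads.fasta", {"-o": 50, "-p": 95}))
```

Validating before launching the program keeps a long assembly run from failing late on a mistyped threshold; the same table could be extended to the remaining numeric options.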
Reference #
Huang, X. and Madan, A. (1999) CAP3: A DNA Sequence Assembly Program, Genome Research, 9: 868-877. http://genome.cshlp.org/content/9/9/868.full
Source #
http://seq.cs.iastate.edu/cap3.html