CAP3 #
Structured data

Category
Software

Introduction #

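CAP3 is a program for de novo assembly of DNA sequences. Of the two main families of de novo assembly algorithms, DBG (De Bruijn graph) assembly and OLC (Overlap Layout Consensus) assembly, CAP3 belongs to the OLC family and is known worldwide alongside MIRA and Phrap. In particular, CAP3, the third version of the program, added the ability to remove poor regions at the 5' and 3' ends of reads, so it can be expected to produce more reliable assemblies.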

Process #

  1. Removal of poor end regions of reads
  2. Computation of overlaps between reads
  3. Removal of false overlaps
  4. Construction of contigs
  5. Construction of multiple sequence alignments and generation of consensus sequences
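
The steps above can be illustrated with a toy overlap-layout-consensus sketch in Python. This is not CAP3's actual algorithm: CAP3 scores alignments using base quality values, whereas this sketch assumes error-free reads, exact suffix/prefix matches, and a greedy merge. The function names and the default cutoffs used here are hypothetical, chosen only for illustration.

```python
def clip_poor_ends(seq, quals, cutoff=12):
    """Step 1: trim bases from both ends while their quality is below cutoff."""
    start, end = 0, len(seq)
    while start < end and quals[start] < cutoff:
        start += 1
    while end > start and quals[end - 1] < cutoff:
        end -= 1
    return seq[start:end]


def overlap_length(a, b, min_len=3):
    """Step 2: length of the longest suffix of `a` that is a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0  # overlaps shorter than min_len are treated as false (step 3)


def greedy_assemble(reads, min_len=3):
    """Steps 4-5, simplified: repeatedly merge the best-overlapping pair.

    With exact matches the merged string doubles as the consensus, so no
    multiple sequence alignment is needed in this toy version.
    """
    contigs = list(reads)
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap_length(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no remaining overlaps; leftover reads stay separate
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)
    return contigs
```

For example, the reads ATGCGT, CGTACC and ACCTTG overlap by three bases each, so `greedy_assemble` joins them into a single contig. Reads that share no overlap of at least `min_len` bases are left as separate sequences.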

Usage #

cap3 [reads].fasta

Using CAP3 is quite simple: pass it a file of reads in FASTA format.
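
For use inside a pipeline, the call can be wrapped in a small Python helper. This is a minimal sketch, assuming the executable is installed on the PATH under the name `cap3`. The helper names and the `.cap.log` file used to capture standard output are hypothetical, not part of CAP3.

```python
import subprocess
from pathlib import Path


def build_cap3_command(reads_path, executable="cap3"):
    """Return the argument list for a CAP3 run with default options."""
    return [executable, str(reads_path)]


def run_cap3(reads_path, executable="cap3"):
    """Run CAP3 and save its standard output next to the input file."""
    reads_path = Path(reads_path)
    # Hypothetical log file name chosen for this sketch.
    log_path = reads_path.with_name(reads_path.name + ".cap.log")
    with open(log_path, "w") as log:
        # check=True raises CalledProcessError if CAP3 exits with an error.
        subprocess.run(
            build_cap3_command(reads_path, executable),
            stdout=log,
            check=True,
        )
    return log_path
```

Calling `run_cap3("reads.fasta")` would then run the assembler on that file. Keeping command construction in its own function makes it possible to inspect or test the command without having CAP3 installed.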

Options #

Each option lists the allowed range for its value N, followed by the default value in parentheses.

  • -a N specify band expansion size N > 10 (20)
  • -b N specify base quality cutoff for differences N > 15 (20)
  • -c N specify base quality cutoff for clipping N > 5 (12)
  • -d N specify max qscore sum at differences N > 100 (200)
  • -e N specify extra number of differences N > 10 (20)
  • -f N specify max gap length in any overlap N > 10 (300)
  • -g N specify gap penalty factor N > 0 (6)
  • -h N specify max overhang percent length N > 5 (20)
  • -i N specify segment pair score cutoff N > 20 (40)
  • -j N specify chain score cutoff N > 30 (80)
  • -k N specify end clipping flag N >= 0 (1)
  • -m N specify match score factor N > 0 (2)
  • -n N specify mismatch score factor N < 0 (-5)
  • -o N specify overlap length cutoff N > 15 (40)
  • -p N specify overlap percent identity cutoff N > 65 (90)
  • -r N specify reverse orientation value N >= 0 (1)
  • -s N specify overlap similarity score cutoff N > 250 (900)
  • -t N specify max number of word occurrences N > 30 (500)
  • -u N specify min number of constraints for correction N > 0 (4)
  • -v N specify min number of constraints for linking N > 0 (2)
  • -w N specify file name for clipping information (none)
  • -x N specify prefix string for output file names (cap)
  • -y N specify clipping range N > 5 (100)
  • -z N specify min no. of good reads at clip pos N > 0 (2)
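
Because every numeric option has a stated range, values can be checked before CAP3 is launched. The following Python sketch covers only four options from the list above. The table layout and the function name `cap3_option_args` are hypothetical, and the sketch assumes options are passed on the command line after the reads file.

```python
# Allowed ranges copied from the option list above (subset only).
CAP3_OPTION_RANGES = {
    "o": (">", 15),   # overlap length cutoff, default 40
    "p": (">", 65),   # overlap percent identity cutoff, default 90
    "k": (">=", 0),   # end clipping flag, default 1
    "n": ("<", 0),    # mismatch score factor, default -5
}

_CHECKS = {
    ">": lambda value, bound: value > bound,
    ">=": lambda value, bound: value >= bound,
    "<": lambda value, bound: value < bound,
}


def cap3_option_args(options):
    """Turn {"o": 50, "p": 95} into ["-o", "50", "-p", "95"], checking ranges."""
    args = []
    for name, value in options.items():
        if name not in CAP3_OPTION_RANGES:
            raise KeyError(f"option -{name} is not in this sketch's table")
        op, bound = CAP3_OPTION_RANGES[name]
        if not _CHECKS[op](value, bound):
            raise ValueError(f"-{name} must satisfy N {op} {bound}, got {value}")
        args += [f"-{name}", str(value)]
    return args
```

For instance, `cap3_option_args({"o": 50, "p": 95})` yields the arguments for a stricter overlap requirement than the defaults, while a value such as `{"p": 60}` is rejected before any assembly starts.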

Reference #

Huang, X. and Madan, A. (1999) CAP3: A DNA Sequence Assembly Program, Genome Research, 9: 868-877. http://genome.cshlp.org/content/9/9/868.full

Source #

http://seq.cs.iastate.edu/cap3.html
