MAKER
Structured data
- Category
- Software
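MAKER: predicting gene structure for genome annotation #
After genome sequencing, genome assembly, structural annotation of the genome, and functional analysis of its proteins are the basic steps for characterizing a species. Of these, structural annotation is usually done by mapping mRNA or protein sequences onto the genome sequence. MAKER is a representative program for this task.
Main features of MAKER #
- Repeat element analysis with RepeatMasker
- Gene modeling by mapping EST sequences (BLASTN, Exonerate)
- Gene modeling by mapping protein sequences (BLASTX, Genewise)
- Gene model prediction with ab initio programs (SNAP, Augustus, GeneMark-ES, Fgenesh)
- Consensus gene model prediction from the combined gene model evidence
Installation requirements #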
Perl Modules
BioPerl
DBI
Error
Error::Simple
File::NFSLock
File::Which
Inline
Perl::Unsafe::Signals
Proc::Signal
URI::Escape
Bit::Vector
Inline::C
PerlIO::gzip
IO::All (optional, for accessory scripts)
IO::Prompt (optional, for accessory scripts)
forks (optional, for MPI scripts)
forks::shared (optional, for MPI scripts)
External Programs
Perl 5.8.0 or Higher
SNAP version 2009-02-03 or higher
RepeatMasker 3.1.6 or higher
Exonerate 1.4 or higher
NCBI BLAST 2.2.X or higher
Genewise 2.2.0
Optional Components:
Augustus 2.0 or higher
GeneMark-ES 2.3a or higher
FGENESH 2.6 or higher
Required for optional MPI support:
MPICH2
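Before installing MAKER it can help to check which of the external programs listed above are already on the PATH. The following is a minimal sketch, not part of MAKER itself. The executable names in `REQUIRED` and `OPTIONAL` are assumptions (for example, `blastn` assumes a BLAST+ style install, and `mpiexec` is assumed to come with MPICH2), so adjust them to match your system. The script only checks that a program can be found; it does not verify version numbers.

```python
import shutil

# Executable names are assumptions; change them to match your installation.
REQUIRED = ["perl", "snap", "RepeatMasker", "exonerate", "blastn", "genewise"]
OPTIONAL = ["augustus", "mpiexec"]


def find_tools(names):
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names):
    """Return the names that could not be found on PATH, in input order."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    for name in missing_tools(REQUIRED):
        print(f"missing required tool: {name}")
    for name in missing_tools(OPTIONAL):
        print(f"missing optional tool: {name}")
```

Running the script prints one line per program that was not found and prints nothing when everything is present.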
Analysis procedure (eukaryotes) #
> maker -f -base [outhandle] -cpus 20 maker_opts.ctl >& maker_opts.ctl.log
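The control file passed to `maker` consists of `key=value` lines with trailing `#` comments, and its own comments recommend providing at least one EST evidence file and at least one protein evidence file. A small pre-flight check can catch an empty control file before a long run starts. The sketch below is an illustration only: `parse_ctl` and `evidence_warnings` are hypothetical helper names, not part of MAKER, and the parser assumes that values never contain a `#` character.

```python
def parse_ctl(text):
    """Parse MAKER-style control text into a dict of option name -> value."""
    opts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line or "=" not in line:
            continue  # section headers and blank lines carry no option
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts


def evidence_warnings(opts):
    """Return warnings for a missing genome or missing evidence files."""
    warnings = []
    if not opts.get("genome"):
        warnings.append("genome is not set")
    est_keys = ("est", "est_reads", "altest", "est_gff", "altest_gff")
    if not any(opts.get(k) for k in est_keys):
        warnings.append("no EST evidence provided")
    protein_keys = ("protein", "protein_gff")
    if not any(opts.get(k) for k in protein_keys):
        warnings.append("no protein homology evidence provided")
    return warnings


# A shortened sample in the same format as maker_opts.ctl.
sample = """\
#-----Genome (Required for De-Novo Annotation)
genome=my_genome.fasta #genome sequence file in fasta format
est= #non-redundant set of assembled ESTs in fasta format
altest_gff=rnaseq_transcripts.gff3 #Alternate organism EST evidence
protein=ref1_protein.fasta,ref2_protein.fasta #protein sequence file
protein_gff= #protein homology evidence from an external gff3 file
"""

opts = parse_ctl(sample)
print(opts["protein"].split(","))  # comma-separated values become a list
print(evidence_warnings(opts))
```

To check a real file, read it with `open("maker_opts.ctl").read()` and pass the text to `parse_ctl`.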
> cat maker_opts.ctl
#-----Genome (Required for De-Novo Annotation)
genome=my_genome.fasta #genome sequence file in fasta format
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic
#-----EST Evidence (for best results provide a file for at least one)
est= #non-redundant set of assembled ESTs in fasta format (classic EST analysis)
est_reads= #unassembled nextgen mRNASeq in fasta format (not fully implemented)
altest= #EST/cDNA sequence file in fasta format from an alternate organism
est_gff= #EST evidence from an external gff3 file
altest_gff=rnaseq_transcripts.gff3 #Alternate organism EST evidence from a separate gff3 file
#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=ref1_protein.fasta,ref2_protein.fasta #protein sequence file in fasta format
protein_gff= #protein homology evidence from an external gff3 file
#-----Repeat Masking (leave values blank to skip repeat masking)
model_org= #select a model organism for RepBase masking in RepeatMasker
rmlib=my_genome_repeat.fasta #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein= #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to run repeat masking on prokaryotes (don't change this), 1 = yes, 0 = no
#-----Gene Prediction
snaphmm= #SNAP HMM file
gmhmm= #GeneMark HMM file
augustus_species=fly #Augustus gene prediction species model
fgenesh_par_file= #Fgenesh parameter file
pred_gff= #ab-initio predictions from an external GFF3 file
model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)
est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=0 #gene prediction from protein homology (prokaryotes only), 1 = yes, 0 = no
unmask=0 #Also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no