ALN
- (rev. 5)
- ㅎnㅂrㄹrㄱi
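The ALN file format is the output file format produced by Clustal.
Clustal is one of the tools that perform multiple sequence alignment (MSA), which compares several nucleic acid or protein sequences for similarity. It comes in two variants: ClustalW is the command-line version and ClustalX is the windowed (graphical interface) version.
Below is an example input in FASTA format for running ClustalW.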
>query
MKNTLLKLGVCVSLLGITPFVSTISSVQAERTVEHKVIKNETGTISISQLNKNVW
VHTELGYFSGEAVPSNGLVLNTSKGLVLVDSSWDDKLTKELIEMVEKKFKKRV
TDVIITHAHADRIGGMKTLKERGIKAHSTALTAELAKKNGYEEPLGDLQSVTNLK
FGNMKVETFYPGKGHTEDNIVVWLPQYQILAGGCLVKSASSKDLGNVADAYV
NEWSTSIENVLKRYGNINLVVPGHGEVGDRGLLLHTLDLLK
>gi|2984094
MGGFLFFFLLVLFSFSSEYPKHVKETLRKITDRIYGVFGVYEQVSYENRGFISNAY
FYVADDGVLVVDALSTYKLGKELIESIRSVTNKPIRFLVVTHYHTDHFYGAKAFR
EVGAEVIAHEWAFDYISQPSSYNFFLARKKILKEHLEGTELTPPTITLTKNLNVYLQ
VGKEYKRFEVLHLCRAHTNGDIVVWIPDEKVLFSGDIVFDGRLPFLGSGNSRTWL
VCLDEILKMKPRILLPGHGEALIGEKKIKEAVSWTRKYIKDLRETIRKLYEEGCDVE
CVRERINEELIKIDPSYAQVPVFFNVNPVNAYYVYFEIENEILMGE
>gi|115023|sp|P10425|
MKKNTLLKVGLCVSLLGTTQFVSTISSVQASQKVEQIVIKNETGTISISQLNKNVW
VHTELGYFNGEAVPSNGLVLNTSKGLVLVDSSWDNKLTKELIEMVEKKFQKRVTD
VIITHAHADRIGGITALKERGIKAHSTALTAELAKKSGYEEPLGDLQTVTNLKFGNTK
VETFYPGKGHTEDNIVVWLPQYQILAGGCLVKSAEAKNLGNVADAYVNEWSTSIE
NMLKRYRNINLVVPGHGKVGDKGLLLHTLDLLK
>gi|115030|sp|P25910|
MKTVFILISMLFPVAVMAQKSVKISDDISITQLSDKVYTYVSLAEIEGWGMVPSNGM
IVINNHQAALLDTPINDAQTEMLVNWVTDSLHAKVTTFIPNHWHGDCIGGLGYLQR
KGVQSYANQMTIDLAKEKGLPVPEHGFTDSLTVSLDGMPLQCYYLGGGHATDNIV
VWLPTENILFGGCMLKDNQATSIGNISDADVTAWPKTLDKVKAKFPSARYVVPGH
GDYGGTELIEHTKQIVNQYIESTSKP
>gi|282554|pir||S25844
MTVEVREVAEGVYAYEQAPGGWCVSNAGIVVGGDGALVVDTLSTIPRARRLAEWV
DKLAAGPGRTVVNTHFHGDHAFGNQVFAPGTRIIAHEDMRSAMVTTGLALTGLWP
RVDWGEIELRPPNVTFRDRLTLHVGERQVELICVGPAHTDHDVVVWLPEERVLFAGD
VVMSGVTPFALFGSVAGTLAALDRLAELEPEVVVGGHGPVAGPEVIDANRDYLRWV
QRLAADAVDRRLTPLQAARRADLGAFAGLLDAERLVANLHRAHEELLGGHVRDAM
EIFAELVAYNGGQLPTCLA
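As a minimal sketch of how such FASTA input can be read before it is handed to an aligner, the Python snippet below collects each record into a dictionary. The function name `parse_fasta` and the shortened example records are illustrative assumptions, not part of ClustalW.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its full sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new record.
            name = line[1:]
            records[name] = []
        elif name is not None:
            # Sequence lines may be wrapped; drop any whitespace inside them.
            records[name].append("".join(line.split()))
    return {n: "".join(parts) for n, parts in records.items()}


# Shortened, hypothetical records used only to exercise the parser.
example = """>query
MKNTLLKLGV
CVSLLGITPF
>gi|2984094
MGGFLFFF LLVLF
"""
seqs = parse_fasta(example)
```

Sequence lines are joined without whitespace, so wrapped lines and stray spaces inside a sequence do not change the result.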
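- Running ClustalW with the FASTA sequences above as input returns a result in ALN format.
- The first line states "CLUSTAL W" or "CLUSTALW".
- After that, each line gives the sequence name, a segment of the aligned sequence, and sequence length information (a running count of residues).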
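The layout described above can be read back with a few lines of Python. This is only a sketch under stated assumptions: the function name `parse_aln` is hypothetical, and the sample text is a hand-written toy alignment, not real ClustalW output. It assumes that lines starting with whitespace are conservation lines and that the first two whitespace-separated fields of every other line are the sequence name and an aligned segment.

```python
def parse_aln(text):
    """Return a dict mapping sequence names to their full aligned sequences."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("not a Clustal ALN file: missing CLUSTAL header")
    aligned = {}
    for line in lines[1:]:
        # Skip blank lines and conservation lines (which start with a space).
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        # fields[0] is the name, fields[1] the segment; a trailing count is ignored.
        name, segment = fields[0], fields[1]
        aligned[name] = aligned.get(name, "") + segment
    return aligned


# Hand-written toy alignment in two blocks (not real ClustalW output).
sample = """CLUSTAL W (1.83) multiple sequence alignment

seqA            MKNT-LLK 7
seqB            MKNTALLK 8
                **** ***

seqA            LGV 10
seqB            LG- 10
                **
"""
alignment = parse_aln(sample)
```

Segments with the same name are concatenated across blocks, so each entry holds the whole aligned sequence including its gap characters.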
The symbols used in the alignment are explained below.
- * = this column of the alignment contains identical amino acid residues in all sequences (or identical bases if DNA sequences are aligned)
- : = this column of the alignment contains different but highly conserved (very similar) amino acids
- . = this column of the alignment contains different amino acids that are somewhat similar
- blank = this column of the alignment contains dissimilar amino acids or gaps (or different bases if DNA sequences are aligned)
- - = a gap in that sequence (no matching residue at this position)
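To illustrate how the `*` symbol relates to the alignment columns, here is a small Python sketch that marks only fully identical columns. It is not Clustal's own implementation: the `:` and `.` symbols depend on Clustal's amino acid similarity groups, which are not reproduced here, so those columns are simply left blank. The function name `identity_line` and the example rows are hypothetical.

```python
def identity_line(aligned):
    """Return a string with '*' under columns where all sequences agree."""
    marks = []
    # zip(*aligned) walks the alignment column by column.
    for column in zip(*aligned):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            marks.append("*")
        else:
            # Differing residues or gaps; ':' and '.' are not computed here.
            marks.append(" ")
    return "".join(marks)


# Hypothetical aligned rows of equal length.
rows = ["MKNT-LLK", "MKNTALLK", "MKNTALIK"]
marks = identity_line(rows)
print(marks)
```

A column containing a gap is never marked, which matches the rule above that blank covers both dissimilar residues and gaps.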