FASTA

FASTA - Definition and Overview

FASTA is a sequence alignment package first described (as FASTP) by David J. Lipman and William R. Pearson in 1985 in the article "Rapid and sensitive protein similarity searches". There are several programs in this package that allow the alignment of protein sequences and DNA sequences.

A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines of text be shorter than 80 characters in length. An example sequence in FASTA format is:

>gi|532319|pir|TVFV2E|TVFV2E envelope protein
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSYSENRT
QIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQKYNLRLRQAWC
HFPSNWKGAWKEVKEEIVNLPKERYRGTNDPKRIFFQRQWGDPETANLWFNCHGEFFYCK
MDWFLNYLNNLTVDADHNECKNTSGTKSGNKRAPGPCVQRTYVACHIRSVIIWLETISKK
TYAPPREGHLECTSTVTGMTVELNYIPKNRTNVTLSPQIESIWAAELDRYKLVEITPIGF
APTEVRRYTGGHERQKRVPFVXXXXXXXXXXXXXXXXXXXXXXVQSQHLLAGILQQQKNL
LAAVEAQQQMLKLTIWGVK

Sequences are expected to be represented in the standard IUB/IUPAC amino acid and nucleic acid codes, with these exceptions: lower-case letters are accepted and are mapped into upper-case; a single hyphen or dash can be used to represent a gap of indeterminate length; and in amino acid sequences, U and * are acceptable letters (see below). Before submitting a request, any numerical digits in the query sequence should either be removed or replaced by appropriate letter codes (e.g., N for unknown nucleic acid residue or X for unknown amino acid residue). The nucleic acid codes supported are:

       A --> adenosine           M --> A C (amino)
       C --> cytidine            S --> G C (strong)
       G --> guanosine           W --> A T (weak)
       T --> thymidine           B --> G T C
       U --> uridine             D --> G A T
       R --> G A (purine)        H --> A C T
       Y --> T C (pyrimidine)    V --> G C A
       K --> G T (keto)          N --> A G C T (any)
                                 -  gap of indeterminate length
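
The ambiguity codes in the table above can be read as a mapping from each letter to the set of bases it may stand for. The following is a minimal Python sketch under that reading. The names IUPAC_NUCLEOTIDES, normalize_nucleotides, and expand are hypothetical, chosen for illustration only, and dropping digits (rather than replacing them with N) is just one of the two options the text mentions.

```python
import re

# Each code maps to the bases it may represent, following the table above.
IUPAC_NUCLEOTIDES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "GA", "Y": "TC", "K": "GT", "M": "AC",
    "S": "GC", "W": "AT", "B": "GTC", "D": "GAT",
    "H": "ACT", "V": "GCA", "N": "AGCT",
}

def normalize_nucleotides(raw):
    """Upper-case the sequence and drop digits and whitespace.

    Raises ValueError if any remaining character is neither a code
    from the table nor the gap symbol "-".
    """
    cleaned = re.sub(r"[\d\s]", "", raw).upper()
    invalid = sorted(set(cleaned) - set(IUPAC_NUCLEOTIDES) - {"-"})
    if invalid:
        raise ValueError(f"unsupported nucleotide codes: {invalid}")
    return cleaned

def expand(code):
    """Return the sorted list of bases a single code can stand for."""
    return sorted(IUPAC_NUCLEOTIDES[code.upper()])
```

With these definitions, normalize_nucleotides("acgt nnry 10") gives "ACGTNNRY", and expand("r") gives ["A", "G"].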

For those programs that use amino acid query sequences (BLASTP and TBLASTN), the accepted amino acid codes are:

   A  alanine                         P  proline
   B  aspartate or asparagine         Q  glutamine
   C  cysteine                        R  arginine
   D  aspartate                       S  serine
   E  glutamate                       T  threonine
   F  phenylalanine                   U  selenocysteine
   G  glycine                         V  valine
   H  histidine                       W  tryptophan
   I  isoleucine                      Y  tyrosine
   K  lysine                          Z  glutamate or glutamine
   L  leucine                         X  any
   M  methionine                      *  translation stop
   N  asparagine                      -  gap of indeterminate length
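
As a rough illustration of how the format rules and the amino acid alphabet fit together, here is a minimal Python sketch that splits FASTA text into records and checks a protein sequence against the codes listed above. parse_fasta and invalid_residues are hypothetical helper names, not part of the FASTA package or of BLAST, and the sketch assumes the whole input fits in memory.

```python
# Amino acid codes from the table above, including "*" (stop) and "-" (gap).
AMINO_ACID_CODES = set("ABCDEFGHIKLMNPQRSTUVWYZX*-")

def parse_fasta(text):
    """Yield (description, sequence) pairs from FASTA-formatted text."""
    description, chunks = None, []
    for line in text.splitlines():
        line = line.rstrip()
        if not line.strip():
            continue  # skip blank lines
        if line.startswith(">"):
            # A ">" in the first column starts a new record.
            if description is not None:
                yield description, "".join(chunks)
            description, chunks = line[1:], []
        elif description is not None:
            # Lower-case letters are accepted and mapped to upper case;
            # whitespace inside a sequence line is dropped.
            chunks.append("".join(line.split()).upper())
        # Lines before the first ">" are ignored in this sketch.
    if description is not None:
        yield description, "".join(chunks)

def invalid_residues(sequence):
    """Return the sorted characters that are not accepted codes."""
    return sorted(set(sequence) - AMINO_ACID_CODES)
```

For example, list(parse_fasta(">demo record\nelrlry\nCAPAGF\n")) gives [("demo record", "ELRLRYCAPAGF")], and invalid_residues("ELRJO1") gives ["1", "J", "O"], since J, O, and digits are not in the table.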

Relevant link

FASTA Format description (http://www.ncbi.nlm.nih.gov/BLAST/fasta.shtml)

This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "FASTA".