
FASTA (Sequence Format)


Prerequisite Terminologies

To understand the main topic thoroughly, you should be familiar with the following terms:

Accession Number
Sequence Retrieval using NCBI


Transcription

By:

Amman Safeer

FASTA (Format)

FASTA is a text-based file format for biological sequences, used to organize and store biological data. It is one of the simplest and most widely used formats in bioinformatics. The biological sequence can consist of either nucleotides or amino acids, each represented by a single-letter code. The format also allows sequence names and comments to precede the sequences. The first line holds the description of the sequence, and the sequence itself begins on the second line.
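Because both nucleotides and amino acids are written as single letters, a program often has to guess which kind of sequence it is looking at. The following is a minimal sketch of one possible heuristic, not a definitive method: it treats a sequence as nucleotide when every letter is an IUPAC nucleotide code. The function name `guess_sequence_type` and the two short example sequences are made up for illustration, and a short protein that happens to use only letters such as A, C, G, and T would be misclassified.

```python
# IUPAC nucleotide codes (bases plus ambiguity codes) and the gap character.
NUCLEOTIDE_CODES = set("ACGTURYSWKMBDHVN-")


def guess_sequence_type(sequence: str) -> str:
    """Heuristic guess: 'nucleotide' if every letter is an IUPAC
    nucleotide code, otherwise 'protein'. This is an assumption-based
    shortcut, not a reliable classifier."""
    letters = set(sequence.upper())
    if not letters:
        raise ValueError("empty sequence")
    return "nucleotide" if letters <= NUCLEOTIDE_CODES else "protein"


# Illustrative, made-up sequences (not real database entries).
print(guess_sequence_type("ATGGAGCTGTCTTGG"))  # nucleotide
print(guess_sequence_type("MELSWHVVFIALLSF"))  # protein
```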

Syntax

The basic syntax of a typical FASTA record is:

>Description of the sequence
Sequence (first row)
Sequence (continued)
Sequence (continued)
...

  • The description always starts with the ‘>’ sign and usually consists of the accession number and the name of the species the sequence comes from.

  • The sequence is written in the single-letter codes for nucleotides or amino acids standardized by IUB/IUPAC.

  • Each row conventionally holds 70 to 80 letters; each letter represents the corresponding nucleotide or amino acid.
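The three rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a full parser: the function names `parse_fasta` and `format_fasta`, the record names `EXAMPLE_1` and `EXAMPLE_2`, and the default row width of 70 (taken from the lower end of the range mentioned above) are all hypothetical choices. For real work, an established library is usually a better option.

```python
from typing import Iterator, List, Optional, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (description, sequence) pairs from FASTA-formatted text."""
    description: Optional[str] = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:].strip()
            chunks = []
        elif description is not None:
            # Sequence rows are joined into one continuous string.
            chunks.append(line)
    if description is not None:
        yield description, "".join(chunks)


def format_fasta(description: str, sequence: str, width: int = 70) -> str:
    """Write one record, wrapping the sequence into rows of `width` letters."""
    rows = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">" + description] + rows) + "\n"


# Made-up records for illustration only.
example = (
    ">EXAMPLE_1 hypothetical nucleotide record\n"
    "ACGTACGT\n"
    "ACGT\n"
    ">EXAMPLE_2 hypothetical protein record\n"
    "MKV\n"
)
for desc, seq in parse_fasta(example):
    print(desc, len(seq))
```

Joining the rows back into a single string keeps the row width a purely cosmetic property of the file, so a sequence reads the same whether it was wrapped at 70 or 80 letters.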


Extension

Like other formats, FASTA has its own filename extensions. Each extension denotes a specific type of sequence, as listed below:


Extension    Meaning

fasta        generic FASTA

fna          FASTA nucleic acid

ffn          FASTA nucleotide of gene regions

faa          FASTA amino acid

frn          FASTA non-coding RNA
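As a small illustration of how the extensions listed here might be used in a script, the sketch below looks up a file's extension and reports what kind of sequence it is expected to hold. The function name `describe_fasta_file` and the file names are hypothetical, and the extension is only a naming convention: it does not guarantee what the file actually contains.

```python
from pathlib import Path

# Extensions and meanings as listed in this section.
FASTA_EXTENSIONS = {
    ".fasta": "generic FASTA",
    ".fna": "FASTA nucleic acid",
    ".ffn": "FASTA nucleotide of gene regions",
    ".faa": "FASTA amino acid",
    ".frn": "FASTA non-coding RNA",
}


def describe_fasta_file(filename: str) -> str:
    """Return the expected content type based on the filename extension."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match
    return FASTA_EXTENSIONS.get(suffix, "unknown (not a listed FASTA extension)")


# Hypothetical file names for illustration.
print(describe_fasta_file("lactase.fna"))   # FASTA nucleic acid
print(describe_fasta_file("proteins.FAA"))  # FASTA amino acid
print(describe_fasta_file("notes.txt"))     # unknown (not a listed FASTA extension)
```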


Summary

In this tutorial, we learned that FASTA is one of the simplest and most widely used text-based sequence formats, and we covered its syntax and filename extensions. We used the mRNA sequence of Homo sapiens lactase (LCT) as the example; you can retrieve any sequence you require in the same way.
