Performing BLAST Tutorial 

BLAST stands for Basic Local Alignment Search Tool. BLAST compares a query sequence against the sequences in a database.

I have divided this tutorial into two parts:

    • Performing BLAST
    • Inferring BLAST Results.

Copy the sequence retrieved in the previous step of the tutorial. If you don’t have it, retrieve the sequence for Swiss-Prot ID P00374, or go directly to this link.

The above sequence is in FASTA format, which is very simple. The first line starts with a “>” sign followed by a one-line description of the protein. The following lines contain the actual sequence.
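To make the format concrete, here is a minimal Python sketch that splits FASTA text into its description line and its sequence. The record in it is a made-up placeholder, not the real P00374 entry, and `parse_fasta` is a hypothetical helper name chosen for this example.

```python
def parse_fasta(text):
    """Split FASTA text into a list of (description, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new ">" line closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence lines are joined, since a sequence may wrap over lines.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Placeholder record for illustration only (not the real P00374 sequence).
example = """>demo_protein hypothetical example record
MKTAYIAKQR
QISFVKSHFS
"""

records = parse_fasta(example)
for description, sequence in records:
    print(description, len(sequence))
```

The sequence is collected from every line after the “>” line, so the sketch also handles records whose sequence is wrapped over several lines.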

The most frequently used site for running BLAST is NCBI. You can find the names of other sites in the Tools section.

Click here to visit NCBI’s BLAST.

The BLAST page opens. It lists the different BLAST programs available, and each program performs a different task.

blastp compares an amino acid query sequence against a protein sequence database
blastn compares a nucleotide query sequence against a nucleotide sequence database
blastx compares a nucleotide query sequence translated in all reading frames against a protein sequence database
tblastn compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames
tblastx compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database
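The choice among these programs depends only on the type of the query, the type of the database, and whether the comparison is made at the protein level. A small Python sketch of that decision follows; `PROGRAMS` and `choose_program` are hypothetical names for this example, and the nucleotide-versus-nucleotide comparison through six-frame translations is the program NCBI calls tblastx.

```python
# (query type, database type, compared as protein?) -> BLAST program
PROGRAMS = {
    ("protein", "protein", True): "blastp",
    ("nucleotide", "nucleotide", False): "blastn",
    ("nucleotide", "protein", True): "blastx",
    ("protein", "nucleotide", True): "tblastn",
    ("nucleotide", "nucleotide", True): "tblastx",
}


def choose_program(query, database, compare_as_protein):
    """Return the BLAST program name for a query/database combination."""
    try:
        return PROGRAMS[(query, database, compare_as_protein)]
    except KeyError:
        raise ValueError(
            f"no BLAST program for {query} query vs {database} database"
        ) from None


# This tutorial has a protein query and searches a protein database.
print(choose_program("protein", "protein", True))
```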

I will focus on protein-protein BLAST (blastp), since we have a protein sequence and blastp compares an amino acid query sequence against protein sequence databases.


The main BLAST page appears. Follow these steps to perform a simple BLAST search.

The first box, labelled Search, is where you paste your sequence. (Paste the above protein sequence in FASTA format.)

You can now run BLAST with the default parameters, but the real work starts here: choosing the parameters below is the trick of the game. I’ll try to explain the basic parameters that you might need to change.

I’ll only focus on some important or specific parameters.

Choose database - This option sets the database you want to search your protein against. The default is the nr database, i.e. the Non-Redundant database. For protein searches it contains the non-redundant GenBank CDS translations together with PDB, Swiss-Prot, PIR and PRF sequences.
This is the most commonly used setting. Change it if you need to search against a particular database.

Now for the options for advanced blasting. As the name suggests, these parameters are meant for advanced use.

Limit by Entrez query - Use this option to restrict your search to a particular organism, e.g. only rat or cow sequences.
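What goes into this box is an Entrez search term. As a hedged illustration, the Python sketch below builds such a term for one or more organisms using the standard Entrez `[Organism]` field tag; `organism_query` is a hypothetical helper, and the species names are simply the scientific names for rat and cow.

```python
def organism_query(*organisms):
    """Build an Entrez term that restricts a search to the given organisms."""
    if not organisms:
        raise ValueError("give at least one organism name")
    terms = [f"{name}[Organism]" for name in organisms]
    return " OR ".join(terms)


# Rat only, then rat or cow.
print(organism_query("Rattus norvegicus"))
print(organism_query("Rattus norvegicus", "Bos taurus"))
```

The resulting text is what you would type into the Limit by Entrez query box.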

Composition-based statistics - This is an advanced procedure that is used by default in PSI-BLAST. We’ll accept the default setting for protein-protein BLAST.

Choose filter - This masks low-complexity regions, or in simpler terms, regions that are not biologically interesting. Leave the default values or change them according to your needs.
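To give a feel for what masking a low-complexity region means, here is a rough Python sketch. It is not the SEG algorithm that NCBI BLAST applies to protein queries; it swaps in a much simpler Shannon-entropy window, and the window length of 12 and the threshold of 2.2 bits are arbitrary choices made for this example. The test sequence is also made up.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (in bits) of the residue composition of a segment."""
    total = len(segment)
    counts = Counter(segment)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mask_low_complexity(sequence, window=12, threshold=2.2):
    """Replace residues in low-entropy windows with 'X'."""
    masked = list(sequence)
    for start in range(len(sequence) - window + 1):
        if shannon_entropy(sequence[start:start + window]) < threshold:
            # Mask every residue covered by the low-entropy window.
            for i in range(start, start + window):
                masked[i] = "X"
    return "".join(masked)


# Made-up sequence: a run of serines between two varied stretches.
demo = "ACDEFGHIKLMN" + "S" * 12 + "ACDEFGHIKLMN"
print(mask_low_complexity(demo))
```

Windows that only partly overlap the repeat can also fall below the threshold, so the masked stretch may extend a little beyond the repeat itself.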

Expect - OK, now this is important. The default is 10, which means that about 10 of the matches reported could be expected to arise by chance alone. If you lower the number, the search becomes more stringent: fewer chance matches are reported, but weak matches that are genuine may be dropped as well. I guess 10 is a pretty OK choice for normal BLAST runs.
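The Expect value comes from the Karlin-Altschul relation E = K * m * n * e^(-lambda * S), where S is the raw alignment score and m and n are the query and database lengths; for a bit score S' it becomes E = m * n * 2^(-S'). The Python sketch below applies it. The lambda and K values, the lengths and the raw scores are illustrative assumptions rather than output from a real search, and real BLAST uses corrected "effective" lengths that this sketch ignores.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expected_hits(bits, query_length, database_length):
    """Number of alignments with at least this bit score expected by chance."""
    return query_length * database_length * 2.0 ** (-bits)


# Illustrative numbers only (assumptions, not taken from a real search).
LAMBDA, K = 0.267, 0.041
QUERY_LENGTH = 200
DATABASE_LENGTH = 1_000_000_000
EXPECT_THRESHOLD = 10  # the default Expect setting

for raw in (40, 60, 80):
    e_value = expected_hits(bit_score(raw, LAMBDA, K), QUERY_LENGTH, DATABASE_LENGTH)
    verdict = "reported" if e_value <= EXPECT_THRESHOLD else "dropped"
    print(f"raw score {raw}: E = {e_value:.3g} ({verdict})")
```

Higher-scoring alignments get smaller E values, so lowering the Expect threshold keeps only the stronger matches.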

Word size - BLAST cuts your query sequence into words of a specific length. The default in NCBI BLAST is 3, and unfortunately you can only change it to 2. If you want to play more with this option, go to EBI BLAST.
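The cutting step itself is easy to picture with a short Python sketch. The function name `words` and the eight-residue query are made up for this example, and the sketch shows only the splitting: BLAST additionally looks for similar "neighborhood" words, which is not modelled here.

```python
def words(sequence, size=3):
    """Return every overlapping word of the given size in the sequence."""
    return [sequence[i:i + size] for i in range(len(sequence) - size + 1)]


query = "MKTAYIAK"  # made-up fragment for illustration
print(words(query, size=3))
print(words(query, size=2))
```

A smaller word size produces more, shorter words, which makes the search more sensitive but gives BLAST more candidate matches to examine.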

Matrix and gap costs - BLAST uses substitution matrices to score an alignment. There are two broadly used families of matrices, PAM and BLOSUM. To study matrices in greater detail, click here.
Gap costs are the penalties subtracted from the alignment score when a gap is opened and extended.
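How the matrix and the gap costs combine into one alignment score can be sketched in Python as follows. The substitution scores here are toy values, not a real PAM or BLOSUM matrix, and the gap costs of 11 for existence and 1 for extension are assumed as a typical pairing for BLOSUM62; all names and the example alignment are hypothetical.

```python
# Toy substitution scores for illustration only; a real search would use a
# full 20x20 matrix such as BLOSUM62.
MATCH_SCORE = 4
MISMATCH_SCORE = -1
SIMILAR_PAIRS = {frozenset("IL"): 2, frozenset("KR"): 2, frozenset("DE"): 2}


def pair_score(a, b):
    """Score one aligned residue pair with the toy matrix."""
    if a == b:
        return MATCH_SCORE
    return SIMILAR_PAIRS.get(frozenset((a, b)), MISMATCH_SCORE)


def alignment_score(top, bottom, gap_open=11, gap_extend=1):
    """Score a gapped alignment; a gap of length k costs gap_open + k * gap_extend."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    gap_row = None  # which row the current gap is in, if any
    for a, b in zip(top, bottom):
        if a == "-" and b == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if a == "-" or b == "-":
            row = "top" if a == "-" else "bottom"
            # Opening a gap pays the existence cost once, then one extension
            # cost per gapped position.
            score -= gap_extend if row == gap_row else gap_open + gap_extend
            gap_row = row
        else:
            score += pair_score(a, b)
            gap_row = None
    return score


print(alignment_score("MKTAYIAK", "MKT--LAK"))
```

In this example the six aligned pairs contribute 22 and the two-residue gap costs 13, so the gap is only worth introducing when the residues around it align well.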

Leave the remaining parameters at their defaults. Then click the BLAST button at the very bottom of the page to run the search.