Copy the sequence retrieved in the previous step of the tutorial. If you don’t have it, look up the sequence
for Swiss-Prot ID P00374 or go directly to this link: http://ca.expasy.org/uniprot/P00374.fas
The above sequence is in FASTA format, which is a very simple format. The first line starts
with a “>” sign followed by a one-line description of the protein. The lines after it hold the actual sequence.
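To make the format concrete, here is a minimal Python sketch that splits a single FASTA record into its description and its sequence. The function name parse_fasta and the example record are assumptions made up for illustration; the record is not the real P00374 entry.

```python
def parse_fasta(text):
    """Return (description, sequence) for a single-record FASTA string."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("a FASTA record must start with '>'")
    # Everything after the '>' on the first line is the description.
    description = lines[0][1:].strip()
    # The sequence may be wrapped over several lines; join them back together.
    sequence = "".join(lines[1:])
    return description, sequence


# Made-up record for illustration only (not the real P00374 sequence).
example = """>demo_protein hypothetical example record
MKTAYIAKQR
QISFVKSHFS
"""

description, sequence = parse_fasta(example)
print(description)
print(sequence)
```

Note that the sequence part is often wrapped over several lines, which is why the sketch joins every line after the header.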
The most frequently used site for running BLAST is NCBI. You can find the names
of other sites in the Tools section.
Click here to visit NCBI’s Blast. http://www.ncbi.nlm.nih.gov/BLAST/
The BLAST page opens. It lists the different BLAST programs available. Each program performs a different task.
blastp compares an amino acid query sequence against a protein sequence database
blastn compares a nucleotide query sequence against a nucleotide sequence database
blastx compares a nucleotide query sequence translated in all reading frames against
a protein sequence database
tblastn compares a protein query sequence against a nucleotide sequence database
dynamically translated in all reading frames
tblastx compares the six-frame translations of a nucleotide query sequence against
the six-frame translations of a nucleotide sequence database
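The list above boils down to a lookup on two questions: what kind of sequence is your query, and what kind of database do you want to search? A small Python sketch of that lookup follows. The table name PROGRAMS and the function choose_programs are hypothetical names for illustration; only the program names themselves come from the list.

```python
# Lookup table built from the program descriptions above:
# (query type, database type) -> BLAST programs that accept that combination.
PROGRAMS = {
    ("protein", "protein"): ["blastp"],
    ("nucleotide", "nucleotide"): ["blastn", "tblastx"],
    ("nucleotide", "protein"): ["blastx"],
    ("protein", "nucleotide"): ["tblastn"],
}


def choose_programs(query_type, database_type):
    """Return the BLAST programs that fit the given query and database types."""
    key = (query_type, database_type)
    if key not in PROGRAMS:
        raise ValueError("types must be 'protein' or 'nucleotide'")
    return PROGRAMS[key]


print(choose_programs("protein", "protein"))
print(choose_programs("nucleotide", "nucleotide"))
```

For a nucleotide query against a nucleotide database there are two candidates: blastn compares the sequences directly, while tblastx compares their six-frame translations.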
I will focus on protein-protein BLAST (blastp), since we have a protein sequence and blastp compares
an amino acid query sequence against protein sequence databases.
The main BLAST page appears. Follow these steps to perform a simple BLAST search.
The first box, which appears under the name Search, is the box where you have
to paste your sequence.
(Paste the above protein sequence in FASTA format.)
Now you can run BLAST with the default parameters, but your real job starts
here, because the parameters given below are the trick of the game. I’ll try to explain the basic parameters that
you might need to change,
focusing only on the important or specific ones.
Choose database - This option sets the database your protein is searched against. The
default is the nr database, i.e., the non-redundant database. For protein searches it contains
the non-redundant GenBank CDS translations plus PDB, Swiss-Prot, PIR and PRF sequences.
This is the most commonly used setting. You can change this option if you
need to search against a
Now for the options for advanced blasting. As
the name suggests, all of these parameters are meant for advanced use.
Limit by Entrez query - If you want to limit your search to a particular
organism, you can do it with this option, e.g., only against rat or cow sequences.
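As a rough illustration of what goes into that box, the Python sketch below builds an organism restriction using the Entrez [Organism] field tag. The helper name organism_filter is hypothetical, and the species names are just the rat and cow examples spelled out as scientific names.

```python
def organism_filter(*organisms):
    """Build an Entrez query string that restricts hits to the given organisms."""
    if not organisms:
        raise ValueError("give at least one organism name")
    # Each name is quoted and tagged with the Entrez [Organism] field.
    terms = ['"{}"[Organism]'.format(name) for name in organisms]
    # Several organisms are combined with OR, so a hit from any of them passes.
    return " OR ".join(terms)


print(organism_filter("Rattus norvegicus"))
print(organism_filter("Rattus norvegicus", "Bos taurus"))
```

The resulting string is what you would paste into the Entrez query field.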
Composition-based statistics - This is an advanced procedure that is used by default
in PSI-BLAST. We’ll accept
the default setting for protein-protein BLAST.
Choose filter - This masks low-complexity regions, or in simpler terms, those
regions which are not biologically
interesting. Leave the default values or change them according to your needs.
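To give a feel for what "low complexity" means, here is a simplified Python sketch that masks windows whose Shannon entropy is low, i.e., windows dominated by one or two residues. This is not the filter NCBI actually runs; the window length, the threshold and the function names are assumptions picked for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mask_low_complexity(sequence, window=12, threshold=2.2):
    """Replace every residue inside a low-entropy window with 'X'.

    window and threshold are illustrative values, not NCBI's settings.
    """
    masked = list(sequence)
    for start in range(len(sequence) - window + 1):
        if shannon_entropy(sequence[start:start + window]) < threshold:
            # The whole window is masked, so a few residues flanking a
            # repetitive stretch can be masked along with it.
            for i in range(start, start + window):
                masked[i] = "X"
    return "".join(masked)


# Made-up sequence with a serine run in the middle.
demo = "ACDEFGHIKLMN" + "S" * 12 + "PQRTVWYACDEF"
print(mask_low_complexity(demo))
```

A sequence shorter than the window is returned unchanged, since no full window can be evaluated.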
Expect - OK, now this is important. The default is 10, which means that
about 10 of the reported matches
are expected to arise by chance alone. Lowering the number makes the search more stringent: fewer chance
matches are reported, but weak matches that are real may be dropped as well. I guess 10 is a pretty OK choice
for normal BLAST runs.
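To see where such a number comes from, the Python sketch below uses the standard relation between a bit score S' and the expect value, E = m * n * 2^(-S'), where m is the query length and n is the database length. It is a simplification: BLAST itself uses corrected "effective" lengths. The function names and the lengths in the example are assumptions for illustration.

```python
def expect_value(bit_score, query_length, database_length):
    """Expected number of chance hits scoring at least bit_score."""
    return query_length * database_length * 2.0 ** (-bit_score)


def passes_threshold(bit_score, query_length, database_length, expect=10.0):
    """True if a hit would be reported at the given Expect setting."""
    return expect_value(bit_score, query_length, database_length) <= expect


# Hypothetical numbers: a 200-residue query against a database of 10**9 residues.
for score in (30, 40, 50):
    e = expect_value(score, 200, 10**9)
    print(score, e, passes_threshold(score, 200, 10**9))
```

Each extra bit of score halves the expect value, which is why a modest rise in score moves a hit from "probably chance" to "probably real".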
Word size - BLAST cuts your query sequence into words of a specific length.
3 is the default in NCBI BLAST,
and unfortunately you can only change it to 2. If you want to play more with this option, go to EBI BLAST.
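Cutting a query into words can be sketched in a few lines of Python. The function name split_into_words and the short query are made up for illustration, and the sketch shows only the cutting step, not how BLAST then uses the words to find matches.

```python
def split_into_words(sequence, word_size=3):
    """Return every overlapping word of length word_size in the sequence."""
    if word_size < 1:
        raise ValueError("word_size must be at least 1")
    return [sequence[i:i + word_size]
            for i in range(len(sequence) - word_size + 1)]


query = "MKTAY"  # made-up query fragment
print(split_into_words(query, 3))
print(split_into_words(query, 2))
```

A smaller word size produces more words per query, so the search is more sensitive but has more candidate matches to examine.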
Matrix and gap costs - BLAST uses substitution matrices to score an alignment. There are two broadly
used families of matrices, PAM and BLOSUM. To study matrices in greater detail, click here: http://www.ncbi.nlm.nih.gov/blast/html/sub_matrix.html
Gap costs are the penalties subtracted from the alignment score for opening and extending a gap.
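Here is a small Python sketch of how substitution scores and gap costs combine into one alignment score. The substitution function toy_score is a deliberately crude stand-in (it is neither PAM nor BLOSUM), and the function names are hypothetical. The gap costs of 11 for existence and 1 for extension are assumed here as the usual blastp defaults, so a gap of length k costs 11 + k.

```python
def toy_score(a, b):
    """Toy substitution score: NOT a real PAM or BLOSUM matrix."""
    return 4 if a == b else -1


def score_alignment(aligned_query, aligned_subject, substitution=toy_score,
                    gap_existence=11, gap_extension=1):
    """Score two already-aligned sequences, where '-' marks a gap position."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    open_gap_in = None  # which sequence currently has an open gap, if any
    for a, b in zip(aligned_query, aligned_subject):
        if a == "-" and b == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if a == "-" or b == "-":
            side = "query" if a == "-" else "subject"
            if side != open_gap_in:
                score -= gap_existence  # paid once when a gap opens
                open_gap_in = side
            score -= gap_extension      # paid for every gap position
        else:
            score += substitution(a, b)
            open_gap_in = None
    return score


print(score_alignment("MKT-AY", "MKTLAY"))
print(score_alignment("MK--AY", "MKTLAY"))
```

With these costs, one long gap is much cheaper than several short ones, which pushes the aligner toward a few contiguous gaps.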
Leave the rest of the parameters at their defaults. Then click on BLAST at the very bottom
of this page to run the search.