A guide to sequence similarity searching using BLAST and other tools.

The NCBI has a detailed description of all the statistical properties of BLAST scores.

NCBI's Statistics of Sequence Similarity Scores

This guide outlines the statistics in a BLAST report that are useful to know when evaluating the hits returned by a search.

**Expectation value.** The E value is the number of alignments with a score equal to or better than the BLAST-calculated raw score S that are expected to occur in a database search by chance. The lower the E value, the more significant the score.
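As a rough illustration of how an E value depends on the score and the search space, the sketch below uses the Karlin-Altschul relation E = K·m·n·e^(−λS) described in the NCBI statistics page, where m is the query length and n is the database length. The function name `e_value` and the values of `lam` and `k` are assumptions chosen for illustration; BLAST derives λ and K from the scoring system and applies length corrections that are not modeled here.

```python
import math


def e_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance alignments scoring at least raw_score.

    Implements E = K * m * n * exp(-lambda * S). lam and k are
    scoring-system parameters; the values used below are illustrative
    assumptions, not values read from a real BLAST report.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Hypothetical search: a 250-residue query against a 1e9-residue database.
LAM, K = 0.267, 0.041  # assumed parameters for illustration only

for s in (50, 100, 150):
    print(f"S = {s:3d}  E = {e_value(s, 250, 1e9, LAM, K):.3g}")
```

For a fixed query and database, E falls exponentially as S rises, which is why a small E value indicates a more significant hit.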

**Raw Score.** The score of an alignment, S, calculated as the sum of substitution and gap scores.

Substitution scores are given by a look-up table (such as PAM or BLOSUM). Gap scores are typically calculated from two parameters: G, the gap opening penalty, and L, the gap extension penalty. A gap of length n costs G + Ln. The choice of G and L is empirical, but it is customary to choose a high value for G (10-15) and a low value for L (1-2).
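The raw score calculation can be sketched in a few lines of Python. This is a minimal illustration, not BLAST's implementation: the names `gap_cost`, `toy_substitution`, and `raw_score` are hypothetical, the match/mismatch values (+5/−4) stand in for a real PAM or BLOSUM table, and the defaults G = 11 and L = 1 are simply picked from the customary ranges mentioned above.

```python
def gap_cost(n, gap_open=11, gap_extend=1):
    """Cost of one gap of length n: G + L*n."""
    if n < 1:
        raise ValueError("gap length must be at least 1")
    return gap_open + gap_extend * n


def toy_substitution(a, b):
    """Stand-in for a PAM/BLOSUM look-up: +5 for a match, -4 otherwise."""
    return 5 if a == b else -4


def raw_score(row_a, row_b, substitution=toy_substitution,
              gap_open=11, gap_extend=1):
    """Sum substitution scores and subtract gap costs for two aligned rows.

    Rows must have equal length and use '-' for gap positions.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    score = 0
    gap_row = None  # row holding the current gap (0 or 1), or None
    gap_len = 0     # length of the gap currently being extended
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            raise ValueError("a column cannot be a gap in both rows")
        current = 0 if a == "-" else 1 if b == "-" else None
        # A gap ends when we reach a residue pair or the gap switches rows.
        if current != gap_row and gap_len:
            score -= gap_cost(gap_len, gap_open, gap_extend)
            gap_len = 0
        gap_row = current
        if current is None:
            score += substitution(a, b)
        else:
            gap_len += 1
    if gap_len:  # alignment ended inside a gap
        score -= gap_cost(gap_len, gap_open, gap_extend)
    return score


# 6 matches * 5 minus one gap of length 2 (11 + 1*2) gives 17.
print(raw_score("ACGTTACG", "ACG--ACG"))
```

Because the opening penalty is charged once per gap, a single gap of length 2 (cost 13 here) is cheaper than two separate gaps of length 1 (cost 24), which is the effect a high G and low L are meant to produce.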

- Last Updated: Apr 13, 2022 10:38 AM
- URL: https://libguides.galter.northwestern.edu/sequence-similarity