Sequence Similarity Searching

A guide to sequence similarity searching using BLAST and other tools.

Results Format

At the top of the results page is a summary of your query, along with a link to go back to edit your search or save it to your My NCBI account.

In a box on the top right side of the results, you can filter by organism or by minimum percent identity, expect value, or query sequence coverage.

If you performed a PSI-BLAST, you will also have the option to set parameters for the next iteration of the PSI-BLAST.

Just above the main results section are three links (Other reports) to view a distance tree of results, multiple alignment, or a dynamic graphic view in NCBI's MSA Viewer.

The main section of the BLAST results page is organized by tabs:

  • A list of descriptions of the retrieved protein results, with their names and NCBI accession numbers, along with their percent identity and other match scores.

  • There are checkboxes to the left of retrieved protein matches that you can use to select them for download.
    • Hint: click on any description and you will be taken to the alignment between your query sequence and the target match.
  • If you ran a PSI-BLAST, these same checkboxes are automatically selected for the highest scoring hits that will be used to build a position-specific scoring matrix (PSSM) for the next iteration of PSI-BLAST. You can uncheck any of these hits (especially if there are many redundant results).

Additionally, if you have some low-similarity matches, a cutoff line in the listed results shows where matches stop meeting the expect value (E-value) threshold.

Note: even if some matches do not meet the E-value cutoff, you can still use them for PSI-BLAST iterations or for multiple sequence alignment. This is especially useful if you are searching for orthologs to a conserved domain or fold, which may have low sequence identity but high functional or structural similarity.
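As an illustration of how the cutoff line divides the list, here is a minimal Python sketch that splits hits into those that meet an E-value threshold and those that do not. The Hit type, the split_by_threshold function, the accessions, and the threshold of 0.005 are all assumptions made up for this example; check the threshold used in your own search.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """A hypothetical, simplified BLAST hit: just an accession and an E-value."""
    accession: str
    evalue: float


def split_by_threshold(hits, threshold=0.005):
    """Return (better, worse): hits with E-value <= threshold, then the rest.

    Lower E-values are more significant, so hits are sorted ascending.
    The default threshold is an illustrative value, not a recommendation.
    """
    ordered = sorted(hits, key=lambda h: h.evalue)
    better = [h for h in ordered if h.evalue <= threshold]
    worse = [h for h in ordered if h.evalue > threshold]
    return better, worse


# Invented example data, not real accessions or results.
hits = [Hit("HYPO_0001", 3e-50), Hit("HYPO_0002", 0.2), Hit("HYPO_0003", 1e-4)]
better, worse = split_by_threshold(hits)
print("Meet threshold:", [h.accession for h in better])
print("Do not meet threshold:", [h.accession for h in worse])
```

Hits in the second group are the ones listed after the cutoff line; as noted above, they can still be worth keeping for later analysis.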


  • A graphic summary of aligned results.
    • If there are any conserved domains or protein superfamilies in your query or results, the top of the graphical display will show where those domains are in your query sequence.
    • In the graphical alignment, colors indicate the alignment score of each match (hotter colors = higher scores).

  • Alignments of the matches

  • Taxonomy
    • The Taxonomy tab has three separate tabs/buttons to display results by lineage, organism, or taxonomy.

Reformatting or Exporting BLAST Results - Many Options

On your BLAST results page, you can download the BLAST alignment in many formats (FASTA, Hit table, Text, XML, ASN.1, CSV).
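Once downloaded, a hit table in CSV form can be filtered outside the browser. The sketch below is one possible approach, assuming the file has no header row and follows the common 12-column BLAST tabular order (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score). That column order, the function names, the cutoffs, and the sample rows are assumptions for illustration; verify the columns against your own download.

```python
import csv
import io

# Assumed column order; confirm it matches your downloaded file.
COLUMNS = [
    "query", "subject", "pct_identity", "align_len", "mismatches", "gap_opens",
    "q_start", "q_end", "s_start", "s_end", "evalue", "bit_score",
]


def read_hits(handle):
    """Parse hit-table rows into dictionaries, skipping blanks and comments."""
    hits = []
    for row in csv.reader(handle):
        if not row or row[0].startswith("#") or len(row) < len(COLUMNS):
            continue
        hit = dict(zip(COLUMNS, row))  # any extra trailing columns are ignored
        hit["pct_identity"] = float(hit["pct_identity"])
        hit["evalue"] = float(hit["evalue"])
        hits.append(hit)
    return hits


def filter_hits(hits, min_identity=30.0, max_evalue=1e-5):
    """Keep hits that pass both illustrative cutoffs."""
    return [
        h for h in hits
        if h["pct_identity"] >= min_identity and h["evalue"] <= max_evalue
    ]


# Invented rows standing in for a downloaded file.
SAMPLE = """\
Query_1,HYPO_0001,98.50,200,3,0,1,200,1,200,1e-120,410
Query_1,HYPO_0002,24.10,180,120,6,10,185,5,180,0.8,30.1
"""

hits = read_hits(io.StringIO(SAMPLE))
kept = filter_hits(hits)
print([h["subject"] for h in kept])
```

To use a real file, replace io.StringIO(SAMPLE) with open("your_file.csv", newline=""), where the file name is whatever you saved the download as.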

If you want to run your own multiple sequence alignment from a subset of your results:

  • Click checkboxes next to the results in the Descriptions list that you want to align
  • Click the Download option at the top of the Descriptions list and select your preferred format: FASTA, FASTA aligned, XML, etc.
  • If you want to use a multiple sequence alignment tool other than those offered at NCBI, choose FASTA (complete sequence).
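Before passing a downloaded FASTA file to another alignment tool, it can help to confirm that it contains the sequences you selected. The following is a minimal sketch of a FASTA reader in plain Python; the read_fasta function and the sample records are invented for illustration, and a dedicated library may be a better choice for real work.

```python
import io


def read_fasta(handle):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them later.
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


# Invented records standing in for a downloaded file.
SAMPLE = """\
>HYPO_0001 invented protein A
MKTAYIAKQR
QISFVKSHFS
>HYPO_0002 invented protein B
MKLVINGKTL
"""

records = read_fasta(io.StringIO(SAMPLE))
for header, sequence in records.items():
    # Print the identifier (first word of the header) and the sequence length.
    print(header.split()[0], len(sequence))
```

If the number of records matches the number of checkboxes you selected, the file is ready to use as input for the alignment tool of your choice.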