CLASSEQ
Classification of Multiple Sequences
via Comparative Analysis of Multiple Genomes

CLASSEQ (Classification of Sequences via Comparative Analysis of Multiple Genomes) is a web server where users can submit uncharacterized protein sequences and analyze them in comparison with protein sequences from multiple genomes of their choice.

  1. The input sequences are clustered, together with protein sequences from multiple genomes freely selected by the user, using our sequence clustering algorithm BAG.
  2. The clustering analysis uses the genome-to-genome all-pairwise comparison data stored in our database (PCDB), so the service is fast enough to be provided on the web even though the analysis involves tens of thousands of sequences. The clusters that contain the input sequences can then be characterized on the web by domain search (Conserved Domain Database (CDD), Protein Family Database (PFAM), and the PROSITE database), multiple sequence alignment using CLUSTALW, and phylogenetic tree analysis.
  3. Links to NCBI's protein information (Entrez Protein) are also provided. We believe that this web server will be a useful resource for characterizing proteins of unknown function via comparative genomics.
  4. CLASSEQ is available at http://platcom.informatics.indiana.edu/CLASSEQ/
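
The workflow in items 1 and 2 can be illustrated with a small sketch. It is not BAG itself: as a simplifying assumption, it groups sequences by plain single-linkage clustering (connected components) over stored pairwise scores, then keeps only the clusters that contain an input sequence. The function name `cluster_with_inputs`, its parameters, the score cutoff, the convention that a higher score means greater similarity, and all of the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def cluster_with_inputs(pairwise_hits, selected_genomes, genome_of, input_ids, cutoff):
    """Single-linkage clustering over stored pairwise scores (NOT the BAG algorithm).

    pairwise_hits:    iterable of (seq_a, seq_b, score); higher score = more similar
                      (an assumption made for this sketch).
    selected_genomes: set of genome names chosen by the user.
    genome_of:        maps database sequence IDs to genome names.
    input_ids:        IDs of the submitted, uncharacterized sequences.
    Returns the clusters that contain at least one input sequence.
    """
    input_ids = set(input_ids)
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def in_scope(seq):
        # A sequence takes part if it is an input or belongs to a selected genome.
        return seq in input_ids or genome_of.get(seq) in selected_genomes

    for seq in input_ids:
        find(seq)  # register inputs so that unmatched inputs appear as singletons

    for a, b, score in pairwise_hits:
        if score >= cutoff and in_scope(a) and in_scope(b):
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for seq in list(parent):
        clusters[find(seq)].add(seq)

    kept = [sorted(members) for members in clusters.values() if members & input_ids]
    return sorted(kept, key=lambda cluster: cluster[0])


# Hypothetical toy data standing in for stored pairwise comparison results.
genome_of = {"ecoli_1": "genome_A", "ecoli_2": "genome_A",
             "bsub_1": "genome_B", "hpyl_1": "genome_C"}
pairwise_hits = [
    ("query_1", "ecoli_1", 180.0),
    ("ecoli_1", "bsub_1", 150.0),
    ("ecoli_2", "bsub_1", 40.0),    # below the cutoff, ignored
    ("query_1", "hpyl_1", 200.0),   # genome_C was not selected, ignored
    ("query_2", "ecoli_2", 30.0),   # below the cutoff, query_2 stays a singleton
]
result = cluster_with_inputs(pairwise_hits, {"genome_A", "genome_B"},
                             genome_of, {"query_1", "query_2"}, cutoff=100.0)
print(result)  # [['bsub_1', 'ecoli_1', 'query_1'], ['query_2']]
```

Because the pairwise scores are looked up rather than recomputed, the only work done per request in this sketch is filtering the stored hits and merging sets, which is the general reason a design of this kind can respond quickly.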
Web Services
  1. Clustering Web Services
  2. Help & Instruction
  3. Test Dataset
  4. Pairwise Comparison DB
Acknowledgment

This project is supported by NSF CAREER Award DBI-0237901, INGEN (Indiana Genomics Initiatives), and the AVIDD (Analysis and Visualization of Instrument-Driven Data) Linux cluster.