CLASSEQ: Classification of Multiple Sequences
via Comparative Analysis of Multiple Genomes
CLASSEQ (Classification of Sequences via Comparative Analysis of Multiple Genomes) is
a web server where users can submit uncharacterized protein sequences and analyze them
in comparison with protein sequences from multiple genomes of the user's choice.
The input sequences are clustered, together with the protein sequences from the
genomes selected by the user, using our sequence clustering algorithm BAG.
The clustering analysis uses the genome-to-genome all-pairwise comparison
data in our database (PCDB), so the service is fast enough to be provided on the web
even though the analysis involves tens of thousands of sequences. The clusters containing
the input sequences can be characterized on the web by domain search
(Conserved Domain Database (CDD), Protein Family Database (PFAM), and PROSITE database),
multiple sequence alignment using CLUSTALW, and phylogenetic tree analysis.
Links to NCBI protein information (Entrez Protein) are also provided.
We believe that this web server will be a useful resource for characterizing proteins of
unknown function via comparative genomics.
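The benefit of precomputed pairwise comparison data can be illustrated with a minimal
sketch. This is not the BAG algorithm and does not reflect the actual PCDB schema: it
assumes a hypothetical dictionary of precomputed similarity scores keyed by sequence
pairs, a hypothetical score cutoff, and a plain connected-components grouping computed
with union-find. All names and values are invented for illustration.

```python
from collections import defaultdict


def cluster_sequences(sequence_ids, pairwise_scores, cutoff):
    """Group sequences into connected components of a similarity graph.

    sequence_ids    -- IDs of the input sequences plus the selected genome proteins
    pairwise_scores -- hypothetical stand-in for precomputed comparison data:
                       {(id_a, id_b): similarity_score}
    cutoff          -- hypothetical minimum score for linking two sequences
    """
    parent = {seq: seq for seq in sequence_ids}

    def find(x):
        # Follow parent links to the component root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair that is in the selected set and scores at or above the cutoff.
    for (a, b), score in pairwise_scores.items():
        if a in parent and b in parent and score >= cutoff:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = defaultdict(list)
    for seq in sequence_ids:
        clusters[find(seq)].append(seq)
    return sorted(sorted(members) for members in clusters.values())


# Invented example: genome-to-genome scores are looked up, not recomputed.
precomputed = {
    ("gA_p1", "gA_p2"): 250,
    ("gA_p2", "gB_p1"): 180,
    ("gB_p2", "gB_p3"): 40,
}
# Assumption: only comparisons involving the submitted sequence are new work.
query_hits = {("query1", "gA_p1"): 300}

ids = ["query1", "gA_p1", "gA_p2", "gB_p1", "gB_p2", "gB_p3"]
clusters = cluster_sequences(ids, {**precomputed, **query_hits}, cutoff=100)
for members in clusters:
    print(members)
```

In this sketch the cost at request time is dominated by a lookup over stored scores,
which is the reason a precomputed table keeps the analysis responsive as the number of
selected genomes grows.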
This project is supported by NSF CAREER Award DBI-0237901, INGEN (Indiana Genomics
Initiatives), and the AVIDD (Analysis and Visualization of Instrument-Driven Data)
Linux cluster.