Species identification using DNA barcodes can be formulated as follows: given a reference library of DNA barcode sequences from specimens of known species, and an unknown DNA barcode sequence, assign the unknown sequence to a species that is present in the library. Several kinds of methods have been developed and adopted to classify a DNA barcode sequence automatically into a predefined species, including tree-based methods, similarity-based methods and diagnostic methods. Each kind of method has its own advantages and disadvantages.

SPIDBAR performs species identification from DNA barcodes using the Random Forest (RF) methodology. A feature vector is first built for each sequence from the frequencies of k-mers of different sizes, and the RF supervised learning approach is then used for classification.

To run the server, the user has to provide a set of reference sequences with known species labels (in BOLD format) and the query sequences with hypothetical labels (in BOLD format). At least two query sequences must be provided to run SPIDBAR.
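The k-mer feature vector described above can be illustrated with a short sketch. This is not SPIDBAR's own code: the function names `kmer_frequencies` and `feature_vector`, the choice of k = 1, 2, 3, the normalisation to relative frequencies, and the skipping of windows that contain ambiguous bases are all assumptions made for illustration.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: only unambiguous nucleotides are counted


def kmer_frequencies(sequence, k):
    """Return relative frequencies of all 4**k k-mers, in lexicographic order."""
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    # Slide a window of length k over the sequence.
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def feature_vector(sequence, k_values=(1, 2, 3)):
    """Concatenate k-mer frequency vectors for several k (hypothetical sizes)."""
    features = []
    for k in k_values:
        features.extend(kmer_frequencies(sequence, k))
    return features


vec = feature_vector("ACGTACGTNACGT")
print(len(vec))  # 4 + 16 + 64 = 84 features
```

Vectors of this kind, computed for the reference sequences together with their species labels, could then be used to train a Random Forest classifier (for example scikit-learn's `RandomForestClassifier`), and the vectors of the query sequences passed to it for prediction.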