Species Identification using DNA Barcode




    The problem of species identification using DNA Barcode can be formulated as follows: given a reference library of DNA Barcode sequences from specimens of known species and an unknown DNA Barcode sequence, assign the latter to a species that is present in the library. Several methods have been developed to classify a DNA Barcode sequence automatically into a predefined species, including tree-based, similarity-based and diagnostic methods, each with its own advantages and disadvantages. SPIDBAR performs species identification from DNA Barcode sequences using the Random Forest (RF) methodology. A feature vector is first built for each sequence from the frequencies of its k-mers of different sizes, and an RF supervised learning model is then used for classification. To run the server, the user must provide a set of reference sequences with known species labels (in BOLD format) and query sequences with hypothetical labels (in BOLD format). At least two query sequences are required to run SPIDBAR.
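
The k-mer frequency features described above can be illustrated with a minimal Python sketch. It is not SPIDBAR's actual code: the function names, the choice of k = 1, 2, 3, the use of relative (rather than raw) frequencies and the skipping of k-mers that contain ambiguous bases are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumed alphabet; ambiguous bases such as N are skipped


def kmer_frequencies(sequence, k):
    """Relative frequency of every possible k-mer, in a fixed lexicographic order."""
    seq = sequence.upper()
    # Slide a window of length k over the sequence.
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Count only windows made up entirely of A, C, G and T.
    counts = Counter(w for w in windows if set(w) <= set(ALPHABET))
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return [counts[kmer] / total if total else 0.0 for kmer in all_kmers]


def feature_vector(sequence, k_values=(1, 2, 3)):
    """Concatenate the k-mer frequency profiles for several k (hypothetical sizes)."""
    features = []
    for k in k_values:
        features.extend(kmer_frequencies(sequence, k))
    return features


if __name__ == "__main__":
    vec = feature_vector("ACGTACGTTTGA")
    print(len(vec))  # 4 + 16 + 64 = 84 features
```

Vectors built this way for the reference sequences, together with their species labels, could then be used to train a Random Forest classifier (for example scikit-learn's `RandomForestClassifier`, which is an assumption here and not necessarily what the server uses), and the vectors of the query sequences passed to it for prediction.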

