All three datasets were collected from http://dmb.iasi.cnr.it/blog.php and have been used in earlier studies (Van Velzen et al., 2012; Weitschek et al., 2014).
ED-I contains real barcode sequences of different species belonging to the Cypraeidae family, Drosophila, Inga trees, fish, birds, and bats.
ED-II contains real barcode sequences of recently diverged species belonging to the Cypraeidae family, Drosophila, and Inga trees.
SD-I contains 100 sets (100 training sets and 100 test sets) of simulated barcode sequences belonging to 50 species.
For more detail about these datasets, see the original articles:
Van Velzen R, Weitschek E, Felici G, Bakker FT. DNA Barcoding of recently diverged species: relative performance of matching methods. PLoS One. 2012; 7(1): e30490.
Weitschek E, Fiscon G, Felici G. Supervised DNA Barcodes species classification: analysis, comparisons and results. BioData Mining. 2014; 7: 4.