MaLDoSS: A web server for donor splice site prediction
This web server predicts donor splice sites in eukaryotic genes. True and false splice site sequences are first encoded into numeric vectors, and the encoded vectors are then used as input to random forest, a supervised machine learning technique, for prediction. Random forest was implemented on the encoded data sets with the “randomForest” package of R. The model was trained with mtry=50 and ntree=1000, with all other parameters left at their default settings. Three different encoding procedures are used, all based mainly on the concept of adjacent di-nucleotide dependencies in splice site motifs. The parameter settings were the same for all three encoding procedures. At present, the server has been trained on human donor splice sites; it will soon be updated with other eukaryotic species such as cow (Bos taurus).
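To make the idea of encoding sequences through adjacent di-nucleotide dependencies more concrete, the following is a minimal Python sketch. It is not one of the three encoding procedures used by the server; it assumes a simplified scheme in which each adjacent position pair is encoded as the difference between the frequency of its di-nucleotide in the true and in the false training sequences. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"
DINUCLEOTIDES = [a + b for a in NUCLEOTIDES for b in NUCLEOTIDES]


def positional_dinucleotide_freqs(sequences):
    """For each adjacent position pair, return the relative frequency
    of every di-nucleotide observed at that pair across the sequences."""
    length = len(sequences[0])
    freqs = []
    for i in range(length - 1):
        counts = Counter(seq[i:i + 2] for seq in sequences)
        total = sum(counts.values())
        freqs.append({d: counts.get(d, 0) / total for d in DINUCLEOTIDES})
    return freqs


def encode(sequence, true_freqs, false_freqs):
    """Encode one sequence as a numeric vector with one value per
    adjacent position pair (assumed scheme: true minus false frequency)."""
    vector = []
    for i in range(len(sequence) - 1):
        d = sequence[i:i + 2]
        vector.append(true_freqs[i].get(d, 0.0) - false_freqs[i].get(d, 0.0))
    return vector


# Toy sequences, made up for illustration only (not real splice site data).
true_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT"]
false_sites = ["ACGTTCCA", "TTGTCCAG", "GAGTTTCA"]

true_freqs = positional_dinucleotide_freqs(true_sites)
false_freqs = positional_dinucleotide_freqs(false_sites)
encoded = encode("AGGTAAGT", true_freqs, false_freqs)
print(encoded)
```

Vectors of this kind, one per sequence, would then form the rows of the numeric matrix passed to the random forest classifier, together with the true/false labels.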
Please Cite:
Meher, P. K., Sahu, T. K., & Rao, A. R. (2016). Prediction of donor splice sites using random forest with a new sequence encoding approach. BioData Mining, 9(1), 1-25.
Team: Prabina Kumar Meher, Tanmaya Kumar Sahu, A. R. Rao and S. D. Wahi
Contact: meherprabin@yahoo.com, tanmayabioinfo@gmail.com, arrao@iasri.res.in