N6-methyladenine (6mA) is a modified form of the DNA base adenine in which a methyl group is attached to the nitrogen at the sixth position (N6) of the base. This epigenetic modification is found in the genomes of several kingdoms, including bacteria, fungi, plants, and mammals. In contrast to the well-researched 5-methylcytosine (5mC), the function of 6mA in molecular biology is yet to be fully explored. Recent studies suggest that 6mA may affect genome integrity, DNA repair, and transcriptional control. Understanding the functions and mechanisms of 6mA can provide valuable insights into epigenetic regulation and its implications for development, disease, and evolution.
MethSemble-6mA is a 6mA prediction tool developed by combining ensemble machine learning, hybrid feature selection, and bootstrap sampling. The tool uses five feature sets for DNA sequence vectorization: di-nucleotide frequency, GC content, AMIP, mono-nucleotide binary encoding, and nucleotide chemical properties. Nine machine learning models (support vector machine, random forest, k-nearest neighbor, artificial neural network, multiple logistic regression, decision tree, naïve Bayes, AdaBoost, and gradient boosting) were trained on the relevant features selected by the feature selection module. The three best-performing models (gradient boosting, random forest, and SVM) were then combined into a robust ensemble model for predicting sequences with 6mA sites.
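As an illustration of the sequence vectorization step, the following is a minimal sketch of four of the five feature sets (GC content, di-nucleotide frequency, mono-nucleotide binary encoding, and nucleotide chemical properties). It is not the tool's own code: the function names are hypothetical, the binary and chemical-property tables are commonly used encodings assumed here for illustration, and AMIP is omitted because its computation is not described on this page.

```python
from itertools import product

# ASSUMPTION: one-hot order A, C, G, T; the tool's actual order may differ.
BINARY = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0),
          "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}

# ASSUMPTION: a commonly used chemical-property scheme
# (ring structure, functional group, hydrogen bonding).
CHEMICAL = {"A": (1, 1, 1), "C": (0, 1, 0),
            "G": (1, 0, 0), "T": (0, 0, 1)}

# All 16 di-nucleotides in a fixed order: AA, AC, AG, ..., TT.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def gc_content(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def dinucleotide_frequency(seq):
    """Relative frequency of each of the 16 overlapping di-nucleotides."""
    seq = seq.upper()
    total = len(seq) - 1
    if total < 1:
        raise ValueError("sequence must have at least two bases")
    counts = {d: 0 for d in DINUCLEOTIDES}
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with ambiguous bases (e.g. N) are skipped
            counts[pair] += 1
    return [counts[d] / total for d in DINUCLEOTIDES]


def per_base_encoding(seq, table):
    """Concatenate the per-base code from `table`; unknown bases map to zeros."""
    width = len(next(iter(table.values())))
    encoded = []
    for base in seq.upper():
        encoded.extend(table.get(base, (0,) * width))
    return encoded


def vectorize(seq):
    """Concatenate the four illustrative feature sets into one vector."""
    return ([gc_content(seq)]
            + dinucleotide_frequency(seq)
            + per_base_encoding(seq, BINARY)
            + per_base_encoding(seq, CHEMICAL))
```

For a sequence of length n, this sketch yields a vector of 1 + 16 + 4n + 3n values, so all input sequences would need the same length before being passed to a classifier.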
Copyright © 2023 ICAR-Indian Agricultural Statistics Research Institute, New Delhi-110012. All rights reserved.
Rice (Oryza sativa) and Arabidopsis (Arabidopsis thaliana) are considered model plants for monocots and dicots, respectively, and both were used to develop the model. The rice dataset consists of 308,000 sequences, whereas the Arabidopsis dataset consists of 63,746 sequences. The ratio of positive to negative sequences is 1:1.
The RiceLv and Arabidopsis datasets are available for download. The Nipponbare dataset is also available for download.
The example dataset consists of three sequences, including one positive sequence and one negative sequence.
Please download the example file by clicking “Download FASTA File” below:
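Once downloaded, a FASTA file can be read with a few lines of code. The following is a minimal sketch of a FASTA reader, not part of the tool itself; the function name `read_fasta` is hypothetical, and the sketch assumes the standard convention of a header line starting with `>` followed by one or more sequence lines.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a list of (header, sequence) tuples.

    `lines` can be any iterable of strings, such as an open file handle.
    """
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; join them and normalize the case.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

With a file handle, this could be called as `read_fasta(open("example.fasta"))`, where `example.fasta` is a placeholder for whatever name the downloaded file has.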
Ph.D. Scholar
Discipline of Bioinformatics
ICAR-Indian Agricultural Research Institute
Pusa, New Delhi-110012, India.
Contact mail: diprosinha[at]gmail.com
Scientist
ICAR-National Bureau of Fish Genetic Resources
Dilkusha Marg, Lucknow, UP-226002, India.
Scientist
Division of Statistical Genetics
ICAR-Indian Agricultural Statistics Research Institute
Library Avenue, Pusa, New Delhi-110012, India.
Scientist
School of Crop Science
ICAR-Indian Agricultural Research Institute
Hazaribagh, Jharkhand-825109, India.
Scientist
Division of Agricultural Bioinformatics
ICAR-Indian Agricultural Statistics Research Institute
Library Avenue, Pusa, New Delhi-110012, India.
Ph.D. Scholar
Discipline of Molecular Biology and Biotechnology
ICAR-Indian Agricultural Research Institute
Pusa, New Delhi-110012, India.
Senior Scientist
Division of Agricultural Bioinformatics
ICAR-Indian Agricultural Statistics Research Institute
Library Avenue, Pusa, New Delhi-110012, India.
Senior Scientist
Division of Computer Applications
ICAR-Indian Agricultural Statistics Research Institute
Library Avenue, Pusa, New Delhi-110012, India.
Assistant Director General (ICT)
Indian Council of Agricultural Research
Central Secretariat, New Delhi-110001, India.
ICAR-National Fellow and Principal Scientist
ICAR-National Bureau of Plant Genetic Resources
Pusa, New Delhi-110012, India.
Contact Mail: sunil.archak[at]icar.gov.in
Discipline of Bioinformatics, Graduate School
ICAR-Indian Agricultural Research Institute
Pusa, New Delhi-110012
Mail Id: alka.arora@icar.gov.in
ICAR-Indian Agricultural Statistics Research Institute
Pusa, New Delhi-110012
Mail Id: director.iasri@icar.gov.in
Division of Agricultural Bioinformatics
ICAR-Indian Agricultural Statistics Research Institute
Pusa, New Delhi-110012
Mail Id: hd.cabin.iasri@icar.gov.in