Dataset Information: The positive dataset contains full-length sequences annotated as viral suppressors of RNA silencing; the negative dataset contains viral sequences not annotated as viral suppressors of RNA silencing.
Positive Dataset Negative Dataset
Methodology
Best Classifier
The best classifier was implemented using:
# a training dataset with a redundancy threshold of 70%
# 77 optimized features selected from 1537, derived from Amino Acid Composition, Auto-correlation Coefficients, Composition, Transition and Distribution of various physico-chemical properties, Pseudo Amino Acid Composition and Quasi Sequence Order descriptors
# the Random Forest algorithm with 40 trees
# performance: accuracy of 86.11%, balanced accuracy rate of 86.22%, MCC of 0.57 and auROC of 0.95
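As an illustration of the simplest feature family listed above, the sketch below computes an Amino Acid Composition vector: the fraction of each of the 20 standard residues in a sequence. The function name `amino_acid_composition`, the residue ordering and the handling of non-standard characters are assumptions made for this example; the page does not describe how its own features were computed.

```python
from collections import Counter

# The 20 standard amino acids in alphabetical one-letter order (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue.

    Characters outside the 20 standard residues (e.g. 'X', gaps) are ignored,
    which is an illustrative choice, not the documented behaviour of the tool.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call.
    vector = amino_acid_composition("MKTAYIAKQR")
    for aa, frac in zip(AMINO_ACIDS, vector):
        if frac > 0:
            print(aa, round(frac, 3))
```

The other descriptor families (auto-correlation, Composition/Transition/Distribution, pseudo amino acid composition, quasi sequence order) build on physico-chemical property tables and are not reproduced here.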
Statistical Evaluators
In the evaluator formulas, TP denotes true positives, TN true negatives, FN false negatives and FP false positives.
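The evaluator formulas themselves are not reproduced in this text. As a hedged sketch, the function below computes accuracy, balanced accuracy and the Matthews correlation coefficient (MCC) from the four confusion-matrix counts using their standard textbook definitions. It is an assumption that the page uses these same definitions; the function name `evaluate` and the example counts are hypothetical.

```python
import math


def evaluate(tp, tn, fp, fn):
    """Compute standard binary-classification evaluators from confusion counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Balanced accuracy: mean of sensitivity and specificity.
    balanced_accuracy = (sensitivity + specificity) / 2
    # MCC: (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0 if undefined.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the reported results.
    for name, value in evaluate(tp=40, tn=45, fp=5, fn=10).items():
        print(f"{name}: {value:.4f}")
```

On a dataset with many more negatives than positives, accuracy and balanced accuracy can be close while MCC stays noticeably lower, which is one reason to report all three.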