The proposed work covers eight classifiers: Bayesian Network, Naïve Bayes, LibSVM, Logistic Regression, Nearest Neighbor GE, RBF Network, Random Forest, and Simple CART. All algorithms are implemented using WEKA 3.7.
3.1. Evaluation Measures
Different measures (criteria) are available to assess the performance of classification algorithms.
A classification algorithm that performs best on one criterion may not perform best on another. The proposed work uses accuracy, TN rate, TP rate, F-measure, and area under the ROC curve as evaluation criteria to compute the performance score. These criteria are described below.
Accuracy: the proportion of all predictions that were correct.
Accuracy = (TN + TP) / (TP + FP + ...view middle of the document...
A t-test can be used to determine whether two sets of data are significantly different from each other.
The performance score is generated by conducting a paired t-test at significance level 0.05 on the 10-fold cross-validation results of each pair of classifiers, for each criterion and dataset, to indicate whether one classifier is significantly better than the other. The superior classifier in a pair receives +1 and the inferior one receives -1. The sum of a classifier's scores over all datasets is its performance score for that criterion. The same procedure is repeated for every criterion used in this experiment. Performance scores are calculated using WEKA 3.7.
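The scoring procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the WEKA 3.7 computation used in this work: the function names `paired_t_statistic` and `performance_scores` are hypothetical, the test is a plain paired t-test (WEKA's own tester may differ in detail), the two-tailed critical value 2.262 is hardcoded for 9 degrees of freedom (10 folds, significance level 0.05), and a pair with no significant difference is assumed to leave both scores unchanged.

```python
import math
from itertools import combinations

# Two-tailed critical t value for alpha = 0.05 with 9 degrees of freedom
# (10 folds). Hardcoded so the sketch needs no external packages.
T_CRIT_DF9 = 2.262


def paired_t_statistic(a, b):
    """Return the paired t statistic for two equal-length lists of fold scores."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        # Every fold has the same difference: a zero mean is a tie,
        # a non-zero mean is treated as infinitely significant.
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)


def performance_scores(fold_results, t_crit=T_CRIT_DF9):
    """Sum +1/-1 pairwise outcomes over all datasets for one criterion.

    fold_results maps dataset name -> {classifier name: list of fold scores}.
    Assumption: a non-significant pair contributes 0 to both classifiers.
    """
    scores = {}
    for per_classifier in fold_results.values():
        for name in per_classifier:
            scores.setdefault(name, 0)
        for a, b in combinations(per_classifier, 2):
            t = paired_t_statistic(per_classifier[a], per_classifier[b])
            if t > t_crit:  # a is significantly better than b
                scores[a] += 1
                scores[b] -= 1
            elif t < -t_crit:  # b is significantly better than a
                scores[a] -= 1
                scores[b] += 1
    return scores
```

Calling `performance_scores` once per criterion, with the per-fold values of that criterion for every dataset, would yield one column of a table such as Table 1.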
Table 1: Performance score (paired t-test)
Classifier           Overall Accuracy  TP Rate  TN Rate  F-Measure  Area Under ROC
BayesNet             9                 5        5        8          14
Naïve Bayes          -10               -6       9        -7         9
LibSVM               -20               -21      18       -21        -21
Logistic Regression  6                 13       -8       8          13
RBF Network          -2                -8       6        -6         -3
Nearest Neighbor GE  -1                1        -13      3          -15
Random Forest        8                 8        -7       7          7
Simple CART          0                 9        -10      8          -4
Table 2: Classification Results (10-fold cross-validation)
Dataset Name   Classifier  Overall Accuracy  TP Rate  TN Rate  F-Measure  Area Under ROC
German Credit  BayesNet    0.7443            0.4990   0.8494   0.5375     0.7788