3 Results and discussion
Basic statistics of the selected molecular descriptors are provided in Table 2. The standard deviation and coefficient of variation (CoV) values suggest that the selected descriptors exhibited high variability. The CoV values of the descriptors ranged from -63.07 % (PNSA-2) to 155.63 % (RPCS). The electronic descriptors exhibited the highest variability (-63.07 % to 155.63 %), followed by topological (36.71 % to 73.53 %), geometrical (74.57 %) and topological (45.61 % to 65.12 %). The wide variability in the molecular properties of the considered pesticides reveals the importance of the selected descriptors for the proposed QSTR modeling studies.
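As an illustration of how a CoV value is obtained and why it can be negative, the following minimal sketch computes the CoV as 100 × standard deviation / mean. The use of the sample standard deviation and the toy values are assumptions made for illustration only; they are not taken from Table 2.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CoV in percent: 100 * sample standard deviation / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CoV is undefined for a zero mean")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical descriptor values (assumption, not data from the study).
positive_descriptor = [2.0, 4.0, 6.0]
negative_descriptor = [-2.0, -4.0, -6.0]

cov_positive = coefficient_of_variation(positive_descriptor)
cov_negative = coefficient_of_variation(negative_descriptor)
print(cov_positive, cov_negative)
```

Because the standard deviation is never negative, the sign of the CoV follows the sign of the mean, so a descriptor whose values are predominantly negative yields a negative CoV.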
3.1 Qualitative QSTR modeling
A qualitative QSTR model was constructed to categorize the pesticides into toxic and non-toxic (two categories) as well as into highly toxic, moderately toxic, slightly toxic, and non-toxic (four categories). Accordingly, PNN-based QSTR models were established for two- and four-category classification of the considered pesticides using the set of selected molecular descriptors (Table 2). The optimal architecture and the model parameters were determined through 5-fold CV, whereas a sub-set of test data was used for external validation. The classification accuracies obtained in CV of the two- and four-category QSTRs ranged from 70.21 % to 77.08 % and from 74.47 % to 85.94 %, respectively. The results indicate that the toxicity prediction accuracies of the two QSTR models are comparable in two- and four-category classification. The results also show no obvious over-fitting of the data.
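The general idea of tuning a PNN classifier by 5-fold CV can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the Gaussian kernel, the single shared spread `sigma`, the candidate grid, the function names, and the two-descriptor toy data are all hypothetical.

```python
import math
import random


def pnn_predict(train_x, train_y, x, sigma):
    """Classify x with a Gaussian-kernel PNN (one pattern unit per training sample)."""
    sums, counts = {}, {}
    for xi, yi in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        sums[yi] = sums.get(yi, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[yi] = counts.get(yi, 0) + 1
    # Average the kernel activations per class, then take the class with the largest value.
    return max(sums, key=lambda c: sums[c] / counts[c])


def k_fold_accuracies(xs, ys, sigma, k=5, seed=0):
    """Return the held-out classification accuracy of each of the k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    accuracies = []
    for f in range(k):
        held = idx[f::k]
        held_set = set(held)
        train = [i for i in idx if i not in held_set]
        tr_x = [xs[i] for i in train]
        tr_y = [ys[i] for i in train]
        correct = sum(pnn_predict(tr_x, tr_y, xs[i], sigma) == ys[i] for i in held)
        accuracies.append(correct / len(held))
    return accuracies


def mean(values):
    return sum(values) / len(values)


# Hypothetical toy data (assumption): two well-separated classes, two descriptors.
toy_x = [(0.1 * i, 0.1 * i) for i in range(10)] + [(5 + 0.1 * i, 5 + 0.1 * i) for i in range(10)]
toy_y = ["non-toxic"] * 10 + ["toxic"] * 10

# Pick the spread with the best mean CV accuracy from a small candidate grid.
candidates = [0.1, 1.0, 10.0]
best_sigma = max(candidates, key=lambda s: mean(k_fold_accuracies(toy_x, toy_y, s)))
fold_accuracies = k_fold_accuracies(toy_x, toy_y, best_sigma)
print(best_sigma, fold_accuracies)
```

Reporting the per-fold accuracies rather than only their mean corresponds to quoting a range of CV accuracies, as done above for the two- and four-category models.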
The Y-randomization tests were performed for both the two- and four-category classification of pesticides using a 5-fold CV procedure. The average misclassification rates in the two- and four-category QSTRs were 27.51 % and 43.67 %, respectively, which are significantly higher (3.38 % and 8.86 %) than those of the respective original classification QSTRs. This suggests that the original classification QSTR models are relevant and unlikely to have arisen from chance correlation.
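The Y-randomization procedure can be illustrated with the following minimal sketch: the class labels are shuffled repeatedly, the model is re-evaluated by 5-fold CV each time, and the average misclassification rate is compared with that obtained for the true labels. A 1-nearest-neighbor classifier is used here purely as a simple stand-in for the PNN; the function names, the number of shuffling rounds, and the toy data are assumptions.

```python
import random


def nearest_neighbor_predict(train_x, train_y, x):
    """Stand-in classifier (assumption): label of the closest training sample."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]


def cv_misclassification(xs, ys, k=5, seed=0):
    """Return the overall k-fold CV misclassification rate (fraction of samples)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    errors = 0
    for f in range(k):
        held = idx[f::k]
        held_set = set(held)
        train = [i for i in idx if i not in held_set]
        tr_x = [xs[i] for i in train]
        tr_y = [ys[i] for i in train]
        errors += sum(nearest_neighbor_predict(tr_x, tr_y, xs[i]) != ys[i] for i in held)
    return errors / len(xs)


def y_randomization(xs, ys, n_rounds=20, seed=0):
    """Average CV misclassification rate over models built on shuffled labels."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_rounds):
        shuffled = list(ys)
        rng.shuffle(shuffled)  # break the descriptor-to-class relationship
        rates.append(cv_misclassification(xs, shuffled))
    return sum(rates) / len(rates)


# Hypothetical toy data (assumption): two well-separated classes, two descriptors.
toy_x = [(0.1 * i, 0.1 * i) for i in range(10)] + [(5 + 0.1 * i, 5 + 0.1 * i) for i in range(10)]
toy_y = [0] * 10 + [1] * 10

original_rate = cv_misclassification(toy_x, toy_y)
randomized_rate = y_randomization(toy_x, toy_y)
print(original_rate, randomized_rate)
```

A randomized-label error rate well above the original one indicates that the original model captures a real relationship between descriptors and classes rather than a chance correlation.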
3.1.1 Qualitative QSTR (two-category)
The selected four-layered optimal PNN-QSTR model has five neurons in the input layer, 175 neurons in the pattern layer, two neurons in the decision layer, and a single neuron in the output layer. The value of the spread (σ) parameter of the reciprocal function was optimized. Here, a separate σ value was considered for each input variable, and each was searched over the range 0.001 to 10. Selecting a separate σ value for each variable gave a somewhat better model than using a single σ value for the whole model. The optimal σ values for the considered input variables ranged from 0.025 to 0.065. The optimal PNN-QSTR model was applied to the test and complete data arrays. The discriminating descriptors for two-category classification of pesticides in the QSTR model were identified on the basis of their importance in the corresponding model. The contribution of the...