**Statistical approach to pattern recognition. Theory and practical solution by means of PREDITAS system.**
*(English)*
Zbl 0752.68070

Kybernetika 27, Suppl. No. 1-6, 78 pp. (1991).

Classification is treated as the primary goal of pattern recognition. A number of related problems require careful attention in real-life applications: evaluating the quality of the training set, feature selection and dimensionality reduction, estimating the classification error, iteratively correcting individual phases of the solution according to the results of testing, and finally linking feature selection to the classifier design as closely as possible. An attempt to provide a comprehensive solution to all these interconnected problems has resulted in the design of the PREDITAS (Pattern REcognition and DIagnostic TAsk Solver) software package. It combines theoretically based and heuristic procedures and incorporates, as far as possible, the requirements and suggestions of specialists from various application fields. The theoretical background of the employed methods and algorithms is presented, together with their rationale and some examples of applications.

Reviewer: G. Stanke (Berlin)

### MSC:

68T10 | Pattern recognition, speech recognition |

62H30 | Classification and discrimination; cluster analysis (statistical aspects) |

### Keywords:

statistical pattern classification; Bayes rule; linear classification; searching methods; feature selection

### Software:

PREDITAS