zbMATH — the first resource for mathematics

Algorithmic rules extraction for complex nominal medical data mining. (English) Zbl 1031.68991
Summary: Most machine learning algorithms process numerical or nominal data, the latter usually in the simple form of a single name. Well-known algorithms of this kind are those for top-down induction of decision trees, which are also suitable for data mining. These algorithms output decision trees expressed in symbols; the rules derived from these trees are therefore also full of symbols. Health professionals, especially doctors, are not familiar with such notation, because they regard it as a technical matter that they cannot cope with. As a result, the output of these algorithms is not easily understood by these users. Machine learning rules expressed in natural language are better suited to health professionals and are readily welcomed by them.
In this work, the algorithmic extraction of rules from decision trees composed of complex nominal medical expressions is examined. Special implementations of classification algorithms, fed with data that are also expressed in natural language, can produce these trees. For complex nominal medical data, the rule-extraction algorithm could be used for data mining together with a machine learning algorithm capable of producing decision trees whose nodes, leaves and values are likewise in complex nominal form. For this purpose, the machine learning classification algorithm ID3 was implemented in Prolog so that it can handle data in complex nominal form.
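
The rule-extraction idea described above can be illustrated with a minimal sketch. It is written in Python rather than the Prolog used in the work, and it is not the authors' algorithm: it only assumes that each root-to-leaf path of a decision tree becomes one rule, and that attributes, values and class labels are free-text phrases. The names `Leaf`, `Node`, `extract_rules` and the example tree are hypothetical and invented for illustration; they are not data or identifiers from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Leaf:
    label: str  # class label as a natural-language phrase


@dataclass
class Node:
    attribute: str  # attribute tested at this node, as a phrase
    # Maps each attribute value (a phrase) to the subtree for that value.
    branches: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def extract_rules(tree: Union[Node, Leaf],
                  conditions: Optional[List[str]] = None) -> List[str]:
    """Return one IF ... THEN ... sentence per root-to-leaf path."""
    conditions = conditions or []
    if isinstance(tree, Leaf):
        if not conditions:
            # Degenerate tree: a single leaf with no tests above it.
            return [f"The outcome is always {tree.label}."]
        return [f"IF {' AND '.join(conditions)} "
                f"THEN the outcome is {tree.label}."]
    rules: List[str] = []
    for value, subtree in tree.branches.items():
        # Extend the path with the test taken on this branch.
        rules.extend(
            extract_rules(subtree, conditions + [f"{tree.attribute} is {value}"])
        )
    return rules


# Hypothetical tree with complex nominal values (assumption, not from the paper).
example_tree = Node("body temperature", {
    "above normal": Node("cough", {
        "persistent and dry": Leaf("suspected respiratory infection"),
        "absent": Leaf("further examination needed"),
    }),
    "within normal range": Leaf("no action required"),
})

for rule in extract_rules(example_tree):
    print(rule)
```

Because the phrases are carried through unchanged, the resulting rules read as plain sentences rather than symbolic expressions, which is the property the summary argues health professionals prefer.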
68U99 Computing methodologies and applications
68T50 Natural language processing
68P99 Theory of data