zbMATH — the first resource for mathematics

POST: Using probabilities in language processing. (English) Zbl 0751.68061
Artificial intelligence, IJCAI-91, Proc. 12th Int. Conf., Sydney/Australia 1991, 960-965 (1991).
[For the entire collection see Zbl 0741.68016.]
The paper reports an application of probabilistic models to language processing, namely the assignment of parts of speech to words in open text, tested with the system POST (Part Of Speech Tagger). Three main topics are addressed, each reporting improvements over existing models:
(1) Experiments were run to determine the amount of training data needed when moving to a new domain (and thus handling unknown words).
(2) To limit the size of the training set, a probabilistic model of word features for handling unknown words is uniformly integrated within the overall probabilistic model, and its contribution is measured.
(3) The forward-backward algorithm is applied to accurately compute the most likely tag set.
The paper describes the algorithms used in the POST system, the extensions, and the results of the experiments.
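As an illustration of topic (3), the following is a minimal sketch of the forward-backward algorithm on a toy hidden Markov model tagger. It is not the POST implementation: the two tags, the two-word vocabulary, all probabilities, the function name forward_backward and the 0.25 threshold are assumptions chosen for illustration only. The sketch computes, for each word, the posterior probability of every tag given the whole sentence, then reports the single most likely tag and the set of tags whose posterior exceeds the threshold.

```python
def forward_backward(words, tags, start, trans, emit):
    """Return per-word posteriors P(tag at position i | all words) for an HMM."""
    n = len(words)

    # Forward pass: alpha[i][t] = P(words[0..i], tag_i = t)
    alpha = [{} for _ in range(n)]
    for t in tags:
        alpha[0][t] = start[t] * emit[t].get(words[0], 0.0)
    for i in range(1, n):
        for t in tags:
            incoming = sum(alpha[i - 1][s] * trans[s][t] for s in tags)
            alpha[i][t] = emit[t].get(words[i], 0.0) * incoming

    # Backward pass: beta[i][t] = P(words[i+1..n-1] | tag_i = t)
    beta = [{} for _ in range(n)]
    for t in tags:
        beta[n - 1][t] = 1.0
    for i in range(n - 2, -1, -1):
        for t in tags:
            beta[i][t] = sum(
                trans[t][s] * emit[s].get(words[i + 1], 0.0) * beta[i + 1][s]
                for s in tags
            )

    # Total probability of the sentence under the model
    total = sum(alpha[n - 1][t] for t in tags)
    if total == 0.0:
        raise ValueError("sentence has zero probability under the model")

    return [{t: alpha[i][t] * beta[i][t] / total for t in tags} for i in range(n)]


# Toy model (all numbers are illustrative assumptions, not from the paper).
TAGS = ["N", "V"]
START = {"N": 0.7, "V": 0.3}
TRANS = {"N": {"N": 0.4, "V": 0.6}, "V": {"N": 0.7, "V": 0.3}}
EMIT = {"N": {"time": 0.6, "flies": 0.4}, "V": {"time": 0.2, "flies": 0.8}}

sentence = ["time", "flies"]
posteriors = forward_backward(sentence, TAGS, START, TRANS, EMIT)

# Single most likely tag per word, and the set of tags above a threshold.
THRESHOLD = 0.25  # hypothetical cut-off for keeping a tag in the set
best_tags = [max(p, key=p.get) for p in posteriors]
tag_sets = [{t for t, prob in p.items() if prob >= THRESHOLD} for p in posteriors]

for word, p, best, kept in zip(sentence, posteriors, best_tags, tag_sets):
    rounded = {t: round(prob, 3) for t, prob in p.items()}
    print(word, rounded, "best:", best, "set:", sorted(kept))
```

Unlike a single best path through the sentence, the per-word posteriors make it possible to return more than one tag where the model is uncertain, which is one plausible reading of "most likely tag set".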
Reviewer: N.Curteanu (Iaşi)
68T10 Pattern recognition, speech recognition