×

Regular expressions into finite automata. (English) Zbl 0811.68096

It is well-known that the regular expressions can be translated into non- deterministic finite automata \((NFA)\) with or without \(\lambda\)- transitions. The standard method first translate a regular expression, in linear time, into an NFA with \(\lambda\)-transitions and then the \(\lambda\)-transitions are eliminated in quadratic time. There is another method, proposed by Glushkov and based on the fact that, if a word is denoted by an expression, it must be possible to spell out that word by tracing an appropriate “path” through the expression. It was shown that the Glushkov’s method can be described in terms of Brzozowski derivatives and on the other hand this method plays an important role in document processing.
In this paper the author shows that the Glushkov automaton associated to a regular expression can be constructed in quadratic time in the size of the expression, and in the case of deterministic expressions the construction can be done in linear time. The result is obtained first for regular expressions in star normal form (also introduced in this paper) and then it is translated to the general case. The paper is directed then on the unambiguity of the regular expressions and it is shown that the weak unambiguity can be tested in quadratic time improving the methods known until now. Finally, some interesting open problems are mentioned.

MSC:

68Q45 Formal languages and automata

Software:

AWK
PDF BibTeX XML Cite
Full Text: DOI

References:

[1] Aho, A.V.; Kernighan, B.W.; Weinberger, P.J., The AWK programming language, (1988), Addison-Wesley Reading, MA · Zbl 0751.68009
[2] Aho, A.V.; Sethi, R.; Ullman, J.D., Compilers: principles, techniques, and tools, () · Zbl 1155.68020
[3] Albert, J.; Ottmann, T., Automaten, sprachen und maschinen, (1983), Bibliographisches Institut Mannheim · Zbl 0524.68034
[4] Berry, G.; Sethi, R., From regular expressions to deterministic automata, Theoret. comput. sci., 48, 117-126, (1986) · Zbl 0626.68043
[5] Book, R.; Even, S.; Greibach, S.; Ott, G., Ambiguity in graphs and expressions, IEEE trans. comput., C20, 149-153, (1971) · Zbl 0222.94067
[6] Brüggemann-Klein, A.; Wood, D., Deterministic regular languages, (), 173-184
[7] Brzozowski, J.A., Derivatives of regular expressions, J. ACM, 11, 481-494, (1964) · Zbl 0225.94044
[8] Chen, C.-H.; Paige, R., New theoretical and computational results for regular languages, (), Third Symp. Combinatorial Pattern Matching, accepted
[9] Dougherty, D.; O’Reilly, T., UNIX text processing, (1987), Hayden Books Indianapolis
[10] Glushkov, V.M., The abstract theory of automata, Russian math. surveys, 16, 1-53, (1961) · Zbl 0104.35404
[11] Hennie, F.C., Finite-state models for logical machines, (1968), Wiley New York · Zbl 0187.28104
[12] Hopcroft, J.E.; Ullman, J.D., Introduction to automata theory, languages and computation, () · Zbl 0196.01701
[13] ISO 8879, Information processing – text and office systems – standard generalized markup language(SGML), (1986), International Organization for Standardization
[14] Perrin, D., Finite automata, () · Zbl 0336.94033
[15] Sippu, S.; Soisalon-Soininen, E., Parsing theory, vol. 1, languages and parsing, () · Zbl 0651.68007
[16] Staubach, G., UNIX-wekzeuge zur textmusterverarbeitung, (1989), Springer Berlin
[17] Wood, D., Theory of computation, (1987), Wiley New York
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.