×

Properties of Levenshtein metrics on sequences. (English) Zbl 0536.68063

Summary: Levenshtein dissimilarity measures are used to compare sequences in application areas including coding theory, computer science and macromolecular biology. In general, they measure sequence dissimilarity by the length of a shortest weighted sequence of insertions, deletions and substitutions required to transform one sequence into another. Those Levenshtein dissimilarity measures based on insertions and deletions are analyzed by a model involving valuations on a partially ordered set. The model reveals structural relationships among poset, valuation and dissimilarity measure. As a consequence, certain Levenshtein dissimilarity measures are shown to be metrics characterized by betweenness properties and computable in terms of well-known measures of sequence similarity.

MSC:

68R99 Discrete mathematics in relation to computer science
94B99 Theory of error-correcting codes and error-detecting codes
92D10 Genetics and epigenetics
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Boland, R. P., E. K. Brown and W. H. E. Day. 1983. ”Approximating Minimum-length-sequence Metrics: a Cautionary Note.”Math. Social Sci. 4, 261–270. · Zbl 0513.90047 · doi:10.1016/0165-4896(83)90028-8
[2] Boorman, S. A. and P. Arabie. 1972. ”Structural Measures and the Method of Sorting.” InMultidimensional Scaling, Vol. 1, Theory, Eds R. N. Shepard, A. K. Rommey and S. B. Nerlove, pp. 225–249. New York: Seminar Press.
[3] –, and D. C. Olivier. 1973. ”Metrics on Spaces of Finite Trees.”J. math. Psychol. 10, 26–59. · Zbl 0271.92011 · doi:10.1016/0022-2496(73)90003-5
[4] Bunke, H. 1983. ”What is the Distance Between Graphs?”Bull. Eur. Assoc. theor. Comput. Sci. No. 20, 35–39.
[5] Day, W. H. E. 1981. ”The Complexity of Computing Metric Distances Between Partitions.”Math. Social Sci. 1, 269–287. · Zbl 0497.62049 · doi:10.1016/0165-4896(81)90042-1
[6] Flament, C. 1963.Applications of Graph Theory to Group Structure. Englewood Cliffs, NJ, Prentice-Hall. · Zbl 0141.36301
[7] Goodman, N. 1951.The Structure of Appearance. Cambridge MA: Harvard University Press. · Zbl 0042.43902
[8] Kruskal, J. B. 1983. ”An Overview of Sequence Comparison: Time Warps, String Edits, and Macromolecules.”SIAM Rev. 25, 201–237. · Zbl 0512.68048 · doi:10.1137/1025045
[9] Levenshtein, V. I. 1966. ”Binary Codes Capable of Correcting Deletions, Insertions, and Reversals.”Soviet Phys. Dok. 10, 707–710.
[10] Lowrance, R. and R. A. Wagner. 1975. ”An Extension of the String-to-string Correction Problem.”J. Assoc. Comput. Mach. 22, 177–183. · Zbl 0301.68028 · doi:10.1145/321879.321880
[11] Masek, W. J. and M. S. Paterson. 1983. ”How to Compute String Edit Distances Quickly” InTime Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, Eds D. Sankoff and J. B. Kruskal, pp. 337–349. Reading, MA: Addison-Wesley.
[12] Monjardet, B. 1981. ”Metrics on Partially Ordered Sets–a Survey.”Discrete Math. 35, 173–184. · Zbl 0463.46016 · doi:10.1016/0012-365X(81)90206-5
[13] Needleman, S. B. and C. D. Wunsch. 1970. ”A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins.”J. molec. Biol. 48, 443–453. · doi:10.1016/0022-2836(70)90057-4
[14] Restle, F. 1959. ”A Metric and an Ordering on Sets.”Psychometrika 24, 207–220. · Zbl 0087.15903 · doi:10.1007/BF02289843
[15] Robinson, D. F. 1971. ”Comparison of Labeled Trees with Valency Three.”J. Combin. Theory 11, 105–119. · Zbl 0217.31201 · doi:10.1016/0095-8956(71)90020-7
[16] Sankoff, D. and J. B. Kruskal, Eds. 1983.Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Reading MA: Addison-Wesley.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.