textreuse

swMATH ID: 17907
Software Authors: Lincoln Mullen
Description: R package textreuse. Detect Text Reuse and Document Similarity. Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
Homepage: https://cran.r-project.org/web/packages/textreuse/index.html
Source Code: https://github.com/cran/textreuse
Dependencies: R
Keywords: CRAN; R package; Detect Text Reuse; Document Similarity
Cited in: 0 Publications
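
Illustration: the description mentions shingled n-gram tokenizers and similarity functions. The following is a minimal Python sketch of that general idea, not the package's R API. The function names `shingle_ngrams` and `jaccard_similarity` and the whitespace tokenization are assumptions made here for illustration only.

```python
def shingle_ngrams(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text.

    Hypothetical helper: lowercases and splits on whitespace, which is
    a simplification of what a real tokenizer would do.
    """
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


doc1 = "the quick brown fox jumps over the lazy dog"
doc2 = "the quick brown fox leaps over the lazy dog"

s1 = shingle_ngrams(doc1, n=3)
s2 = shingle_ngrams(doc2, n=3)

# The two documents share 4 of 10 distinct 3-word shingles.
print(jaccard_similarity(s1, s2))  # 0.4
```

Two documents that differ by a single word lose every shingle that spans that word, so the score falls faster as n grows. Minhash and locality sensitive hashing, also named in the description, are ways to approximate this comparison across many documents without scoring every pair.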