#textreuse: This #R package provides functions for measuring #similarity among documents and detecting reused passages. It implements shingled #n-gram, skip n-gram, and other tokenizers; similarity and dissimilarity functions; pairwise comparisons; and minhash and locality-sensitive hashing algorithms.
▻https://github.com/lmullen/textreuse
#text_mining
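
To give a feel for the techniques named above, here is a minimal Python sketch of word shingling, Jaccard similarity, and minhash. It is an illustration only and is not the textreuse API: the function names (`shingles`, `jaccard_similarity`, `minhash_signature`, `estimate_similarity`), the choice of BLAKE2b as the hash, and the default of 200 hash functions are all assumptions made for this example.

```python
import hashlib
import re


def shingles(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"\w+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def _hash(token, seed):
    """One member of a family of hash functions, selected by seed (assumed design)."""
    salt = seed.to_bytes(8, "big")
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(tokens, num_hashes=200):
    """For each seeded hash function, keep the minimum hash over all tokens."""
    if not tokens:
        raise ValueError("cannot minhash an empty token set")
    return [min(_hash(t, seed) for t in tokens) for seed in range(num_hashes)]


def estimate_similarity(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    a = shingles("the quick brown fox jumps over the lazy dog")
    b = shingles("the quick brown fox leaps over the lazy dog")
    # 4 shared shingles out of 10 distinct ones
    print(jaccard_similarity(a, b))
    # An approximation of the value above, computed from signatures only
    print(estimate_similarity(minhash_signature(a), minhash_signature(b)))
```

The point of the signature is that two documents can be compared from a fixed-length list of integers instead of their full token sets; locality-sensitive hashing then groups signatures into bands so that only likely matches are compared pairwise.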