Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Similar words to andycap in Corpus

Levenshtein (distance in brackets)
andycap (0) - 1 freq, handicap (2) - 2 freq, andylaw (2) - 2 freq, indycamp (2) - 3 freq, andymc (3) - 1 freq, aydeca (3) - 1 freq, andy's (3) - 5 freq, andean (3) - 1 freq, andyh (3) - 1 freq, landscep (3) - 1 freq, handyman (3) - 2 freq, andra (3) - 116 freq, andrae (3) - 7 freq, indylanp (3) - 1 freq, indyah (3) - 1 freq, landscape (3) - 50 freq, anywan (3) - 28 freq, dannycaz (3) - 2 freq, anyway (3) - 109 freq, andyrbn (3) - 2 freq, andaman (3) - 1 freq, bandcamp (3) - 1 freq, and-aa (3) - 1 freq, andrea (3) - 1 freq, anyday (3) - 3 freq

Double Levenshtein (distance in brackets)
andycap (0) - 1 freq, handicap (3) - 2 freq, indycamp (3) - 3 freq, andylaw (4) - 2 freq, andy (5) - 151 freq, induced (5) - 5 freq, andras (5) - 8 freq, induce (5) - 1 freq, bandcamp (5) - 1 freq, and-aa (5) - 1 freq, andrea (5) - 1 freq, indicate (5) - 9 freq, necp (5) - 1 freq, inducin (5) - 3 freq, induces (5) - 1 freq, andaman (5) - 1 freq, endup (5) - 1 freq, end-up (5) - 5 freq, anyday (5) - 3 freq, andyh (5) - 1 freq, landscep (5) - 1 freq, andean (5) - 1 freq, andy's (5) - 5 freq, andymc (5) - 1 freq, aydeca (5) - 1 freq

SoundEx (SoundEx code - A532)
anticipation - 23 freq, amidst - 7 freq, antics - 13 freq, auntics - 1 freq, aunties - 16 freq, andy's - 5 freq, antique - 9 freq, antigone - 2 freq, auntie's - 7 freq, antiquities - 3 freq, andews - 1 freq, amethyst - 2 freq, andes - 1 freq, antisyzygy' - 1 freq, antichrist - 3 freq, antiques - 5 freq, ants - 5 freq, anti-christ - 2 freq, anti-social - 3 freq, anti-scottish - 1 freq, antisyzygy - 3 freq, aunts - 10 freq, anti-climax - 1 freq, anti-stress - 1 freq, aneth's - 1 freq, antistius's - 1 freq, anti-clart - 1 freq, anits - 2 freq, amidships - 1 freq, antic - 1 freq, anticipatan - 1 freq, antechamber - 1 freq, antiek - 1 freq, an'-at's - 1 freq, aund's - 1 freq, -andz - 2 freq, aunt's - 1 freq, antiquarian - 2 freq, amids - 1 freq, anticipatit - 4 freq, anti-conscription - 1 freq, anti-cholesterol - 1 freq, antiquitie - 2 freq, antecestors - 4 freq, anti-gaelic - 1 freq, anticipates - 1 freq, antisocial - 1 freq, antsy - 1 freq, antiquity - 2 freq, anticipated - 3 freq, antisyzygetic - 1 freq, anti-establishment - 4 freq, anti-spam - 1 freq, anticipate - 1 freq, anti-clockwise - 1 freq, anticipatin - 3 freq, amethysts - 1 freq, anti-austerity - 4 freq, anti-semitism - 1 freq, anti-xenophobic - 1 freq, anti-govrenment - 1 freq, andthocht - 1 freq, andycap - 1 freq, antiquesroadshow - 1 freq, antisocialism - 1 freq, antiqueroadtrip - 1 freq, andyconsidine - 2 freq, andyclosee - 1 freq, auntiesyzygy - 1 freq, anti-catholic - 1 freq, amidgetgem - 1 freq, andyscargill - 1 freq, anoticing - 1 freq, antwegian - 1 freq, andykiko - 2 freq, antecedents - 2 freq, aintist - 1 freq, amitchellallen - 1 freq

MetaPhone (MetaPhone code - ANTKP)
andycap - 1 freq

Manually curated
ANDYCAP
Levenshtein: time to execute Levenshtein function - 0.233681 milliseconds. The Levenshtein distance is the minimum number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.

Double Levenshtein: time to execute Double Levenshtein function - 0.426524 milliseconds. In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

SoundEx: time to execute SoundEx function - 0.027960 milliseconds. Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

MetaPhone: time to execute MetaPhone function - 0.042287 milliseconds. Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Manually curated: time to execute Manually curated function - 0.001101 milliseconds. Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
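The Levenshtein, Double Levenshtein and SoundEx descriptions above can be sketched in Python as follows. This is a minimal illustration, not the site's own code, which is not shown here. The function names (`levenshtein`, `strip_vowels`, `double_levenshtein`, `soundex`) are hypothetical. Treating "y" as a vowel is an assumption, chosen because it reproduces the Double Levenshtein distances listed for andycap (for example handicap at 3 and andylaw at 4). The SoundEx function follows the commonly described American Soundex rules and may differ in edge cases from whatever implementation the site uses.

```python
VOWELS = set("aeiouy")  # assumption: 'y' is counted as a vowel

# Standard Soundex digit groups; vowels, h, w and y have no code.
SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Remove vowels so that only the consonant skeleton is compared."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Full-word distance plus vowel-free distance: consonants count twice."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


def soundex(word: str) -> str:
    """First letter plus up to three digits, padded with zeros to length 4."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate consonants with the same code
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != previous:
            encoded += code
        previous = code  # a vowel resets this, so a repeated code is kept
    return (encoded + "000")[:4]


print(levenshtein("andycap", "handicap"))         # 2
print(double_levenshtein("andycap", "handicap"))  # 3
print(soundex("andycap"))                         # A532
```

The printed values agree with the figures in the results above: handicap is at Levenshtein distance 2 and Double Levenshtein distance 3 from andycap, and andycap has the SoundEx code A532.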
