A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint

Similar words to cot-hous in Corpus

Methods (results listed below in this order): Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein
cot-hous (0) - 2 freq
cothous (1) - 9 freq
covetous (3) - 1 freq
oot-hauds (3) - 1 freq
concious (3) - 4 freq
couscous (3) - 2 freq
cowtious (3) - 1 freq
het-houss (3) - 1 freq
motthors (3) - 2 freq
pot-holes (3) - 1 freq
cowshus (3) - 5 freq
copious (3) - 2 freq
co-ops (3) - 1 freq
oot-hoose (3) - 1 freq
ho-ho's (3) - 1 freq
cloths (4) - 1 freq
ootthrou (4) - 9 freq
coots (4) - 1 freq
without (4) - 100 freq
convoys (4) - 8 freq
stotious (4) - 1 freq
ootcrops (4) - 1 freq
cathy's (4) - 27 freq
catch-up (4) - 2 freq
washhous (4) - 1 freq
Double Levenshtein
cot-hous (0) - 2 freq
cothous (2) - 9 freq
oot-hoose (4) - 1 freq
acid-house (5) - 1 freq
co-ops (5) - 1 freq
cathoose (5) - 1 freq
het-hoose (5) - 1 freq
pot-holes (5) - 1 freq
cowshus (5) - 5 freq
oot-hauds (5) - 1 freq
coort-hoose (5) - 1 freq
catches (5) - 34 freq
het-houss (5) - 1 freq
thus (6) - 40 freq
rat-hol (6) - 1 freq
scotchies (6) - 11 freq
tea-hoose (6) - 1 freq
scoot-hole (6) - 3 freq
clothes (6) - 28 freq
puir-hous (6) - 1 freq
ecrehous (6) - 1 freq
cottars (6) - 19 freq
boat-holes (6) - 1 freq
coshes (6) - 1 freq
coalhoose (6) - 3 freq
SoundEx code - C320
cottage - 49 freq
catch - 353 freq
cities - 43 freq
city's - 10 freq
cats - 124 freq
cat's - 32 freq
ceeties - 7 freq
cuddies - 49 freq
cuddie's - 10 freq
codes - 4 freq
cuts - 45 freq
coats - 28 freq
chats - 3 freq
cotch - 8 freq
cds - 20 freq
cadiz - 2 freq
cute's - 1 freq
cd's - 2 freq
cothous - 9 freq
cot-hous - 2 freq
cahoots - 2 freq
chat's - 1 freq
cuits - 2 freq
cuddies' - 2 freq
cut's - 2 freq
cit's - 1 freq
cathoose - 1 freq
cautch - 1 freq
coat's - 4 freq
cathy's - 27 freq
'cathy's - 2 freq
'cuddies - 1 freq
citz - 2 freq
cautious - 8 freq
cheats - 2 freq
cats' - 2 freq
catties - 3 freq
c-c-d's - 1 freq
'cheats' - 2 freq
couttie's - 3 freq
chotce - 1 freq
cïties - 2 freq
cottage' - 1 freq
'catch - 2 freq
cuithes - 4 freq
cots - 2 freq
cadgy - 1 freq
châteaus - 1 freq
chates - 1 freq
cadgie - 2 freq
cahootchie - 2 freq
catchy - 3 freq
cotts - 3 freq
cöts - 3 freq
caddies - 2 freq
chutes - 1 freq
cutties - 3 freq
cyties - 1 freq
cities' - 1 freq
cits - 1 freq
'catchie' - 1 freq
cites - 12 freq
caats - 4 freq
codgie - 2 freq
cowtious - 1 freq
cahoutchy - 1 freq
cootch - 3 freq
ceities - 8 freq
codds - 2 freq
cadge - 2 freq
coits - 1 freq
coutch - 1 freq
cutesy - 1 freq
chits - 1 freq
cíties - 1 freq
coattage - 1 freq
cooties - 1 freq
chaotic - 3 freq
coots - 1 freq
cods - 2 freq
‘cuddies - 1 freq
coads - 1 freq
catchie - 2 freq
cottige - 1 freq
ceuithes - 1 freq
cattie's - 1 freq
czdq - 1 freq
caddis - 1 freq
cts - 1 freq
cuddy's - 1 freq
cattyish - 20 freq
cuddys - 1 freq
chdk - 1 freq
cyatcy - 1 freq
czdxi - 1 freq
cuties - 2 freq
coutts - 1 freq
caithess - 1 freq
ctdg - 1 freq
cedk - 1 freq
catwawk - 1 freq
MetaPhone code - KTHS
cot-hous - 2 freq
Manually curated
COT-HOUS
Time to execute Levenshtein function - 0.265111 milliseconds
The Levenshtein distance is the minimum number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
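As a rough illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming approach. It is not the code this site runs; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # replace ch_a with ch_b (or keep it)
            ))
        previous = current
    return previous[-1]


for word in ("cot-hous", "cothous", "covetous"):
    print(word, levenshtein("cot-hous", word))
```

For these three words the sketch gives 0, 1 and 3, which agrees with the distances shown in brackets in the Levenshtein results.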
Time to execute Double Levenshtein function - 0.441156 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
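The idea can be sketched in Python as follows. This is a guess at the approach, not the site's own code: the names `strip_vowels` and `double_levenshtein` are made up for this example, and treating only a, e, i, o and u as vowels is an assumption (it is consistent with the scores listed above, but the page does not say how y or accented letters are handled).

```python
VOWELS = set("aeiou")  # assumption: y and accented vowels are NOT stripped


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop the assumed vowels, keeping consonants, hyphens and apostrophes."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-free words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


for word in ("cothous", "oot-hoose", "thus"):
    print(word, double_levenshtein("cot-hous", word))
```

Under that vowel assumption, cothous scores 1 + 1 = 2 and oot-hoose scores 3 + 1 = 4, matching the Double Levenshtein results listed for cot-hous.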
Time to execute SoundEx function - 0.031741 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
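To make the encoding concrete, here is a minimal Python sketch of the commonly described American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent repeats, and pad to four characters. It is an illustration only; the site's own implementation may differ in edge cases, and dropping every character outside A-Z (hyphens, apostrophes, accented letters) is an assumption made here.

```python
# Digit codes for consonants; vowels, H, W and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or '' if the word has no A-Z letters."""
    # Assumption: anything outside A-Z is simply ignored.
    letters = [ch for ch in word.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    encoded = letters[0]
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            encoded += code
        previous = code  # a vowel resets this, so a later repeat is coded again
        if len(encoded) == 4:
            break
    return encoded.ljust(4, "0")


for word in ("cot-hous", "cottage", "catch", "cattyish"):
    print(word, soundex(word))
```

All four example words come out as C320 under these rules, which is why such different-looking words share one bucket in the SoundEx results.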
Time to execute MetaPhone function - 0.041737 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.001202 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas, including obvious typos and spelling variations. Not all words are covered.
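A lookup of this kind can be sketched in Python as a plain dictionary. Everything here is hypothetical: the name `LEXICON`, the function `curated_lemma` and the second entry are invented for illustration, and only the cot-hous to COT-HOUS pairing comes from the result shown above.

```python
from typing import Optional

# Hypothetical hand-built lexicon: surface form -> lemma.
LEXICON = {
    "cot-hous": "COT-HOUS",  # pairing shown in the results above
    "cothous": "COT-HOUS",   # hypothetical entry for a spelling variant
}


def curated_lemma(word: str) -> Optional[str]:
    """Return the curated lemma for a word, or None if it is not covered."""
    return LEXICON.get(word.strip().lower())


print(curated_lemma("cot-hous"))
print(curated_lemma("ahint"))  # not in this toy lexicon, so None
```

A dictionary lookup takes roughly constant time, which fits with this method being the fastest of the five timings reported.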