## Nearest "Word" Problem


The Black Hand
Posts: 1037
Joined: Tue Feb 12, 2008 11:40 am UTC
Location: Behind you

### Nearest "Word" Problem

I've got to implement a system that, given a "signature" (a, b, c, d), where a through d are single digits, returns the nearest neighbours stored in a data structure, each with a percentage difference.

Naively, you could use a binary search tree, treating the signature as a single int for the key, but then you can miss a neighbour unless you do a full iteration of the tree, so I want to do something different. My current thought is something similar to a binary tree, but ordered by difference (Euclidean distance) and intertwined with a sorted doubly-linked list, so each node has pointers left, right, previous and next, and maybe parent. The tree would be the primary search, and the linked list would be used to quickly find neighbours. What are your thoughts on this?
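To make the problem with the int key concrete, here is a minimal sketch comparing "closest key" with "closest by Euclidean distance" using a plain linear scan. The helper names (`to_key`, `nearest_brute_force`, `percent_difference`) are made up for illustration, and the percentage formula (distance divided by the largest possible distance between two signatures) is only one assumed interpretation of "percentage difference".

```python
import math

def to_key(sig):
    """Pack (a, b, c, d) into one integer, e.g. (1, 0, 0, 9) -> 1009."""
    a, b, c, d = sig
    return ((a * 10 + b) * 10 + c) * 10 + d

def nearest_brute_force(query, signatures):
    """Return the stored signature with the smallest Euclidean distance."""
    return min(signatures, key=lambda s: math.dist(query, s))

# Assumption: express a distance as a percentage of the largest possible
# distance between two signatures, i.e. (0,0,0,0) to (9,9,9,9).
MAX_DISTANCE = math.dist((0, 0, 0, 0), (9, 9, 9, 9))

def percent_difference(p, q):
    return 100.0 * math.dist(p, q) / MAX_DISTANCE

stored = [(0, 9, 9, 9), (2, 0, 0, 0), (1, 0, 0, 9)]
query = (1, 0, 0, 0)

# Closest by packed integer key: 0999 is only 1 away from 1000.
by_key = min(stored, key=lambda s: abs(to_key(s) - to_key(query)))

# Closest by Euclidean distance: (2, 0, 0, 0) is 1 away.
by_distance = nearest_brute_force(query, stored)

print("closest key:     ", by_key)
print("closest distance:", by_distance)
```

Here the closest key is (0, 9, 9, 9), which is actually the farthest of the three by Euclidean distance. Since four single digits give at most 10,000 distinct signatures, the linear scan is also a reasonable baseline to check any fancier structure against.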

I've read about BK Trees, but I can't find any good information on them, except for this blog post. Does anyone know where else I can try looking for info on BK Trees, or if there's a better structure I should look into?
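For context, the BK-tree idea as it is commonly described can be sketched roughly as follows. BK-trees are usually presented for integer-valued metrics, so this sketch swaps in Manhattan distance rather than Euclidean distance; that substitution, and the names `BKTree`, `add` and `within`, are assumptions for illustration only.

```python
def manhattan(p, q):
    """Integer-valued metric on signatures (assumed here instead of Euclidean)."""
    return sum(abs(x - y) for x, y in zip(p, q))

class BKTree:
    def __init__(self, distance):
        self.distance = distance
        self.root = None  # each node is (item, {edge_distance: child_node})

    def add(self, item):
        if self.root is None:
            self.root = (item, {})
            return
        node = self.root
        while True:
            d = self.distance(item, node[0])
            if d == 0:
                return  # already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (item, {})
                return
            node = child

    def within(self, query, radius):
        """Return sorted (distance, item) pairs with distance <= radius."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            item, children = stack.pop()
            d = self.distance(query, item)
            if d <= radius:
                results.append((d, item))
            # Triangle inequality: matches can only sit under edges
            # whose label lies in [d - radius, d + radius].
            for edge, child in children.items():
                if d - radius <= edge <= d + radius:
                    stack.append(child)
        return sorted(results)

tree = BKTree(manhattan)
for sig in [(5, 5, 5, 5), (2, 0, 0, 0), (1, 0, 0, 9), (1, 1, 0, 1), (0, 9, 9, 9)]:
    tree.add(sig)

matches = tree.within((1, 0, 0, 0), 2)
print(matches)
```

The query returns everything within a given radius rather than a single nearest item, so finding the nearest neighbour this way would mean growing the radius until something turns up.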
73, de KE8BSL loc EN26.

freakish777
Posts: 354
Joined: Wed Jul 13, 2011 2:14 pm UTC

### Re: Nearest "Word" Problem

Sounds like you're learning about K Nearest Neighbors?

R-Trees, M-Trees, K-D Trees and BSP-Trees should all work (each solves the problem in a slightly different fashion). Each has a (much) larger wiki entry than BK Trees.

korona
Posts: 495
Joined: Sun Jul 04, 2010 8:40 pm UTC

### Re: Nearest "Word" Problem

If the number of dimensions is large, there is locality-sensitive hashing; if it is low, there are the trees mentioned above.
Unfortunately, I don't know if 4 is large.

The Black Hand
Posts: 1037
Joined: Tue Feb 12, 2008 11:40 am UTC
Location: Behind you

### Re: Nearest "Word" Problem

It looks like a k-d tree will do what I want. Thanks!
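For anyone who finds this later, a minimal sketch of a k-d tree nearest-neighbour lookup over 4-digit signatures, assuming Euclidean distance and a static set of points built once. The names `Node`, `build` and `nearest` are illustrative and not taken from any library.

```python
import math
from typing import NamedTuple, Optional

class Node(NamedTuple):
    point: tuple
    axis: int
    left: Optional["Node"]
    right: Optional["Node"]

def build(points, depth=0):
    """Build a k-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))

def nearest(node, query, best=None):
    """Return (distance, point) for the stored point closest to query."""
    if node is None:
        return best
    d = math.dist(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Only cross the splitting plane if a closer point could lie beyond it.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best

stored = [(0, 9, 9, 9), (2, 0, 0, 0), (1, 0, 0, 9)]
tree = build(stored)
distance, point = nearest(tree, (1, 0, 0, 0))
print(point, distance)
```

The pruning test on `abs(diff)` is what avoids the full iteration: a subtree on the far side of the split is skipped whenever the splitting plane is already farther away than the best match found so far.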