Kademlia - Distributed Hash Table - Part 1
What is Kademlia
Lately I have been interested in distributed key value stores. Mainly I have been investigating kademlia for it’s simplicity. In this blog I will attempt to explain how kademlia works, and some of the obvious benefits of using a peer to peer distributed hash table in your applications.
The Node
In a peer to peer overlay network you need nodes to make a network. A node in our discussion is a network server. By server I mean a piece of software that accepts network communications and does things based on those communications. Below is a simple representation of what a node in a Kademlia DHT looks like:
As you can see, the datastructures of kademlia are very simple, and not very hard to grasp. (well all except for that strange K constant, which i am going to have to learn more about…) We are creating a very simple router, that keeps a table of how to find particular nodes of interest.
Finding Nodes in Kademlia
As you would probably guess, every single node in a kademlia network doesn’t know of every other node from the get go. Nodes are found on the network. As new nodes register, make store and find connections to this node, this node includes those nodes within it’s routing table described above, until the KBucket items start to fill up with Contacts. At that point Contacts are selectively removed and organized within the KBucket lists.
To find a node by an id, this node makes requests to contacts it knows about from the routing table this node knows about, asking for the location of the node it is searching for. Those nodes perform a lookup within thier own routing tables to find nodes that are “close” to the requested id. this node then stores these new contacts, and iteratively asks those nodes if they know of the requested node. This happens until the node is found.
More to come in future posts.