Friday, April 22, 2016

CACHE - distributed cache

http://www.codeproject.com/Articles/21508/Distributed-Caching-Using-a-Hash-Algorithm


In computing, a distributed cache is an extension of the traditional concept of a cache used in a single locale. A distributed cache may span multiple servers so that it can grow both in size and in transactional capacity. It is mainly used to store application data that resides in a database, as well as web session data.

The article linked above describes a hashing algorithm that can be used for distributed caching of data in web farms or for implementing a distributed hash table (DHT).

We have a layer of web servers that handle the requests and a pool of machines that are used for caching. Each web server should be able to access any of the cache machines. The tricky part is that when data is cached as a result of a request to one particular web server, we should be able to retrieve that data from any of the other web servers.

Here is what we can do. We want to cache key-value pairs and later retrieve a cached value by providing the corresponding key. The code below takes a key and produces a hash bit array using SHA1. After some transformations we derive an integer from that hash. A given key always produces the same number, no matter which web server computes it. In addition, the numbers are, for practical purposes, uniformly distributed over their range, which allows us to "page" them, that is, to split the range into equal slices and assign each slice to one of a given number of cache machines or "buckets".
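
To make the idea concrete, here is a minimal Python sketch of the same scheme. It is not the article's code, only an illustration under a few assumptions of my own: the integer is taken from the first 4 bytes of the SHA1 digest, the "paging" is done by splitting the 32-bit range into equal contiguous slices, and the names key_to_number, bucket_for, machine_for and the CACHE_MACHINES list are all hypothetical.

```python
import hashlib

# Hypothetical pool of cache machines; every web server shares this list.
CACHE_MACHINES = ["cache-0", "cache-1", "cache-2", "cache-3"]


def key_to_number(key: str) -> int:
    """Derive a 32-bit unsigned integer from the SHA1 hash of the key."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()  # 20 bytes
    # Assumption: keep only the first 4 bytes, read as a big-endian integer.
    return int.from_bytes(digest[:4], "big")


def bucket_for(key: str, bucket_count: int) -> int:
    """Return the index of the bucket (0 .. bucket_count - 1) for the key.

    The range 0 .. 2**32 - 1 is cut into bucket_count equal "pages";
    the bucket is the page that the key's number falls into.
    """
    if bucket_count <= 0:
        raise ValueError("bucket_count must be positive")
    return (key_to_number(key) * bucket_count) >> 32


def machine_for(key: str) -> str:
    """Pick the cache machine responsible for the key."""
    return CACHE_MACHINES[bucket_for(key, len(CACHE_MACHINES))]
```

Because the result depends only on the key and on the shared list of machines, any web server that calls machine_for("session:1234") gets the same machine name, so a value stored through one web server can be found through any other.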