Problem:
How do you partition your database? there are 3 ways to partition your data depending on how you structure your stuff (read more here).
Today we will be talking about just one: horizontal partitioning.
so for example your database (that is currently full) is a nosql document database. you have outgrown the number of rows that one collection can support, and you need more.
There are two ways to perform this split.
We will talk about hash based partitioning (because i'm still trying to get to the subject of consistent hashing). This works by taking, say, a unique value from your row (e.g. user id, or some other unique value), and mapping it into a number by running it through a function (hash function).
// mutable
function hashFunction(url) {
let total = 0;
for (let i=0, i < url.length, i++) {
// url = "Mia"
const character = url[i]; // "M"
const ascii = character.charCodeAt(0); // 77
total += ascii;
}
// some magic using ascii map
return total;
}
// immutable
function hashFunction2(url) {
// .map .filter .reduce
return url.reduce((current, total, index) => {
const character = url[index]; // "M"
const ascii = character.charCodeAt(0); // 77
return total + ascii
}, 0)
}
const sample = "Mia";
hashFunction(sample); // 297http://tinyurl.com/tangliqun_5673 --> hashFunction() --> 3
// store
const hashNum = hashFunction('http://tinyurl.com/tangliqun_5673') // 3
const objectToStore = {
url: 'http://tinyurl.com/tangliqun_5673',
fullURL: 'http://google.com',
}const database = [
undefined,
undefined,
undefined,
objectToStore,
undefined,
undefined,
undefined,
undefined,
]
database.forEach(obj => {
if (obj.url === 'http://tinyurl.com/tangliqun_5673') return 'i found it'
})
// ----------------
// finding a url
// ----------------
// 1. user gives me "http://tinyurl.com/tangliqun_5673"
// 2. i run the url through a hash function and get 3
const hashNum = hashFunction('http://tinyurl.com/tangliqun_5673') // 3
// "http://tinyurl.com/tangliqun_5673" === 3
// 3. i look at the database at item 3
const mySearchData = database[3]
// 4, i return the fullURL to user
const toUser = mySearchData.fullURLthis hash function MUST always return the same mapping given the same value.
Some key properties of the hash function used in Hash based partitioning:
we now have a number, which we can use to direct your data to an allocated database.
only now you're stuck with 10 databases, no more, no less. BECAUSE your hash function only returns numbers 1-10, if you add a database called 'database 11', nothing will ever be sent there, because the index doesn't get generated! What do you do? you seriously shouldn't consider rehashing ALL YOUR DATA into numbers 1-11 and then generate all hash mappings again!
This is where consistent hasing (finally) comes in. a good way to think about it is to imagine a circle, upon which your databases sit in evenly spaced distances.

In Consistent Hashing, when the hash table is resized (e.g. a new cache host is added to the system), only ‘k/n’ keys need to be remapped where ‘k’ is the total number of keys and ‘n’ is the total number of servers. Recall that in a caching system using the ‘mod’ as the hash function, all keys need to be remapped.
so now instead of remapping 10 databases worth of data, we plan ahead:
how it works with an additional database: