Friday, August 11, 2017

RAM Sizing for the Data Nodes of a Couchbase Cluster

This post talks about finding out how much RAM your Couchbase cluster needs to hold your data (in RAM)!


RAM Calculator 

RAM is one of the most crucial areas to size correctly. Cached documents allow reads to be served at low latency and high throughput. Please note that this estimate does not include the RAM required by the host/VM OS and other applications running alongside Couchbase.

Enter the fields below to estimate RAM:

Sample Document        (key)    (Value) 
This is required because the document content length as well as the ID length impact RAM. Be mindful of key size when deciding your key generation strategy.
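To make the key and value sizes concrete, here is a minimal Python sketch of how you could measure them for a sample document. The helper name `document_sizes` and the sample key and value are made up for illustration, and measuring the value as compact UTF-8 JSON is my assumption; the size Couchbase actually stores may differ slightly.

```python
import json
from typing import Tuple


def document_sizes(key: str, value: dict) -> Tuple[int, int]:
    """Return (key_bytes, value_bytes) for one sample document."""
    # The key (document ID) is counted as UTF-8 bytes.
    key_bytes = len(key.encode("utf-8"))
    # Assumption: the value is compact JSON with no extra whitespace.
    body = json.dumps(value, separators=(",", ":"), ensure_ascii=False)
    value_bytes = len(body.encode("utf-8"))
    return key_bytes, value_bytes


# Hypothetical sample document
key_bytes, value_bytes = document_sizes("user::1001", {"name": "Asha", "age": 30})
print(key_bytes, value_bytes)
```

A long, descriptive key costs the same number of bytes for every document and every copy, so it adds up quickly.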


# Replicas                                        
Couchbase supports only up to 3 replicas, so enter 1, 2 or 3.


% Of Data you want to be in RAM  %
For best throughput, you need all your documents in RAM, i.e. 100%. This way every read is served from RAM with no disk IO. In the field, enter only the number, like 80 or 100.


# Documents                                   
Number of documents in the cluster. If your application is starting from scratch, begin with a number based on the expected load of the application, then re-evaluate it regularly and adjust your RAM quota if required. So, you can start with, say, 10000 or 1000000 documents.


Type of Storage                                SSD        HDD
If the storage is SSD, the overhead is 25%; otherwise it is 30%. SSD gives better disk throughput and latency, which improves performance when not all of the data is in RAM.


Couchbase Version                        < 2.1       2.1 or higher  
The size of the metadata per document is 56 bytes for versions 2.1 and higher, and 64 bytes for lower versions.


High Water Mark                             %
If you want to use the default value, enter 85.
If the amount of RAM used by documents reaches the high water mark (upper threshold), both primary and replica documents are ejected until memory usage drops to the low water mark (lower threshold).
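If you want to see how the fields above combine, here is a rough Python sketch. It assumes the calculation follows the formula in Couchbase's sizing guidelines (metadata plus working set, scaled by the storage overhead and divided by the high water mark). The function name, parameter names and sample numbers are mine, not the calculator's actual code.

```python
def estimate_cluster_ram_bytes(
    num_documents: int,
    key_bytes: int,
    value_bytes: int,
    num_replicas: int = 1,
    percent_in_ram: float = 100.0,
    ssd: bool = True,
    version_2_1_or_higher: bool = True,
    high_water_mark_percent: float = 85.0,
) -> float:
    """Estimate the RAM (in bytes) the cluster needs for data."""
    # One active copy plus the configured replicas.
    copies = 1 + num_replicas
    # Metadata per document: 56 bytes for 2.1+, 64 bytes for older versions.
    metadata_per_doc = 56 if version_2_1_or_higher else 64
    # Storage overhead: 25% for SSD, 30% for HDD.
    overhead = 0.25 if ssd else 0.30

    # Metadata and keys are counted in full, for every copy.
    total_metadata = num_documents * (metadata_per_doc + key_bytes) * copies
    # Only the chosen percentage of the values counts as the working set.
    total_dataset = num_documents * value_bytes * copies
    working_set = total_dataset * (percent_in_ram / 100.0)

    return (total_metadata + working_set) * (1 + overhead) / (
        high_water_mark_percent / 100.0
    )


# Hypothetical input: 1,000,000 documents, 10-byte keys, 1000-byte values
ram = estimate_cluster_ram_bytes(1_000_000, key_bytes=10, value_bytes=1000)
print(f"{ram / 1024**3:.2f} GiB")
```

Note that lowering the percentage of data in RAM shrinks only the working set; the metadata and key term stays the same in this sketch.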

                                                          

Based on the RAM requirement for the cluster, you can plan how many nodes are required. Another important aspect in deciding the number of data nodes is how you expect your system to behave if 1, 2 or more nodes go down at the same time. In this link, I have discussed the replication factor and how it affects your system performance. So, make your call wisely!

The value is calculated as explained in the Couchbase link, here.
The reference for calculating document size is here.
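As a rough illustration of turning a RAM requirement into a node count, here is a small Python sketch. It makes a simplifying assumption of my own: every node gets the same RAM quota, and you add one spare node for each simultaneous node failure you want to ride out without losing RAM capacity. The function name and the numbers are hypothetical.

```python
import math


def data_nodes_needed(
    cluster_ram_bytes: float,
    per_node_quota_bytes: int,
    tolerated_failures: int = 0,
) -> int:
    """Estimate the number of data nodes for a given cluster RAM requirement."""
    # Nodes needed just to hold the required RAM.
    base_nodes = math.ceil(cluster_ram_bytes / per_node_quota_bytes)
    # Assumption: one spare node per failure to keep full RAM capacity.
    return base_nodes + tolerated_failures


# Hypothetical: about 2.92 GiB required, 1 GiB quota per node, survive 1 node failure
print(data_nodes_needed(3_135_294_118, 1024**3, tolerated_failures=1))
```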

--- happy sizing :)
