Enterprise Caching Techniques – Standalone Caching

Many application domains are fetch-centric: they serve far more read requests than store operations. E-commerce is a typical example, where the ratio of searches to purchases is about 9:1 and sometimes even wider. Such applications need an additional caching layer in their architecture.

Caching is not a recent invention; it has been around since the early days of hardware evolution. The L1 and L2 CPU caches found in any hardware architecture are caching mechanisms that are still in use. L1 and L2 sit between the processor and RAM and hold the information most critical for processing. Fetching data from these caches is faster than fetching it from RAM, but they are much smaller than main memory. This size limit forces data to be classified and helps the CPU decide where each piece should be stored. Caching in enterprise applications derives directly from the same concept. Here, however, the cache may live on the same machine or on different machines/nodes connected to the parent over high-speed network links. Caching in enterprise applications is therefore mainly divided into two parts: Standalone Caching and Distributed Caching.

Standalone Caching

Standalone caching, sometimes referred to as embedded or in-process caching, is a technique for storing frequently requested data within a single virtual machine. It acts as an L1 cache from the application's perspective and resides in RAM.
The main purpose of standalone caching is to improve the performance of business-critical operations. A standalone cache has a limited amount of main memory at its disposal, so only data that is frequently used and important for business-critical functions is cached. Standalone caching products are typically used as a side-cache for an application's data access layer. Side-cache refers to an architectural pattern in which the application itself manages the caching of data from a database, a filesystem, or any other source. In this scenario, the cache temporarily stores objects. The application first checks the cache for an existing copy of the data and returns it if present. When the data is not present, the application retrieves it from the data access layer and puts it into the cache for the next incoming request.
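The side-cache lookup described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the API of any caching product: the `SideCache` class, the `load_product` function and the `FAKE_DB` dictionary are hypothetical names made up for this example.

```python
from typing import Callable, Dict, Optional


class SideCache:
    """Minimal in-process side-cache sketch (hypothetical, for illustration)."""

    def __init__(self, loader: Callable[[str], Optional[str]]) -> None:
        self._loader = loader              # stands in for the data access layer
        self._store: Dict[str, str] = {}   # cached objects live in RAM
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        # 1. Check the cache for an existing copy first.
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # 2. On a miss, fall through to the data access layer.
        self.misses += 1
        value = self._loader(key)
        # 3. Keep the result for the next incoming request.
        if value is not None:
            self._store[key] = value
        return value


# Stand-in for a database table; purely illustrative data.
FAKE_DB = {"sku-1": "Blue kettle", "sku-2": "Red toaster"}
db_calls = []


def load_product(key: str) -> Optional[str]:
    db_calls.append(key)   # record every trip to the "database"
    return FAKE_DB.get(key)


cache = SideCache(load_product)
first = cache.get("sku-1")    # miss: goes to the data access layer
second = cache.get("sku-1")   # hit: served from memory
```

In this sketch the second lookup never reaches `load_product`, which is the whole point of the pattern: repeated reads of the same key are answered from memory.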
Caching also needs a mechanism for coping with invalid cached data, that is, data that has been updated in the store but not yet refreshed in the cache. Several techniques can be used to deal with invalid data or to remove unused entries and free memory for data that is more in demand.

Such concerns can be handled by writing an API that takes care of invalid cache entries.
A caching product such as EHCache provides basic functionality for invalidating data. The application decides at what point cached data should be invalidated. The typical strategy is that whenever data is updated in the store, the application invalidates the corresponding cached data. If the cached copy does not have to be updated on the spot, the cache can instead be refreshed periodically through time-based configuration. Both techniques can even be combined in a multi-server environment.
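As a rough sketch of the invalidate-on-update strategy, the snippet below drops the cached copy whenever the application writes to the store. It does not use EHCache; `ProductRepository` and the dictionary-backed `store` are assumptions invented for this example.

```python
class ProductRepository:
    """Hypothetical data access wrapper that invalidates on update."""

    def __init__(self, store):
        self._store = store   # stand-in for the real data store
        self._cache = {}      # in-process cache

    def find(self, key):
        # Read path: serve from the cache, loading from the store on a miss.
        if key not in self._cache:
            self._cache[key] = self._store[key]
        return self._cache[key]

    def update(self, key, value):
        # Write path: update the store first, then invalidate the cached copy
        # so that the next read reloads the fresh value.
        self._store[key] = value
        self._cache.pop(key, None)

    def is_cached(self, key):
        return key in self._cache


store = {"sku-1": 100}
repo = ProductRepository(store)
price_before = repo.find("sku-1")   # loads and caches the current price
repo.update("sku-1", 120)           # store updated, cached copy dropped
price_after = repo.find("sku-1")    # reloaded from the store
```

Invalidating rather than overwriting the cached entry keeps the write path simple; the cost is one extra load on the next read of that key.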

There are also other ways to update and remove cached data. With a TTL (time-to-live) or LRU (least recently used) configuration, we can monitor individual cache entries and act on them through the API.
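To make TTL and LRU concrete, here is a minimal sketch that combines both in one small cache. It is not how any particular product implements them; `TtlLruCache`, its parameters and the injectable `clock` are assumptions chosen so the behaviour can be shown without waiting for real time to pass.

```python
import time
from collections import OrderedDict


class TtlLruCache:
    """Hypothetical cache with time-to-live expiry and LRU eviction."""

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()   # key -> (value, expires_at)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            # TTL: the entry is too old, so drop it and report a miss.
            del self._entries[key]
            return None
        # LRU: mark the entry as most recently used.
        self._entries.move_to_end(key)
        return value

    def put(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)
        self._entries.move_to_end(key)
        if len(self._entries) > self._max_entries:
            # LRU: evict the entry that has gone unused the longest.
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)


now = [0.0]   # fake clock, advanced by hand for the demonstration
cache = TtlLruCache(max_entries=2, ttl_seconds=10, clock=lambda: now[0])
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" becomes the most recently used entry
cache.put("c", 3)   # cache is full, so the least recently used "b" is evicted
```

Advancing the fake clock past the ten-second TTL (for example `now[0] = 11`) makes every remaining entry expire on its next lookup.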

The problem with a standalone cache is that it is very limited and can only be used within a single node/machine architecture. Hence the need for distributed caching arose, which is the next topic in this series.

Memory Based Architecture For Enterprise Application – Introduction

We recently had this architecture discussion in one of our company's technical meetings, and I was asked to share the details of Memory Based Architecture. This post shares the details from those sessions.

Memory, the changing philosophy of enterprise applications, and Memory Based Architecture:

Main memory is a high-bandwidth, low-latency component that can match the performance of the processor. Its bandwidth is around a few GB per second, as opposed to roughly a hundred MB per second for disk. Its latency is in the nanosecond range, whereas disk latency is in the millisecond range. Traditionally, main memory was considered an expensive resource and was therefore used sparingly. This perception is now changing because of the sharp drop in RAM prices over the past several years. At the same time, enterprise applications demand more scalable, performance-oriented use of every chunk of available physical memory. Today, enormous amounts of main memory are cheaply available, and many applications use gigabytes or even terabytes of it. Main memory enables application architectures to achieve linear scalability and high performance. These qualities are extremely important for modern enterprise applications that must deliver guaranteed high performance under intensive and unpredictable workloads.

As enterprises use more memory, software vendors have flooded the market with several types of memory based products in order to seize this new business opportunity. These products target various business use cases and architectural scenarios. This series introduces the main memory based product categories along with the business uses and architectural scenarios they support.

When we think of memory based products, high performance is the first thing that comes to mind. Yes, high performance is the primary reason memory based products are used, but it is not the only reason. They are often deployed to reduce I/O operations over the network or to address the high latency of disk based products such as databases. In a typical N-tier architecture, properly designed application code can easily scale out by adding more application servers. The main scalability barrier, however, is the disk based database that is accessed centrally by all the clustered application servers. Memory based products are typically deployed here to overcome the scalability bottleneck posed by the disk based database and to make the application servers more scalable. The following can therefore be considered the primary scenarios for any memory based product.

– Improve application performance
– Reduce network & disk I/O operations
– Overcome scalability barriers & make application servers more scalable

Memory based products can be broadly classified as Caching (Standalone & Distributed Caching), In-Memory Data Grid (IMDG), Main Memory Database (MMDB), and Application Platforms that enable Space Based Architecture. Each category is covered in detail in this series.