Many application domains have more fetched concentric requirements and with very few store operations. Like in E-Commerce, where buyer’s search versus purchase ratio is 9:1 or sometime even wider. Such applications require additional layer of caching in their architecture. Caching is not something new and invented recently, it is there since era of hardware evolution started. What we see with any hardware architecture in form of L1 and L2 CPU cache, are caching mechanism and still in use. L1 and L2 reside in between of processor and RAM, and contain system critical information for processing. Fetching of data from those caches are faster as compare to RAM but size of it is quite small as compare to main memory. This further helps to bifurcate type of data and helps CPU to decide its storing location. Caching with enterprise application directly derives from that same concept. However here it is may be in same CPU or in different machines/nodes connected with parent with very high network cards. So mainly caching in enterprise application is divided into two parts i.e. Standalone Caching and Distribute Caching.
Sometime referred as embedded or in-process caching, is single virtual machine based technique of storing frequently asked data. Standalone caching acts as L1 cache from application perspective and resides in RAM.
The main purpose of using the standalone caching is to improve the performance of the business critical operations. The standalone caching has limited main memory as its disposal. Therefore only data that is frequently used and important for the business critical functions is cached. Standalone caching products are always used as a side-cache for an application’s data access layer. Sidecache refers to an architectural pattern in which the application manages the caching of data from a database or filesystem or from any source. In this scenario, cache is used to temporarily store the objects. Applications first checks existing copy of data and returns if present. When data is not present, it retrieves from data access layer and put into cache for next incoming request.
In caching, some mechanism is required to cope with invalid cached data, data which is updated and still not refreshed in cache. There are several techniques that can be used to deal with invalid data or to remove unused caches to free some memory for other in-demand data.
Such concerns can be handled with by writing API which can take care invalid cache.
The caching product like EHCache provides basic functionality to handle invalidate data. The application decides at what point cached data should be invalidated. Typically strategy employed is whenever data is updated at store, application invalidates the cached data. If copy of cached data is not vital to update on the spot, we can apply some other techniques which can periodically refresh cache by assigning some time based configuration. We can even combine both techniques for multi-server environment.
There are also some other ways to update and remove cached data. With TTL(time-to-live) or LRU(Least frequently used) configuration we can monitor individual cache and take action for them with the help of API.
Problem with Standalone cache is, it is very limited and only can be used with single node/machine architecture. Hence need for distributed cache arisen, next in same series.