Untitled

used to make otherwise unattainable product requirements feasible
follows this principle 'recently requested data is likely to be requested again'
- also called the locality of reference principle
can exist in all level of architecture, but are often found at the level nearest to the front end where they are implemented to return data quickly without taxing downstream levels

application serve cache

cache directly on a request layer node
for local storage of response data
when a request is made, the node will quickly return local cached data if it exists.
if not in cache, the requesting node will query data from disk
can be in memory (RAM), e.g. stored in a variable or local disk (HDD) e.g. stored in local storage on browser
when there are many nodes, there can still have their own cache
- however if your load balancer randomly distributes requests across nodes, the same request will go to different nodes, increasing cache misses
- one way to overcome this is to have global caching or distributed caching