The Ultimate Guide to Caching and CDNs

Hayk Simonyan
10 min read · Jan 16, 2024


Imagine a company is hosting a website on a server in a Google Cloud data center in Finland. It may take around 100ms to load for users in Europe, but it takes 3–5 seconds to load for users in Mexico. Fortunately, there are strategies to minimize this request latency for far-away users.

These strategies are called Caching and Content Delivery Networks (CDNs), which are two important concepts in modern web development and systems design. Let’s break them down:

Caching

Caching is a technique used to improve the performance and efficiency of a system. It involves storing a copy of certain data in a temporary storage area (the cache) so that future requests for that data can be served faster.

Different Caching Strategies

Caching data can greatly improve the performance of applications. There are four common places where cached data can be stored.

1. Browser Caching

Browser caching involves storing website resources on a user’s local computer. When a user revisits a site, the browser can load the site from the local cache rather than fetching everything from the server again.

Disabling Cache in Browser: Users can disable caching through their browser settings, and developers can turn it off from the browser’s developer tools. For instance, Chrome’s Developer Tools offer a “Disable cache” checkbox in the Network tab.

Storage Location: Browser caches store HTML, CSS, and JS bundle files in a dedicated cache directory on the user’s hard drive, managed by the browser.

Cache-Control Header: The Cache-Control: max-age=3600 directive tells the browser to cache the file for 3600 seconds (1 hour).
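
As a quick illustration, here is one way a server might set that header. The Express setup, route, and file path below are assumptions for the sketch, not something the article prescribes.

```typescript
// Minimal sketch: serving a static asset with a one-hour browser cache.
// The Express app, route, and file path are illustrative assumptions.
import express from "express";

const app = express();

app.get("/styles/main.css", (req, res) => {
  // Tell the browser (and any intermediate caches) that this response
  // can be reused for 3600 seconds before revalidating with the server.
  res.set("Cache-Control", "public, max-age=3600");
  res.sendFile("/var/www/assets/styles/main.css");
});

app.listen(3000);
```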

Cache Hit and Cache Miss

Cache Hit occurs when the requested data is already in the cache. On the other hand, a Cache Miss occurs when the requested data is not in the cache and needs to be fetched from the original source.

The Cache Hit Ratio is the percentage of requests served from the cache out of all requests; for example, 900 hits out of 1,000 requests gives a 90% ratio. A higher ratio indicates a more effective cache.

You can check whether a request was a cache hit or miss in the browser’s developer tools, under the Network tab. A response header like X-Cache: Hit signifies a cache hit.

2. Server Caching

Server caching involves storing frequently accessed data on the server, reducing the need for expensive operations like database queries.

Location: Server-side caches are stored on the server itself or a separate cache server, either in memory (like Redis) or on disk.

Procedure: Typically, the server checks the cache for data before querying the database. If the data is in the cache, it is returned directly.

But if data is not in the cache, the server retrieves it from the database, returns it to the user, and then stores it in the cache for future requests.
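
That read path (often called cache-aside) can be sketched roughly as follows. The in-memory Map, the TTL, and the queryDatabase() helper are hypothetical stand-ins for whatever store and data layer you actually use.

```typescript
// Rough cache-aside sketch: check the cache first, fall back to the
// database on a miss, then populate the cache for future requests.
type CacheEntry = { value: unknown; expiresAt: number };

const cache = new Map<string, CacheEntry>();
const TTL_MS = 60_000; // keep entries for one minute (arbitrary choice)

async function getUser(id: string): Promise<unknown> {
  const key = `user:${id}`;
  const hit = cache.get(key);

  // Cache hit: return the stored copy without touching the database.
  if (hit && hit.expiresAt > Date.now()) {
    return hit.value;
  }

  // Cache miss: query the database, then store the result for next time.
  const value = await queryDatabase("SELECT * FROM users WHERE id = ?", [id]);
  cache.set(key, { value, expiresAt: Date.now() + TTL_MS });
  return value;
}

// Stand-in for the real data-access layer (Postgres, MySQL, etc.).
async function queryDatabase(sql: string, params: unknown[]) {
  return { sql, params }; // pretend this is the row returned by the database
}
```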

Cache Invalidation

The procedure we saw above is the Write-Around cache where data is written directly to permanent storage, bypassing the cache. This is used when write performance is less critical.

We also have a Write-Through cache. This is when data is simultaneously written to the cache and the permanent storage. It ensures data consistency but can be slower.

Another type is a Write-Back Cache. In this case, data is first written to the cache and then to permanent storage at a later time. This improves write performance but risks data loss in case of a crash in the server or cache server.

The choice depends on data type, access pattern, and performance requirements.
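
To make the difference concrete, here is a rough sketch of write-through and write-back side by side. The cache and db clients are hypothetical stand-ins, not a specific library’s API.

```typescript
// Sketch of the two write paths. The cache and db clients below stand in
// for Redis, a SQL connection pool, etc.
const cache = { set: async (_key: string, _value: unknown) => { /* write to cache */ } };
const db = { save: async (_key: string, _value: unknown) => { /* write to permanent storage */ } };

// Write-through: the write completes only after both the cache and the
// permanent store have the new value, so the two never disagree.
async function writeThrough(key: string, value: unknown) {
  await cache.set(key, value);
  await db.save(key, value);
}

// Write-back: acknowledge the write after updating the cache and persist
// later. Faster, but anything still waiting to be flushed is lost if the
// cache server crashes first.
const dirty = new Map<string, unknown>();

async function writeBack(key: string, value: unknown) {
  await cache.set(key, value);
  dirty.set(key, value); // remember that this key still needs persisting
}

async function flushDirtyEntries() {
  for (const [key, value] of dirty) {
    await db.save(key, value);
    dirty.delete(key);
  }
}
```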

Eviction Policies: The Decision Makers of Caching

Eviction policies make the crucial decisions about which items to evict (or remove) from the cache when it’s filled to the brim.

Let’s delve into some of the most common eviction policies:

  1. Least Recently Used (LRU): Imagine a queue where the item that has gone unused the longest is the first one to leave. LRU operates on this principle. It tracks the usage history and discards the least recently accessed items first. It’s like a memory that prioritizes the freshest experiences.
  2. First In First Out (FIFO): FIFO is the epitome of fairness — the first item to enter the cache is the first to exit. This method doesn’t consider how often or how recently data was accessed. It’s like a respectful queue where everyone waits their turn, regardless of their importance.
  3. Least Frequently Used (LFU): LFU is like a popularity contest. It keeps tabs on how often an item is accessed. The items with the least hits get the boot first. This policy favors items that prove their worth by being frequently requested.

Each of these policies has its strengths and weaknesses, and the choice depends on the specific needs and behavior of the system in question. For instance, LRU is generally good for scenarios where the most recently accessed data is likely to be needed again. On the other hand, LFU is ideal for situations where access patterns don’t change quickly over time.
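
As a rough illustration of LRU in practice, the sketch below leans on the fact that a JavaScript Map preserves insertion order; the capacity of three entries is an arbitrary choice for the example.

```typescript
// Tiny LRU cache sketch. A Map iterates in insertion order, so the first
// key is always the least recently used one; re-inserting a key on access
// moves it to the "most recent" end.
class LruCache<K, V> {
  private entries = new Map<K, V>();

  constructor(private capacity: number) {}

  get(key: K): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined; // cache miss
    // Refresh recency by moving the key to the end of the Map.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    // Evict the least recently used entry once capacity is exceeded.
    if (this.entries.size > this.capacity) {
      const oldestKey = this.entries.keys().next().value as K;
      this.entries.delete(oldestKey);
    }
  }
}

const recent = new LruCache<string, string>(3);
recent.set("a", "1");
recent.set("b", "2");
recent.set("c", "3");
recent.get("a");       // "a" is now the most recently used
recent.set("d", "4");  // evicts "b", the least recently used entry
```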

Adaptive Policies

Some systems implement adaptive policies, which combine elements of these basic strategies. These more complex algorithms can adjust their behavior based on the evolving patterns of data access, ensuring optimal cache performance across a variety of scenarios.

Custom Policies

In some cases, particularly complex systems may require custom-designed eviction policies. These are tailored specifically to the unique needs of the system and can consider various factors such as the size, type, and cost of retrieving each item.

Impact of Eviction Policies

The choice of an eviction policy can significantly impact the performance of a caching system. A well-chosen policy ensures that the cache stores only the most useful data, thereby reducing latency and improving response time.

3. Database Caching

Database caching is a cornerstone among caching strategies, playing a pivotal role in the performance of applications that rely heavily on database interactions.

Implementation

Database caching can be implemented in two primary ways:

  1. Internal Caching: This is where the database system itself maintains a cache. It’s akin to having a quick reference guide built right into the database.
  2. External Caching: In this approach, an external cache (like Redis or Memcached) works in tandem with the database. Think of it as having a dedicated assistant whose sole job is to remember frequently requested data.

How It Functions

When a query is sent forth into the system, the database cache steps in like a gatekeeper. It first checks its own memory to see if the answer to that query is already known. If the result is present in the cache (a cache hit), it’s returned directly, thus bypassing the need for the database to laboriously process the query again.

Dealing with Cache Misses

A cache miss is akin to hitting a temporary roadblock. When the cache doesn’t have the required data, the system proceeds to execute the query against the database. After retrieving the needed information, it doesn’t just stop there — it stores this result in the cache. This way, the next time the same query knocks on the door, the cache is ready with the answer.
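
With an external cache such as Redis, that miss-handling flow might look roughly like this. The node-redis client usage, the key naming, the five-minute TTL, and the runProductQuery() helper are all assumptions made for the sketch.

```typescript
// Sketch of external database caching with Redis (node-redis v4 style).
import { createClient } from "redis";

const redis = createClient();
await redis.connect();

async function getProduct(productId: string) {
  const key = `product:${productId}`;

  // Cache hit: parse and return the stored query result.
  const cached = await redis.get(key);
  if (cached !== null) {
    return JSON.parse(cached);
  }

  // Cache miss: run the query, then store the result with a 5-minute TTL
  // so the next identical request is answered from the cache.
  const product = await runProductQuery(productId);
  await redis.set(key, JSON.stringify(product), { EX: 300 });
  return product;
}

// Placeholder for the actual database query.
async function runProductQuery(productId: string) {
  return { id: productId, name: "Example product" }; // stand-in result
}
```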

Ideal Use Cases: When Database Caching Shines

Database caching is particularly beneficial for applications that are read-heavy — those where certain queries are like popular tunes played over and over. In these scenarios, caching can significantly reduce the time spent re-running the same queries, thereby improving overall performance.

Eviction Policies

Much like other caching systems, database caches aren’t infinite wells; they, too, need rules to decide what stays and what goes. They employ various eviction policies to manage their memory usage effectively. One common policy is LRU (Least Recently Used).

The Broader Impact

Implementing an effective database caching strategy can lead to remarkable improvements in application responsiveness and efficiency. It reduces the load on the database, speeds up data retrieval, and ensures a smoother user experience. However, like any powerful tool, it requires careful tuning and management. The choice of caching method, the size of the cache, and the eviction policies must all be calibrated to the specific needs of the application for optimal performance.

4. Content Delivery Networks (CDNs)

A CDN is a network of geographically distributed servers, generally used to serve static content such as JavaScript, HTML, CSS, images, and video assets. These servers cache content from the origin server and deliver it to users from the nearest CDN location.

How CDNs Work

A CDN handles a request as follows:

  1. Initial Request: A user asks for a file — maybe an image, a video, or a web page.
  2. Nearest Server Response: This request is swiftly rerouted to the closest CDN server.
  3. Content Delivery: If this server already has the content cached, it’s delivered straight to the user.
  4. Fetching and Forwarding: In cases where the CDN server doesn’t have the content, it retrieves it from the origin server, stores it (caches it), and then sends it to the user. This step ensures that the next time someone asks for the same content, it’s ready to go instantly.
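
The pull-through flow above can be sketched as a highly simplified edge-server handler. The in-memory cache, the origin URL, and the fetch-based logic are illustrative only, not any real CDN’s internals.

```typescript
// Simplified sketch of what a pull-based CDN edge server does per request.
const ORIGIN = "https://origin.example.com"; // hypothetical origin server
const edgeCache = new Map<string, { body: ArrayBuffer; contentType: string }>();

async function handleEdgeRequest(path: string) {
  // Step 3: if the edge already has the content cached, serve it directly.
  const cached = edgeCache.get(path);
  if (cached) return cached;

  // Step 4: otherwise fetch from the origin, cache the copy, then serve it,
  // so the next request for the same path is answered instantly.
  const response = await fetch(`${ORIGIN}${path}`);
  const entry = {
    body: await response.arrayBuffer(),
    contentType: response.headers.get("content-type") ?? "application/octet-stream",
  };
  edgeCache.set(path, entry);
  return entry;
}
```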

CDN Types: Push vs. Pull

Pull-based CDN: Here, the CDN works on demand. It pulls content from the origin server the first time a user requests it, which makes it well suited to sites with regularly updated static content.

Push-based CDN: In this scenario, you’re in charge. You directly upload content to the CDN. This method is ideal for large files that don’t change often but need rapid distribution when they do. It’s akin to sending packages to a courier service for delivery.

Guiding CDN Behavior

We use HTTP response headers to control CDN behavior.

  • Cache-Control: This header is the rulebook for how long the CDN and browser caches should store content, with directives like max-age, no-cache, public, and private.
  • Expires: It’s like an expiration date for content, marking when it becomes stale.
  • Vary: This header adapts the served content based on specific request headers, ensuring the right version of the content is delivered.
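
For example, an origin response might combine these headers like this; the Express route and the one-day lifetime are illustrative assumptions, not prescribed values.

```typescript
// Sketch: an origin response that both a CDN and a browser can cache.
import express from "express";

const app = express();

app.get("/assets/logo.svg", (req, res) => {
  res.set({
    // Any cache (CDN or browser) may store this response for 24 hours.
    "Cache-Control": "public, max-age=86400",
    // Legacy equivalent of max-age: an absolute expiry timestamp.
    "Expires": new Date(Date.now() + 86_400_000).toUTCString(),
    // Cache compressed and uncompressed variants separately.
    "Vary": "Accept-Encoding",
  });
  res.sendFile("/var/www/assets/logo.svg");
});

app.listen(3000);
```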

Choosing Between CDN and Origin Server

Opt for CDN When:

  • You’re distributing static assets (images, CSS, JavaScript).
  • High availability and performance across various regions are crucial.
  • Offloading the origin server is a priority.

Go Direct to Origin Server When:

  • The content is dynamic, frequently changing, or personalized.
  • Real-time processing or fresh data is needed.
  • Complex server-side logic is involved that can’t be handled by a CDN.

Overview

In summary, caching and CDNs are crucial in enhancing web performance, reducing latency, and ensuring a scalable and efficient delivery of content over the internet. They are fundamental in the architecture of any high-traffic, performance-critical web application.

CDN Benefits

  • Reduced Latency: By serving content from locations closer to the user, CDNs significantly reduce latency.
  • High Availability and Scalability: CDNs can handle high traffic loads and are resilient against hardware failures.
  • Improved Security: Many CDNs offer security features like DDoS protection and traffic encryption.

Overall Caching Benefits

  • Reduced Latency: Faster data retrieval since the data is fetched from a nearby cache rather than a remote server.
  • Lowered Server Load: Caching reduces the number of requests to the primary data source, decreasing server load.
  • Improved User Experience: Faster load times lead to a better user experience.

If you’re new here, I’m Hayk. I help web developers secure their first tech jobs or advance to senior roles at the Web Dev Mastery community.

For weekly insights on web development that you won’t want to miss, subscribe to My Newsletter.
