Brief

Caching is the ability to store copies of frequently accessed data in several places along the request-response path. When a consumer requests a resource representation, the request goes through a cache or a series of caches (local cache, proxy cache or reverse proxy) toward the service hosting the resource. If any of the caches along the request path has a fresh copy of the requested representation, it uses that copy to satisfy the request. If none of the caches can satisfy the request, the request travels all the way to the service (or origin server as it is formally known).
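
The lookup path described above can be sketched in a few lines of Python. This is a simplified illustration, not a real HTTP client: each cache is modeled as a plain dictionary, freshness is ignored, and the names fetch, caches and origin are hypothetical.

```python
from typing import Callable, Dict, List


def fetch(url: str, caches: List[Dict[str, str]], origin: Callable[[str], str]) -> str:
    """Walk the caches in request order; fall back to the origin on a miss."""
    for index, cache in enumerate(caches):
        if url in cache:
            body = cache[url]
            # Fill the caches that sit closer to the consumer and missed.
            for nearer in caches[:index]:
                nearer[url] = body
            return body
    # No cache could satisfy the request, so it travels to the origin server.
    body = origin(url)
    for cache in caches:
        cache[url] = body
    return body
```

Calling fetch twice with the same URL and the same list of caches reaches the origin only on the first call; the second call is answered by the first cache in the list.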

All forms of caching in computer science, whether CPU caches, HTTP web server caches, database caches and so on, aim to speed up response times for whatever is requested. Doing so also reduces the load on the component whose data is being cached.

Caching solutions and strategies enhance page delivery speed significantly and reduce the work needed to be done by the backend server.

Caching servers can be set to refresh at specific intervals or in response to certain events to ensure that the freshest content is cached (useful for rapidly changing information, such as breaking news or rapidly changing pricing).

Caching can also protect against total outages, delivering already cached content when servers are down.
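
A minimal sketch of that outage protection, assuming an in-memory cache that stores a timestamp with each entry and an origin callable that raises ConnectionError when the server is down. The function name, the ttl parameter and the error type are illustrative choices, not part of any particular caching product.

```python
from typing import Callable, Dict, Tuple


def get_with_stale_fallback(
    key: str,
    cache: Dict[str, Tuple[float, str]],
    origin: Callable[[str], str],
    ttl: float,
    now: float,
) -> str:
    """Serve fresh entries from cache; on origin failure, fall back to a stale copy."""
    entry = cache.get(key)
    if entry is not None and now - entry[0] < ttl:
        return entry[1]  # fresh hit, the origin is not contacted
    try:
        body = origin(key)
    except ConnectionError:
        if entry is not None:
            return entry[1]  # origin is down: serve the stale copy
        raise  # nothing cached, so the failure is passed on
    cache[key] = (now, body)  # refresh the entry and its timestamp
    return body
```

Whether serving a stale copy is acceptable depends on the content; the must-revalidate directive exists precisely to forbid it.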

Benefits

Effective caching aids both content consumers and content providers. Some of the benefits that caching brings to content delivery are:

Decreased network costs: Content can be cached at various points in the network path between the content consumer and content origin. When the content is cached closer to the consumer, requests will not cause much additional network activity beyond the cache.
Improved responsiveness: Caching enables content to be retrieved faster because an entire network round trip is not necessary. Caches maintained close to the user, like the browser cache, can make this retrieval nearly instantaneous.
Increased performance on the same hardware: For the server where the content originated, more performance can be squeezed from the same hardware by allowing aggressive caching. The content owner can leverage the powerful servers along the delivery path to take the brunt of certain content loads.
Availability of content during network interruptions: With certain policies, caching can be used to serve content to end users even when it may be unavailable for short periods of time from the origin servers.

Terminology

When dealing with caching, there are a few terms that you are likely to come across that might be unfamiliar. Some of the more common ones are below:

Origin server: The origin server is the original location of the content. If you are acting as the web server administrator, this is the machine that you control. It is responsible for serving any content that could not be retrieved from a cache along the request route and for setting the caching policy for all content.
Cache hit ratio: A cache’s effectiveness is measured in terms of its cache hit ratio or hit rate. This is a ratio of the requests able to be retrieved from a cache to the total requests made. A high cache hit ratio means that a high percentage of the content was able to be retrieved from the cache. This is usually the desired outcome for most administrators.
Freshness: Freshness is a term used to describe whether an item within a cache is still considered a candidate to serve to a client. Content in a cache will only be used to respond if it is within the freshness time frame specified by the caching policy.
Stale content: Items in the cache expire according to the cache freshness settings in the caching policy. Expired content is “stale”. In general, expired content cannot be used to respond to client requests. The origin server must be re-contacted to retrieve the new content or at least verify that the cached content is still accurate.
Validation: Stale items in the cache can be validated in order to refresh their expiration time. Validation involves checking in with the origin server to see if the cached content still represents the most recent version of the item.
Invalidation: Invalidation is the process of removing content from the cache before its specified expiration date. This is necessary if the item has been changed on the origin server and having an outdated item in cache would cause significant issues for the client.
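
Two of these terms reduce to simple arithmetic. The sketch below, with hypothetical names CacheStats and is_fresh, computes a hit ratio from hit and miss counters and checks freshness by comparing an entry's age with its allowed lifetime.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0

    def hit_ratio(self) -> float:
        """Requests served from cache divided by total requests."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def is_fresh(stored_at: float, max_age: float, now: float) -> bool:
    """An entry is fresh while its age is strictly below the allowed lifetime."""
    return (now - stored_at) < max_age
```

For example, 3 hits and 1 miss give a hit ratio of 0.75, and an entry stored at time 100 with a 60-second lifetime is fresh at time 150 but stale at time 160.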

What Can be Cached?

Certain content lends itself more readily to caching than others. Some very cache-friendly types of content for most sites are:

Logos and brand images
Non-rotating images in general (navigation icons, for example)
Style sheets
General JavaScript files
Downloadable content
Media files

These tend to change infrequently, so they can benefit from being cached for longer periods of time.

Some items that you have to be careful in caching are:

HTML pages
Rotating images
Frequently modified JavaScript and CSS
Content requested with authentication cookies

Some items that should almost never be cached are:

Assets related to sensitive data (banking info, etc.)
Content that is user-specific and frequently changed

In addition to the above general rules, it’s possible to specify policies that allow you to cache different types of content appropriately.

For instance, if authenticated users all see the same view of your site, it may be possible to cache that view anywhere. If authenticated users see a user-sensitive view of the site that will be valid for some time, you may tell the user’s browser to cache, but tell any intermediary caches not to store the view.
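
One way to express such per-content policies is a small lookup from content category to a Cache-Control value. The category names and the lifetimes below are assumptions chosen for illustration; appropriate values depend on the site.

```python
def cache_control_for(kind: str) -> str:
    """Return an illustrative Cache-Control value for a content category."""
    policies = {
        # Rarely changing assets: any cache may keep them for a long time.
        "static_asset": "public, max-age=31536000",
        # Authenticated users all see the same view: cacheable anywhere.
        "shared_authenticated_view": "public, max-age=300",
        # User-specific view: the browser may cache it, intermediaries may not.
        "user_specific_view": "private, max-age=300",
        # Sensitive data: never stored by any cache.
        "sensitive": "no-store",
    }
    return policies[kind]
```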

Locations Where Web Content Is Cached

Content can be cached at many different points throughout the delivery chain:

Browser cache: Web browsers themselves maintain a small cache. Typically, the browser sets a policy that dictates the most important items to cache. This may be user-specific content or content deemed expensive to download and likely to be requested again.
Intermediary caching proxies: Any server in between the client and your infrastructure can cache certain content as desired. These caches may be maintained by ISPs or other independent parties.
Reverse cache: Your server infrastructure can implement its own cache for backend services. This way, content can be served from the point of contact instead of hitting backend servers on each request.

Each of these locations can, and often does, cache items according to its own caching policy and the policies set at the content origin.

Caching Headers

Caching policy is dependent upon two different factors. The caching entity itself gets to decide whether or not to cache acceptable content. It can decide to cache less than it is allowed to cache, but never more.

The majority of caching behavior is determined by the caching policy, which is set by the content owner. These policies are mainly articulated through the use of specific HTTP headers.

Through various iterations of the HTTP protocol, a few different cache-focused headers have arisen with varying levels of sophistication. The ones you probably still need to pay attention to are below:

Expires: The Expires header is very straightforward, although fairly limited in scope. Basically, it sets a time in the future when the content will expire. After that point, any requests for the same content will have to go back to the origin server. This header is probably best used only as a fallback.
Cache-Control: This is the more modern replacement for the Expires header. It is well supported and implements a much more flexible design. In almost all cases, this is preferable to Expires, but it may not hurt to set both values. We will discuss the specifics of the options you can set with Cache-Control a bit later.
ETag: The ETag header is used with cache validation. The origin can provide a unique ETag for an item when it initially serves the content. When a cache needs to validate the content it has on hand upon expiration, it can send back the ETag it has for the content (in an If-None-Match request header). The origin will either tell the cache that the content is the same (a 304 Not Modified response), or send the updated content (with the new ETag).
Last-Modified: This header specifies the last time that the item was modified. This may be used as part of the validation strategy to ensure fresh content.
Content-Length: While not specifically involved in caching, the Content-Length header is important to set when defining caching policies. Certain software will refuse to cache content if it does not know in advance the size of the content it will need to reserve space for.
Vary: A cache typically uses the requested host and the path to the resource as the key with which to store the cache item. The Vary header can be used to tell caches to pay attention to an additional header when deciding whether a request is for the same item. This is most commonly used to tell caches to key by the Accept-Encoding header as well, so that the cache will know to differentiate between compressed and uncompressed content.
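
The ETag validation exchange can be sketched from the origin's side as follows. This is a simplified model rather than a real server: the helper names make_etag and respond are hypothetical, the ETag is derived from a content hash (one of several possible schemes), and the If-None-Match value is compared as a single exact string, ignoring lists, weak validators and the "*" wildcard.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted validator from the content (an assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a request that may carry If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match == etag:
        # The cached copy is still current: no body needs to be sent.
        return 304, headers, b""
    headers["Content-Length"] = str(len(body))
    return 200, headers, body
```

A cache that stored the first response can send its ETag back later; if the content is unchanged it receives a 304 with an empty payload, otherwise a 200 with the new content and a new ETag.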

An Aside about the Vary Header

The Vary header provides you with the ability to store different versions of the same content at the expense of diluting the entries in the cache.

In the case of Accept-Encoding, setting the Vary header allows for a critical distinction to take place between compressed and uncompressed content. This is needed to correctly serve these items to browsers that cannot handle compressed content and is necessary in order to provide basic usability. One characteristic that tells you that Accept-Encoding may be a good candidate for Vary is that it only has two or three possible values.

Items like User-Agent might at first glance seem to be a good way to differentiate between mobile and desktop browsers to serve different versions of your site. However, since User-Agent strings are non-standard, the result will likely be many versions of the same content on intermediary caches, with a very low cache hit ratio. The Vary header should be used sparingly, especially if you do not have the ability to normalize the requests in intermediate caches that you control (which may be possible, for instance, if you leverage a content delivery network).
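
The effect of Vary on the cache key, and the value of normalizing requests, can be illustrated with a short sketch. It assumes request header names are already lowercased and reduces Accept-Encoding to two buckets, deliberately ignoring quality values; the function names are hypothetical.

```python
from typing import Dict, Tuple


def normalize_accept_encoding(value: str) -> str:
    """Collapse the many possible Accept-Encoding strings into two buckets."""
    codings = [part.split(";")[0].strip().lower() for part in value.split(",")]
    return "gzip" if "gzip" in codings else "identity"


def cache_key(host: str, path: str, request_headers: Dict[str, str], vary: str) -> Tuple[str, ...]:
    """Build a key from host and path plus every request header named in Vary."""
    key = [host.lower(), path]
    for name in (n.strip().lower() for n in vary.split(",") if n.strip()):
        value = request_headers.get(name, "")
        if name == "accept-encoding":
            value = normalize_accept_encoding(value)
        key.append(f"{name}={value}")
    return tuple(key)
```

With normalization, requests sending "gzip, deflate, br" and "br;q=1.0, gzip;q=0.8" map to the same cache entry. Without it, each distinct header string would create its own entry, which is the dilution problem that makes User-Agent a poor choice for Vary.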

How Cache-Control Flags Impact Caching

Above, we mentioned how the Cache-Control header is used for modern cache policy specification. A number of different policy instructions can be set using this header, with multiple instructions being separated by commas.

Some of the Cache-Control options you can use to dictate your content’s caching policy are:

no-cache: This instruction specifies that any cached content must be re-validated on each request before being served to a client. This, in effect, marks the content as stale immediately, but allows it to use revalidation techniques to avoid re-downloading the entire item again.
no-store: This instruction indicates that the content cannot be cached in any way. This is appropriate to set if the response represents sensitive data.
public: This marks the content as public, which means that it can be cached by the browser and any intermediate caches. For requests that utilized HTTP authentication, responses are marked private by default. This directive overrides that setting.
private: This marks the content as private. Private content may be stored by the user’s browser, but must not be cached by any intermediate parties. This is often used for user-specific data.
max-age: This setting configures the maximum time that the content may be served from a cache before it must be revalidated or re-downloaded from the origin server. In essence, this replaces the Expires header for modern browsing and is the basis for determining a piece of content’s freshness. This option takes its value in seconds; by convention, freshness times longer than one year (31536000 seconds) are avoided.
s-maxage: This is very similar to the max-age setting, in that it indicates the amount of time that the content can be cached. The difference is that this option is applied only to intermediary caches. Combining this with the above allows for more flexible policy construction.
must-revalidate: This indicates that the freshness information indicated by max-age, s-maxage or the Expires header must be obeyed strictly. Stale content cannot be served under any circumstance. This prevents cached content from being used in case of network interruptions and similar scenarios.
proxy-revalidate: This operates the same as the above setting, but only applies to intermediary proxies. In this case, the user’s browser can potentially be used to serve stale content in the event of a network interruption, but intermediate caches cannot be used for this purpose.
no-transform: This option tells caches that they are not allowed to modify the received content for performance reasons under any circumstances. This means, for instance, that a cache is not allowed to send a compressed version of content that it received uncompressed from the origin server.

These can be combined in different ways to achieve various caching behaviors. Some values are mutually exclusive: no-cache, no-store, and the regular caching behavior indicated by the absence of either exclude one another, as do public and private.

The no-store option supersedes the no-cache if both are present. For responses to unauthenticated requests, public is implied. For responses to authenticated requests, private is implied. These can be overridden by including the opposite option in the Cache-Control header.
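
To make these rules concrete, here is a minimal sketch of how a cache might interpret a Cache-Control value. It is a simplification under stated assumptions: directives are split on commas (so quoted arguments that contain commas are not handled), the Expires header and heuristic freshness are ignored, and the function names are hypothetical.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(value: str, shared_cache: bool) -> Optional[int]:
    """Seconds the response may be served without revalidation.

    None means "do not store"; 0 means "store, but revalidate before each use".
    """
    d = parse_cache_control(value)
    if "no-store" in d:
        return None  # no-store supersedes no-cache
    if shared_cache and "private" in d:
        return None  # intermediaries must not store private content
    if "no-cache" in d:
        return 0
    if shared_cache and d.get("s-maxage") is not None:
        return int(d["s-maxage"])  # s-maxage applies only to shared caches
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0  # assumption: with no explicit lifetime, always revalidate
```

For the value "public, max-age=600, s-maxage=3600", a browser cache would get a lifetime of 600 seconds and an intermediary cache 3600 seconds.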

Further reading