Understanding ETags

 curl -I http://site.com --header 'If-None-Match: "0f46357eafa5c719e8e3bf277a993e07"'


Configure ETags

Entity tags (ETags) are a mechanism that web servers and browsers use to determine whether the component in the browser's cache matches the one on the origin server. (An "entity" is another word for a "component": images, scripts, stylesheets, etc.) ETags were added to provide a mechanism for validating entities that is more flexible than the last-modified date. An ETag is a string that uniquely identifies a specific version of a component. The only format constraint is that the string be quoted. The origin server specifies the component's ETag using the ETag response header.
      HTTP/1.1 200 OK
      Last-Modified: Tue, 12 Dec 2006 03:03:59 GMT
      ETag: "10c24bc-4ab-457e1c1f"
      Content-Length: 12195

Later, if the browser has to validate a component, it uses the If-None-Match header to pass the ETag back to the origin server. If the ETags match, the server returns a 304 status code with no body, reducing the response by 12195 bytes in this example.
      GET /i/yahoo.gif HTTP/1.1
      Host: us.yimg.com
      If-Modified-Since: Tue, 12 Dec 2006 03:03:59 GMT
      If-None-Match: "10c24bc-4ab-457e1c1f"

      HTTP/1.1 304 Not Modified
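
The exchange above can be sketched from the server's side as a small decision function. This is a minimal illustration, not how any particular server implements it: the function name `respond` and its arguments are hypothetical, and the comma split is a simplification that assumes the tags themselves contain no commas.

```python
def respond(request_headers, current_etag, body):
    """Return (status, headers, body) for a GET, honouring If-None-Match.

    Hypothetical helper: request_headers is a plain dict and
    current_etag is the quoted ETag of the component as it exists now.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The header may list several tags separated by commas, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or current_etag in candidates:
            # The cached copy is still valid: send headers only, no body.
            return 304, {"ETag": current_etag}, b""
    # No validator, or it did not match: send the full component.
    headers = {"ETag": current_etag, "Content-Length": str(len(body))}
    return 200, headers, body
```

With a 12195-byte body, a matching If-None-Match value yields a 304 with an empty body, while a missing or different value yields the full 200 response.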

The problem with ETags is that they typically are constructed using attributes that make them unique to a specific server hosting a site. ETags won't match when a browser gets the original component from one server and later tries to validate that component on a different server, a situation that is all too common on Web sites that use a cluster of servers to handle requests. By default, both Apache and IIS embed data in the ETag that dramatically reduces the odds of the validity test succeeding on web sites with multiple servers.
The ETag format for Apache 1.3 and 2.x is inode-size-timestamp. Although a given file may reside in the same directory across multiple servers, and have the same file size, permissions, timestamp, etc., its inode is different from one server to the next.
IIS 5.0 and 6.0 have a similar issue with ETags. The format for ETags on IIS is Filetimestamp:ChangeNumber. A ChangeNumber is a counter used to track configuration changes to IIS. It's unlikely that the ChangeNumber is the same across all IIS servers behind a web site.
The end result is that ETags generated by Apache and IIS for the exact same component won't match from one server to another. If the ETags don't match, the user doesn't receive the small, fast 304 response that ETags were designed for; instead, they'll get a normal 200 response along with all the data for the component. If you host your web site on just one server, this isn't a problem. But if you have multiple servers hosting your web site, and you're using Apache or IIS with the default ETag configuration, your users are getting slower pages, your servers have a higher load, you're consuming greater bandwidth, and proxies aren't caching your content efficiently. Even if your components have a far future Expires header, a conditional GET request is still made whenever the user hits Reload or Refresh.
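
To make the inode problem concrete, here is a rough sketch that builds a tag in the spirit of the inode-size-timestamp format described above. It is an approximation, not Apache's actual code. Two identical files on one machine stand in for the copies held by two servers; the helper names and file names are made up for the illustration, and the timestamp 1165892639 is the Last-Modified time from the earlier example expressed as Unix seconds.

```python
import os
import tempfile


def inode_size_mtime_etag(path):
    """Tag built from inode, size and modification time, all in hex."""
    st = os.stat(path)
    return '"%x-%x-%x"' % (st.st_ino, st.st_size, int(st.st_mtime))


def size_mtime_etag(path):
    """Same idea without the inode, so it depends only on the file itself."""
    st = os.stat(path)
    return '"%x-%x"' % (st.st_size, int(st.st_mtime))


def make_copies():
    """Create two byte-identical files with the same modification time."""
    directory = tempfile.mkdtemp()
    timestamp = 1165892639  # Tue, 12 Dec 2006 03:03:59 GMT
    paths = []
    for name in ("server_a.gif", "server_b.gif"):  # hypothetical names
        path = os.path.join(directory, name)
        with open(path, "wb") as handle:
            handle.write(b"GIF89a" + b"\0" * 100)
        os.utime(path, (timestamp, timestamp))
        paths.append(path)
    return paths
```

On a typical Unix filesystem the two copies get different inode numbers, so `inode_size_mtime_etag` returns different tags for them even though size and timestamp agree, while `size_mtime_etag` returns the same tag for both.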
If you're not taking advantage of the flexible validation model that ETags provide, it's better to just remove the ETag altogether. The Last-Modified header validates based on the component's timestamp. And removing the ETag reduces the size of the HTTP headers in both the response and subsequent requests. This Microsoft Support article describes how to remove ETags. In Apache, this is done by simply adding the following line to your Apache configuration file:
      FileETag none

Entity Tag Cache Validators

The ETag response-header field value, an entity tag, provides for an "opaque" cache validator. This might allow more reliable validation in situations where it is inconvenient to store modification dates, where the one-second resolution of HTTP date values is not sufficient, or where the origin server wishes to avoid certain paradoxes that might arise from the use of modification dates.
Entity tags are described in section 3.11 of RFC 2616. The headers used with entity tags are described in sections 14.19, 14.24, 14.26 and 14.44 of the same document.

An ETag, or entity tag, is part of HTTP, the protocol of the World Wide Web. It is one of several mechanisms for cache validation, and it allows a browser to make conditional requests.

This allows caches to be more efficient and saves bandwidth, as the web server does not need to send the full response if the content has not changed.
An ETag is an opaque identifier assigned by the web server to a specific version of the resource found at a URL. If the resource changes, the web server assigns it a new ETag.
Used in this manner ETags are similar to fingerprints, and they can be quickly compared to determine if two versions of a resource are the same or are different.
Comparing ETags only makes sense with respect to one URL—ETags for resources obtained from different URLs may or may not be equal and no meaning can be inferred from their comparison.
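
As an illustration of the fingerprint idea, one possible way to derive a tag is to hash the resource's bytes. This is only a sketch under that assumption: the function name `content_etag` is hypothetical, and servers are free to generate their tags in other ways.

```python
import hashlib


def content_etag(body: bytes) -> str:
    """Return a quoted tag derived from the content of the resource."""
    digest = hashlib.sha256(body).hexdigest()[:32]
    return '"' + digest + '"'


old_tag = content_etag(b"body { color: black; }")
same_tag = content_etag(b"body { color: black; }")
new_tag = content_etag(b"body { color: navy; }")
```

Here `old_tag` and `same_tag` are equal because the bytes are identical, while `new_tag` differs because the content changed. A tag built this way depends only on the content, not on which machine served it.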

Deployment risks

The use of ETags is optional (not mandatory, unlike some other HTTP headers). The method by which ETags are generated is not defined in the HTTP specification.

Strong and weak ETag validation

The ETag mechanism supports both strong validation and weak validation. They are distinguished by the presence of an initial "W/" in the ETag identifier, as follows:
"123456789" (a strong ETag validator)
W/"123456789" (a weak ETag validator)
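
The distinction can be sketched with a small parser and two comparison functions. The function names are hypothetical. The comparison rules follow the usual HTTP definitions: a strong comparison requires both tags to be strong and identical, while a weak comparison ignores the W/ prefix.

```python
def parse_etag(value):
    """Split an ETag value into (is_weak, opaque_tag)."""
    value = value.strip()
    is_weak = value.startswith("W/")
    opaque = value[2:] if is_weak else value
    if len(opaque) < 2 or not (opaque.startswith('"') and opaque.endswith('"')):
        raise ValueError("entity tag must be a quoted string: %r" % value)
    return is_weak, opaque


def strong_match(a, b):
    """True only if neither tag is weak and the quoted strings are identical."""
    weak_a, tag_a = parse_etag(a)
    weak_b, tag_b = parse_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a, b):
    """True if the quoted strings are identical, regardless of any W/ prefix."""
    return parse_etag(a)[1] == parse_etag(b)[1]
```

For example, `W/"123456789"` and `"123456789"` match under the weak comparison but not under the strong one.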
In normal usage, when a URL is retrieved, the web server returns the resource along with its corresponding ETag value, which is placed in the ETag field of the HTTP response header:
ETag: "686897696a7c876b7e"
The client caches the resource along with its ETag value. Later, if the client requests the same URL again, it passes the saved ETag for that resource in the If-None-Match header, which looks like this:
If-None-Match: "686897696a7c876b7e"
On this subsequent request, the web server compares the ETag value from the If-None-Match header with the current ETag of the resource. If the two values match, the server simply returns a 304 Not Modified status code, which tells the browser to load the content from its cache. If they do not match, the server returns the whole response along with the new ETag value for that resource.
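
Seen from the client, the same flow amounts to remembering the tag next to the cached body and replaying it on the next request. The sketch below is a simplified model, not a real browser cache: `CachingClient` and `make_origin` are hypothetical names, and the "server" is an in-process function rather than a network connection.

```python
class CachingClient:
    """Toy client cache keyed by URL; stores (etag, body) pairs."""

    def __init__(self, origin):
        self.origin = origin  # callable(url, headers) -> (status, headers, body)
        self.cache = {}

    def get(self, url):
        headers = {}
        if url in self.cache:
            # Replay the saved validator so the server can answer with 304.
            headers["If-None-Match"] = self.cache[url][0]
        status, response_headers, body = self.origin(url, headers)
        if status == 304:
            return self.cache[url][1]  # reuse the cached body
        etag = response_headers.get("ETag")
        if etag:
            self.cache[url] = (etag, body)
        return body


def make_origin(etag, body):
    """Stand-in for a web server; also records the status codes it sent."""
    sent = []

    def origin(url, headers):
        if headers.get("If-None-Match") == etag:
            sent.append(304)
            return 304, {"ETag": etag}, b""
        sent.append(200)
        return 200, {"ETag": etag}, body

    return origin, sent
```

Requesting the same URL twice through `CachingClient` produces a 200 followed by a 304 on the origin side, and the caller receives the same bytes both times.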
To enable ETags, you can use the following directives in an .htaccess file.
FileETag MTime Size
ExpiresActive on
ExpiresDefault "access plus 1 year"
ETags can be disabled by placing the following directives in an .htaccess file.
Header unset ETag
FileETag None
ETags may be flushed by clearing the browser cache (but browser implementations may vary).

