When your equipment is not in tune with demand, your customers bail out and hit the road to a new service. In the world of Internet business, your equipment is your site, and every second is precious to your customer. How would you react to a site that loads so slowly that you feel your time is wasted just trying to visit it? You would rather spend that time on another site that loads faster. Your customers react the same way when they see that your site loads too slowly. So, that’s the question:
Does your website load quickly enough?
Being in the support industry, I often see the following quotes:
My sites are slow.......
Site running slow.....
I was wondering, the site is running extremely slow! ..... etc etc
Once, a customer asked me whether it’s possible to increase site speed using HTTP headers.
The answer is yes.
Assuming that your site is hosted on a reasonably fast server with good uptime, optimizing the site’s code is the most important factor in improving site speed. Beyond that, another method is to use server configuration files that are accessible to end users.
One such file is .htaccess. It can greatly improve your site’s loading time if mod_expires and mod_headers are compiled with Apache. Apart from this, there is one more simple method: gzip compression (also configured through .htaccess). Both are described in this article.
Obviously, these two techniques can also be added to the “Htaccess Uses” list.
.htaccess, commonly known as the “end user’s Apache configuration file”, is used to customize the configuration of a particular directory using the directives provided by Apache.
This file is mainly useful when users don’t have access to the main server configuration file. You can also give the file another name by using the “AccessFileName” directive, which is set in the main server configuration. For example, if you wish to call .htaccess by the name bobcares, use the following directive:
AccessFileName bobcares
You can learn more about htaccess here.
How does caching increase site speed?
Caching is the temporary storage of frequently accessed data closer to the client browser. It avoids round trips to the origin server, and thereby saves time and reduces bandwidth consumption and server load. The basic idea behind caching is simple: instead of re-downloading a resource every time it is needed, keep a local copy and reuse it for as long as it is still valid.
“A Web cache sits between one or more Web servers (also known as origin servers) and a client or many clients, and watches requests come by, saving copies of the responses - like HTML pages, images and files (collectively known as representations) - for itself. Then, if there is another request for the same URL, it can use the response that it has, instead of asking the origin server for it again.”
Freshness and validation are the two main concepts in web caching. Freshness refers to whether a cached representation is still in the same state as the resource on the origin server. Validation information is used by servers and caches to communicate whether a representation has changed.
If the server confirms that the cached object is still valid, the browser will use it; otherwise a fresh copy is served. Servers also tell caches how long the associated representation remains fresh.
Let’s see how cache works:
Caching is based on a set of rules. Most of these rules are defined by the protocol (HTTP), and some are set by the cache administrator. Some of the common rules are:
- If the request is authenticated or secure (HTTPS), it won’t be cached.
- If the response headers contain neither a validator (e.g. an ETag or Last-Modified header) nor any freshness information, the response is considered uncacheable.
- A cached representation is considered fresh if an expiry time or other age-controlling header is set and it is still within the fresh period.
- If the cached content is stale, the origin server will be asked to validate it.
Ways to Implement Caching:
Here are two awesome ways to implement caching on your website using Apache’s .htaccess file. Both methods are extremely simple to set up and will speed up your site!
1. Using mod_expires
Requirements: mod_expires (for the Expires directives) and mod_headers (for the Header directives used in the next section) must be compiled with Apache (static/dynamic).
Using mod_expires we can set a lifetime for the pages served or for the content within web pages. In this way, the server tells caches how long the associated representation remains fresh; after this period, the content will be requested from the origin again. This method is excellent for pages that change at known times or that change very rarely. The only valid value in an Expires header is an HTTP date, which is expressed in GMT, not local time.
Eg: Expires: Sun, 25 Jun 2006 14:57:12 GMT
How can this be done using htaccess?
We can target files by their extensions
ExpiresDefault "access plus 1 year"
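As a minimal sketch of extension-based targeting, the directive can be wrapped in a `<FilesMatch>` block. The extension list here is an illustrative assumption and is not taken from the original listing. The `<IfModule>` guard is also an addition, so that the file keeps working if mod_expires is not loaded.

```apache
<IfModule mod_expires.c>
    # Expires headers are only generated when this is switched on
    ExpiresActive On

    # Hypothetical list of static file extensions; adjust it to your site
    <FilesMatch "\.(ico|gif|jpe?g|png|css|js)$">
        ExpiresDefault "access plus 1 year"
    </FilesMatch>
</IfModule>
```

Files that do not match the pattern are left untouched by this block, so they keep whatever caching behaviour the server already applies.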
or by their type
ExpiresActive On
ExpiresDefault "access plus 1 month"
ExpiresByType text/html "access plus 1 month 15 days 2 hours"
ExpiresByType image/gif "modification plus 1 month"
ExpiresByType image/png "modification plus 1 month"
ExpiresByType image/jpeg "modification plus 1 month"
Here “ExpiresActive On” switches on the generation of Expires headers, and “ExpiresDefault” defines the default expiry time for all files that are not specified separately using “ExpiresByType”. Note that the MIME type for JPEG images is image/jpeg, not image/jpg.
You may also use the shorthand formats, such as “ExpiresDefault A300”, “ExpiresByType text/html A300” or “ExpiresByType text/html M300”. The first two set the expiry time to 300 seconds after access (A), and the last one sets it to 300 seconds after the file’s last modification (M).
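The shorthand syntax could be used in .htaccess roughly as follows. This is only a sketch: the choice of text/html for the second rule is an assumption made for illustration.

```apache
<IfModule mod_expires.c>
    ExpiresActive On

    # A300: expire 300 seconds after the client accessed the file
    ExpiresDefault A300

    # M300: expire 300 seconds after the file was last modified
    ExpiresByType text/html M300
</IfModule>
```

With the M form, a file that was last modified more than 300 seconds ago is sent with an Expires time that is already in the past, so the A form is usually the safer choice for content that rarely changes.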
Although this seems useful, there are some limitations in using “Expires” headers. First, the clocks on the web server and the cache must be synchronized. Also, it’s easy to forget that we’ve set some content to expire at a particular time.
Due to these limitations, HTTP 1.1 introduced a new header known as the Cache-Control header.
2. Using Cache-control Headers
Here are some common Cache-Control directives.
- max-age: Maximum amount of time (in seconds) for which the representation is considered fresh.
- no-cache: Caches must contact the origin server for validation before releasing a cached copy.
- public: The response may be stored by any cache, even if it would normally be uncacheable, for example because it is behind authentication.
- private: All or part of the response is intended for a single user and must not be stored by a shared cache.
- no-store: Tells caches not to keep a copy of the representation.
- must-revalidate: Once a representation becomes stale, caches must not use it without revalidating it with the origin server.
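Let’s see how this can be implemented using .htaccess. Each “Header” directive is meant to be placed inside a container such as <FilesMatch> that selects the files it applies to.
Cache for one week (604800 seconds).
Header set Cache-Control "max-age=604800, public"
All html files for two hours.
Header set Cache-Control "max-age=7200, public"
No caching for php, perl and cgi scripts.
Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"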
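Put together, the three rules above might look like the following sketch. The `<FilesMatch>` patterns are assumptions chosen for illustration (the original listing does not show which extensions were matched), and the `<IfModule>` guard is added so that the file stays valid when mod_headers is absent.

```apache
<IfModule mod_headers.c>
    # Static assets (hypothetical list): one week = 7 * 24 * 3600 seconds
    <FilesMatch "\.(gif|jpe?g|png|css|js)$">
        Header set Cache-Control "max-age=604800, public"
    </FilesMatch>

    # HTML pages: two hours = 2 * 3600 seconds
    <FilesMatch "\.html?$">
        Header set Cache-Control "max-age=7200, public"
    </FilesMatch>

    # php, perl and cgi scripts: never cache
    <FilesMatch "\.(php|pl|cgi)$">
        Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
    </FilesMatch>
</IfModule>
```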
Also we can use “Expires” along with “cache-control”. Let’s modify the above.
Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
Header unset Cache-Control
Header unset Expires
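One possible way to combine the two headers for scripts is sketched below. The extension list and the past date used for “Expires” are assumptions, not values from the original listing.

```apache
<IfModule mod_headers.c>
    <FilesMatch "\.(php|pl|cgi)$">
        # HTTP/1.1 caches follow Cache-Control
        Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"

        # Older HTTP/1.0 caches only understand Expires;
        # a date in the past marks the response as already stale
        Header set Expires "Thu, 01 Jan 1970 00:00:00 GMT"
    </FilesMatch>
</IfModule>
```

By contrast, “Header unset Cache-Control” and “Header unset Expires” remove those headers from the response altogether, which leaves caches to fall back on their own heuristics.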
Suppose you want some files, say mp3 and mp4 files, to be cached forever. In that case, use:
Header set Cache-Control "max-age=31536000, public"
31536000 == 1 year (effectively infinite on Internet time).
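A minimal sketch of how this rule could be limited to mp3 and mp4 files, assuming the files are selected by extension with `<FilesMatch>`:

```apache
<IfModule mod_headers.c>
    # One year = 365 * 24 * 3600 = 31536000 seconds
    <FilesMatch "\.(mp3|mp4)$">
        Header set Cache-Control "max-age=31536000, public"
    </FilesMatch>
</IfModule>
```

If such a file ever has to change, publishing it under a new name is the simplest way to get past copies that are already cached.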
So that’s all about caching and site speed. Now let’s see how we can speed up sites using compression.
Improving site-speed using Compression (mod_gzip)
Requirements: mod_gzip must be compiled with Apache (static/dynamic).
mod_gzip is an Apache module that compresses content before sending it to the client browser. It uses the same compression as gzip, and the browser needs no plugins or additional software to take advantage of it. Less data is transferred this way, which decreases download time and saves bandwidth.
The time taken to compress the content, transfer it to the client and decompress it at the client end is usually less than the time taken to transfer the original uncompressed files across the wire.
When a client sends a request, Apache determines whether it should use mod_gzip by checking whether the client sent an “Accept-Encoding” HTTP request header. If the request contains something like “Accept-Encoding: gzip”, the client is announcing that it can understand content encoded in gzip format, and mod_gzip will compress all configured file types when they are served to that client. mod_gzip then compresses the outgoing content and includes the following response headers:
Content-Type: text/html
Content-Encoding: gzip
This means that the content from the server is gzip-encoded, but after decompression it should be treated as HTML. This type of compression can be used for static files as well as for dynamic pages such as those produced by Server-Side Includes (SSI). You can also use it for your Cascading Style Sheets (CSS).
Now let us see how gzip compression can be enabled via .htaccess:
The following configuration states that:
- All HTML and CSS files (text/html, text/css) and php files will be compressed.
- All PDF documents will be compressed.
- JavaScript files will not be compressed.
<IfModule mod_gzip.c>
mod_gzip_on Yes
mod_gzip_item_include file \.html$
mod_gzip_item_include file \.htm$
mod_gzip_item_include file \.shtml$
mod_gzip_item_include mime ^text/html.*
mod_gzip_item_include file \.css$
mod_gzip_item_include mime ^text/css.*
mod_gzip_item_include file \.php$
mod_gzip_item_include file \.pdf$
mod_gzip_item_include mime ^application/pdf.*
mod_gzip_item_exclude file \.js$
</IfModule>
Other commonly used directives are:
mod_gzip_minimum_file_size -- Minimum size ( in bytes ) of a file eligible for compression.
mod_gzip_maximum_file_size -- Maximum size ( in bytes ) of a file eligible for compression.
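These two limits might be set as in the sketch below. The byte values are illustrative assumptions only, and whether your host permits these directives in .htaccess depends on how the server is configured.

```apache
<IfModule mod_gzip.c>
    mod_gzip_on Yes

    # Very small files gain little from compression (value is illustrative)
    mod_gzip_minimum_file_size 500

    # Skip very large files to limit CPU and memory use (value is illustrative)
    mod_gzip_maximum_file_size 500000
</IfModule>
```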
mod_gzip is for Apache 1.3. If you’re using Apache 2.x, you’ll need to use mod_deflate, which is not covered in this article.
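Although mod_deflate is outside the scope of this article, a rough equivalent of the mod_gzip rules above might look like the sketch below on Apache 2.x. It is an assumption-laden starting point rather than a tested configuration: it selects content by MIME type only, and on Apache 2.4 the “AddOutputFilterByType” directive additionally requires mod_filter.

```apache
<IfModule mod_deflate.c>
    # Compress HTML, CSS and PDF responses; JavaScript is left out,
    # mirroring the mod_gzip example
    AddOutputFilterByType DEFLATE text/html text/css application/pdf
</IfModule>
```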
Pros and Cons:
Since it compresses the content prior to transfer, a lot of bandwidth is saved, resulting in faster download times. The downside is that it creates additional load on the server. Also, some browsers still have trouble with compressed content.
Kewl! If you’ve made it this far, you now know the easiest methods to optimize your site.
Generally speaking, caching and compression are two simple techniques which can be used to improve a site’s performance. The Expires header denotes a point in time after which the representation should be considered out of date (stale). When a response carries both an Expires header and a Cache-Control max-age, the Cache-Control header takes precedence.
By caching objects that change infrequently for longer periods, and caching frequently-updated content for shorter periods (or not at all) you can speed up perceived load times while maintaining fresh content.
About the author: Joseph Cecil joined Bobcares in 2006 and has been with Bobcares since then. He specializes in Linux server administration, especially cPanel servers. Apache and shell scripting are his major areas of interest.