HTTP compression is a simple way to improve site performance and decrease bandwidth, with no configuration required on the client side. Find out how it works, and how to configure Apache and IIS to compress data on the fly.
Most Internet connections have a finite amount of bandwidth, so anything administrators can do to speed up transfers is worthwhile. One way to do this is HTTP compression, a capability built into both browsers and servers that can dramatically improve site performance by reducing the time required to transfer data between the server and the client. The principle is nothing new: the data is simply compressed. What sets it apart is that compression happens on the fly, straight from the server to the client, and often without users knowing.
HTTP compression is easy to enable and requires no client-side configuration to obtain benefits, making it a very easy way to get extra performance. This article discusses how it works, its advantages, and how to configure Apache and IIS to compress data on the fly.
Most users know compression from bundling a group of files that they then download, extract, and open. But compression can also be used passively to compress documents as they are being transferred to a client's browser. Because it's a passive process, the server can reduce the size of the pages it sends, reducing both the download time for users and their bandwidth usage.
Working the numbers helps clarify the gains. You can typically reduce an HTML document to less than half of its original size. This, in turn, halves the amount of time the client needs to download the page as well as the amount of bandwidth required. All of this is achieved without actually changing the way the site works, its page layout, or the content. The only thing that changes is the way the information is transferred.
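The arithmetic can be sketched in a few lines of Python. The page size and link speed below are hypothetical figures chosen for illustration, and the helper transfer_seconds is not part of any library; it also ignores latency and protocol overhead, so treat the result as an idealized estimate.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Idealized time to move size_bytes over a link (no latency, no protocol overhead)."""
    return size_bytes * 8 / link_bits_per_second


# Hypothetical figures: a 60,000-byte HTML page on a 512 kbit/s link.
page_bytes = 60_000
link_bps = 512_000

uncompressed = transfer_seconds(page_bytes, link_bps)
# Assume the page compresses to half of its original size.
compressed = transfer_seconds(page_bytes // 2, link_bps)

print(f"uncompressed: {uncompressed:.2f} s, compressed: {compressed:.2f} s")
```

Halving the payload halves the idealized transfer time; real-world gains will be somewhat smaller once fixed costs such as connection setup are included.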
Unfortunately, there are limitations.
Suitable File Types
Not all files are suitable for compression. For obvious reasons, files that are already compressed, such as JPEGs, GIFs, PNGs, movies, and bundled content (e.g., Zip, Gzip, and bzip2 files), are not going to compress appreciably further with a simple HTTP compression filter. Therefore, you will not get much benefit from compressing these files, or from compressing a site that relies heavily on them.
However, sites that have a lot of plain-text content, including the main HTML files, XML, CSS, and RSS, may benefit from compression. The gain still depends largely on the content of the file; most standard HTML text files will compress by about half, sometimes more. Heavily formatted pages, for example those that make heavy use of tables (and therefore contain repetitive formatting content), may compress even further, sometimes to as little as one-third of the original size.
Fortunately, most HTTP servers let you select which types of files are compressed, so the cost of trying to compress incompressible data is limited.
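A quick experiment with Python's standard gzip module illustrates the difference. This is only a sketch: the table markup is made up, and random bytes are used as a stand-in for already-compressed data such as a JPEG or Zip file, so the exact ratios will differ from those of a real site.

```python
import gzip
import os


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(payload)) / len(payload)


# Hypothetical table-heavy page: the same row markup repeated many times.
rows = "".join(
    f"<tr><td class='name'>Item {i}</td><td class='price'>{i}.00</td></tr>\n"
    for i in range(500)
)
html_page = f"<html><body><table>\n{rows}</table></body></html>".encode("utf-8")

# Random bytes stand in for already-compressed content (an assumption for the demo).
opaque_blob = os.urandom(len(html_page))

html_ratio = compression_ratio(html_page)
blob_ratio = compression_ratio(opaque_blob)
print(f"HTML: {html_ratio:.0%} of original, opaque data: {blob_ratio:.0%} of original")
```

The repetitive markup shrinks dramatically, while the opaque data does not shrink at all and may even grow slightly because of the gzip header.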
Enabling HTTP Compression
HTTP Compression is a function of the server, but the browser automatically supports it without any additional configuration on the client. To start gaining the benefits of compressed content, simply enable compression on the server.
The way to do this differs among Apache, IIS 6, and previous versions of IIS.
Apache 2.0 comes with mod_deflate, which adds a filter that compresses content in the Gzip format (don't let the term "deflate" fool you). Filters can be blanket, in which case everything is compressed, or selective, compressing only specific MIME types (determined by examining the Content-Type header generated either automatically by Apache or by a CGI or other dynamic component).
To enable blanket compression, add the SetOutputFilter directive to a Web site or Directory container, for example by placing the line SetOutputFilter DEFLATE inside a <Directory> block.
To enable compression on specific MIME types, use the AddOutputFilterByType directive, for example:
AddOutputFilterByType DEFLATE text/html
Note that this compresses all output with this MIME type, so if CGI code or other applications (e.g., Tomcat or mod_perl) generate the right Content-Type header as part of their operation, their output will be compressed as well.
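Conceptually, selecting by type comes down to a check against the response's Content-Type header. The Python sketch below illustrates that decision; it is not Apache's implementation, and the function name should_compress and the list of types are assumptions chosen for the example.

```python
# Hypothetical allowlist of media types worth compressing.
COMPRESSIBLE_TYPES = {
    "text/html",
    "text/plain",
    "text/css",
    "text/xml",
    "application/xml",
}


def should_compress(content_type_header: str) -> bool:
    """Decide from a Content-Type header value whether a response is worth compressing."""
    # Drop parameters such as "; charset=utf-8" and normalize case before comparing.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type in COMPRESSIBLE_TYPES


print(should_compress("text/html; charset=UTF-8"))
print(should_compress("image/jpeg"))
```

Because the check looks only at the header, it treats static files and dynamically generated output the same way, which is why correctly labeled CGI output is compressed too.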
Some browsers, particularly older ones, may not work correctly with compressed content. You can disable compression for specific browsers with the BrowserMatch directive, which can set the no-gzip environment variable for matching user agents. Check the mod_deflate documentation for more information.
IIS 6 includes a native compression system that is easy to use and deploy. Because it's built in, rather than operated by an ISAPI filter, it is very fast and has basically put the commercial alternatives previously available for IIS 5 and earlier out of business. The compression system can be configured to compress both static content and dynamic content (i.e., scripted output). IIS 6 also caches the compressed information in a directory, which helps improve the performance for both static and script-based responses by eliminating the need to recompress content that has already been compressed.
To enable HTTP compression in IIS 6, open the Web site's property page to edit the global properties for the site. Change to the Service tab, and configure the options within the HTTP Compression section.
Figure 1 shows a sample of this window.
Figure 1. Setting HTTP Compression in IIS 6
Cached files are stored in the Temporary Directory. By default, this is a suitably named directory within the IIS metadata directory. The directory selected must be on an NTFS partition. You can limit the size of the cache or leave it unlimited; we recommend setting a limit of about twice the maximum size of the site (including the data it might generate from a scripted page).
IIS 5 and Earlier
There is no built-in compression for IIS versions prior to version 6, but ISAPI filters are available. All of these are understandably slower than the built-in facility in IIS 6. We recommend using the commercial ISAPI filter ZipEnable from Port80 Software.
Others are available, but ZipEnable is one of the few packages Microsoft specifically recommends. It is also compatible with IIS 6 and can further control the compression on a directory-by-directory level by editing the IIS 6 metabase configuration for you.
Effects on Server Performance
Compressing content requires a certain amount of CPU time. This can have a detrimental effect on the site because each selected object must be compressed before it is sent. This is one area where IIS 6 has a leg up on Apache: it keeps compressed files in a cache directory, reducing the CPU load for frequently accessed pages. What it doesn't do is completely eliminate the need for inline compression. Heavily dynamic content must still be compressed on the fly, and the amount of cache storage may be limited.
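The idea of caching compressed copies of static files can be sketched in Python as follows. This is an illustration only and not how IIS 6 implements its cache; the class name CompressedFileCache, the file-naming scheme, and the sample page are all assumptions, and the sketch neither enforces a size limit nor removes stale entries.

```python
import gzip
import hashlib
import os
import tempfile


class CompressedFileCache:
    """Keep gzip copies of static files in a directory, keyed by path and modification time."""

    def __init__(self, cache_dir: str):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)
        self.compressions = 0  # counts how often a file actually had to be compressed

    def _cache_path(self, source_path: str) -> str:
        # A changed file gets a new mtime, and therefore a new cache entry.
        mtime = os.stat(source_path).st_mtime_ns
        digest = hashlib.sha256(os.path.abspath(source_path).encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, f"{digest}-{mtime}.gz")

    def get(self, source_path: str) -> bytes:
        cached = self._cache_path(source_path)
        if os.path.exists(cached):
            with open(cached, "rb") as fh:
                return fh.read()  # cache hit: no CPU spent on compression
        with open(source_path, "rb") as fh:
            body = gzip.compress(fh.read())
        with open(cached, "wb") as fh:
            fh.write(body)
        self.compressions += 1
        return body


# Demo with a throwaway directory and a made-up page.
work_dir = tempfile.mkdtemp()
page = os.path.join(work_dir, "index.html")
with open(page, "wb") as fh:
    fh.write(b"<html>" + b"<p>hello</p>" * 1000 + b"</html>")

cache = CompressedFileCache(os.path.join(work_dir, "cache"))
first = cache.get(page)
second = cache.get(page)  # served from the cache directory
print(f"compressed {cache.compressions} time(s) for 2 requests")
```

Dynamic responses have no stable file to key on, which is why they still have to be compressed on each request.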
To be honest, it's unlikely that the extra load on any text-heavy Web site will outweigh the benefit obtained from compression. Savings of 50 percent in bandwidth deliver a significant site performance increase while sacrificing less than 10 percent (and often less than 1 percent) of CPU time. That's more than worth it, especially for enterprises paying by the megabyte for transfers.
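To get a rough feel for this trade-off on your own content, you could time a single compression pass and compare it with the bytes saved. The sketch below uses Python's standard gzip module; the function name measure_gzip and the sample payload are hypothetical, and the timings will vary from machine to machine.

```python
import gzip
import time


def measure_gzip(payload: bytes, level: int = 6):
    """Return (bytes_saved, cpu_seconds) for compressing payload once."""
    start = time.process_time()
    compressed = gzip.compress(payload, compresslevel=level)
    cpu_seconds = time.process_time() - start
    return len(payload) - len(compressed), cpu_seconds


# Made-up text-heavy payload; substitute a real page from your site.
sample = b"<p>The quick brown fox jumps over the lazy dog.</p>\n" * 2000
saved, cpu = measure_gzip(sample)
print(f"saved {saved} of {len(sample)} bytes using {cpu:.4f} s of CPU time")
```

Running it against representative pages, rather than synthetic text, gives a more honest picture of what compression would cost and save on a given server.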
Browser Support and Dynamic Content
Today, most browsers support some kind of compression, but the exact type of compression supported is browser dependent. This is not an issue you need to worry about; the Web server will send compressed documents only if the browser indicates it supports them. It is, however, worth examining the mechanism and looking at some of the supported compression types of different browsers.
A browser supplies, as part of its HTTP request, the compression formats it supports through the Accept-Encoding header. Apache (and other servers) make this information available through the HTTP_ACCEPT_ENCODING environment variable. Again, you don't actually need to do anything; Apache will automatically encode content appropriately if the header shows that the browser can accept it.
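The negotiation can be illustrated with a simplified Python sketch. It is not the logic of Apache or any other server: the function names parse_accept_encoding and choose_encoding are hypothetical, the server is assumed to support only gzip and deflate, and finer points of the header rules (such as a client explicitly refusing uncompressed content) are left out.

```python
def parse_accept_encoding(header: str) -> dict:
    """Map each encoding named in an Accept-Encoding value to its q-value (default 1.0)."""
    preferences = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        quality = 1.0
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not acceptable"
        preferences[name.strip().lower()] = quality
    return preferences


def choose_encoding(header: str, server_supports=("gzip", "deflate")) -> str:
    """Pick the client's most preferred encoding that the server supports, else "identity"."""
    preferences = parse_accept_encoding(header)
    best, best_q = "identity", 0.0
    for name in server_supports:  # earlier entries win ties
        q = preferences.get(name, preferences.get("*", 0.0))
        if q > best_q:
            best, best_q = name, q
    return best


print(choose_encoding("gzip, deflate"))
print(choose_encoding("deflate;q=1.0, gzip;q=0.5"))
print(choose_encoding(""))
```

A client that sends no Accept-Encoding header, or lists only encodings the server does not offer, simply receives the uncompressed ("identity") response.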
The table below lists various browsers and the encodings they support.
Browser Support for Compression Encodings
|Firefox 1.0, Mozilla 1.x, Camino
||bzip2, gzip, deflate|
"Identity" is sometimes listed as a type. This means the browser supports uncompressed content (which is basically implied, so not all browsers explicitly state this).
As the list indicates, modern browsers support compression. Thus, clients don't need to do anything to actually use compressed content; the browser automatically supplies its supported encodings when it makes a request. So, to get a speed improvement, you need to configure only the server.
Using HTTP compression is a very simple way to improve site performance and decrease bandwidth with very little effort. There are potential downsides due to the additional CPU overhead required to support it, but they are relatively minor tradeoffs in comparison to the potential benefits.
And if it doesn't result in significant improvements? Disabling it is just as simple because the content of the site has not been modified, only the way the content is transferred.
This article was originally published on Thursday Jun 23rd 2005