So you've set up your server and users are accessing your Web site; the last thing you want is performance problems with the site. If only you could have tested for them before going live. With Flood, a component of the Apache HTTP Project's HTTPD-Test suite (so named because it floods an HTTP server with requests to test its response times), you can. Martin Brown explains how to install and configure Flood, and offers some real-world scenarios.
Once you've set up your server and users are accessing your Web site, the last thing you want to hear about is performance problems with the site. You can test the system manually, but manual testing has its limitations.
One major downside of manual testing (aside from the time investment) is that it doesn't reveal where the real problem with the site lies. Is it a configuration problem with the server, a problem with some dynamic elements, or a more fundamental network issue?
The Apache HTTP Project includes a sub-project called HTTPD-Test. As the name
suggests, it's a test suite for Apache and HTTP in general. The suite contains a number of different elements, and this article will focus on the one known as Flood.
Flood is so named because it is used to flood an HTTP server with requests to test its response times.
Flood uses an XML document with the necessary settings -- URLs and optional POST
data -- to send requests to a given server or range of servers. Flood then measures the time it takes to:
- Open the socket to the server
- Write the request
- Read the response
- Close the socket
With these four criteria being measured, administrators can identify whether the problem is with the Apache configuration (or any other HTTP server), the sheer load and performance of the hardware, or a network bottleneck.
You can download the packages required for building -- httpd-test and apr/apr-util -- from the Apache CVS server. You'll need to log in first, though, using the anonymous CVS password:
$ cvs -d :pserver:anoncvs@cvs.apache.org:/home/cvspublic login
$ cvs -d :pserver:anoncvs@cvs.apache.org:/home/cvspublic co httpd-test/flood
$ cd httpd-test/flood
$ cvs -d :pserver:anoncvs@cvs.apache.org:/home/cvspublic co apr
$ cvs -d :pserver:anoncvs@cvs.apache.org:/home/cvspublic co apr-util
Once you have the source, you must build and compile the application. A fresh CVS checkout typically needs its configure script generated and run before building:
$ ./buildconf
$ ./configure
$ make all
You're now ready to go! Flood takes the XML configuration file as its argument and writes its results to standard output, so redirect the output for later analysis:
$ ./flood examples/round-robin.xml > report.out
Flood is configured through an XML file that defines the parameters for testing the Web site. When testing, Flood uses a profile, which defines how a given list of URLs is accessed. Requests are generated by one or more farmers that are, in turn, members of one or more farms. You can see this more clearly in the illustration below.
As illustrated in the graphic, we have one farm, which specifies two sets of five farmers. Farmer Joe
uses ProfileA and a list of five URLs, and farmer Bob uses ProfileB with a list of three URLs. The farmers request the URLs directly from the Web server. Flood uses threads to create the farmers, and then collates the information the farmers collect into a single data file for later processing.
The XML file contains definitions for four main elements: the URL lists, profiles, farmers and farms.
The URL list is just that: a list of URLs to be accessed. URLs can be straight requests or specific request types (GET, HEAD, and POST are supported, as is the capability to supply data accordingly for dynamically driven sites).
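As a rough sketch, a URL list might look like the following. The element names here follow the round-robin.xml example shipped with Flood, and the hostnames and POST payload are hypothetical placeholders; check your own copy, as the schema may differ between versions:

```xml
<urllist>
  <!-- The name is referenced later by a profile's useurllist element -->
  <name>Test Hosts</name>
  <description>Pages on the test server</description>
  <url>http://test.example.com/index.html</url>
  <!-- POST requests can carry form data alongside the URL -->
  <url method="POST" payload="query=test">http://test.example.com/search</url>
</urllist>
```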
Profiles define which URL list to use, how they should be accessed, what type of socket to use, and how the information should be reported.
Farmers are responsible for the actual request process. The only configurable elements are the profile to use and the number of times to process the profile. Profiles are executed sequentially by each farmer but can be repeated, so you would end up accessing, for example, urla, urlb, urla, urlb, and so on.
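A profile and farmer pair, sketched under the same assumptions (all names are illustrative, and the exact element set may vary with your Flood version), might look like:

```xml
<profile>
  <name>SimpleProfile</name>
  <description>Walk the URL list in order</description>
  <useurllist>Test Hosts</useurllist>
  <profiletype>round_robin</profiletype>
  <socket>generic</socket>
  <report>relative_times</report>
</profile>

<farmer>
  <name>Joe</name>
  <!-- Process the profile 5 times: Joe walks the URL list 5 times over -->
  <count>5</count>
  <useprofile>SimpleProfile</useprofile>
</farmer>
```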
Farms specify the number of farmers to create and when. By increasing the number of
farmers created by a farm, the number of simultaneous requests is increased. Additional
settings enable you to create a number of initial farmers, and then increase that number at regular intervals. For example, you could initially create two farmers, then add a new farmer every five seconds up to a maximum of 20. Depending on your URL list and server performance, this could result in a slow rise to 20 simultaneous accesses for a given period, and then a slow fall back to zero. Alternatively, it could give the effect of a regular number of users accessing a set number of pages for a longer duration, with peaks of five or six simultaneous requests.
Note: The current version of Flood supports only one Farm, and it must be called 'Bingo'. You can, however, specify multiple farmer definitions within the single farm, which achieves the same basic effect.
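A hedged sketch of a farm definition with the staggered start described above (the usefarmer attributes shown here follow the bundled examples; verify them against your version):

```xml
<farm>
  <!-- The current version requires the farm to be named Bingo -->
  <name>Bingo</name>
  <!-- Start 2 farmers, add another every 5 seconds, up to 20 in total -->
  <usefarmer count="20" startcount="2" startdelay="5">Joe</usefarmer>
</farm>
```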
By tuning the farm, farmer, and URL list parameters you can control the number of
requests, simultaneous requests, overall duration (as a function of the URL list, repeat
count and number of farmers) and how the requests are spread over the duration of the
test. This allows you to very specifically test for different situations.
The three basic (and rough) rules of configuration with Flood to remember are:
- The URL list defines what your farmers visit.
- The repeat count for a farmer defines a number of users accessing your site.
- The farmer count for a farm defines the number of simultaneous users.
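Putting the three rules together, a minimal configuration, sketched under the same assumptions about the schema (hostnames and numbers are placeholders), could look like this:

```xml
<flood configversion="1">
  <urllist>
    <name>Pages</name>
    <description>What the farmers visit</description>
    <url>http://test.example.com/</url>
  </urllist>
  <profile>
    <name>Basic</name>
    <description>Walk the URL list in order</description>
    <useurllist>Pages</useurllist>
    <profiletype>round_robin</profiletype>
    <socket>generic</socket>
    <report>relative_times</report>
  </profile>
  <farmer>
    <name>Joe</name>
    <!-- Repeat count: each farmer behaves like roughly 10 users -->
    <count>10</count>
    <useprofile>Basic</useprofile>
  </farmer>
  <farm>
    <name>Bingo</name>
    <!-- Farmer count: 5 simultaneous users -->
    <usefarmer count="5">Joe</usefarmer>
  </farm>
  <seed>1</seed>
</flood>
```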
A sample configuration for Flood is in the examples folder of the distribution; round-robin.xml is probably the easiest one to start with. This article, however, will not discuss the specifics of editing the XML, or even of processing the data file generated.
Instead, we will examine how to tune the parameters to test different types of Web
sites. To help understand the implications of the next section, here's a quick look at the results of the analyze-relative script from the examples directory. In this case it shows the results of a test on an internal server:
Slowest pages on average (worst 5):
Average times (sec)
connect write read close hits URL
0.0022 0.0034 0.0268 0.0280 100 http://test.mcslp.pri/java.html
0.0020 0.0028 0.0183 0.0190 700 http://www.mcslp.pri/
0.0019 0.0033 0.0109 0.0120 100 http://test.mcslp.pri/random.html
0.0022 0.0031 0.0089 0.0107 100 http://test.mcslp.pri/testr.html
0.0019 0.0029 0.0087 0.0096 100 http://test.mcslp.pri/index.html
Requests: 1200 Time: 0.14 Req/Sec: 9454.08
From these results you can see the average connect, write (request), read (response), and close times for a single page. You also get a basic idea of the number of requests handled per second by the server.
Testing 'News' Style Web Sites
The majority of news-style Web sites -- New York Times, Slashdot,
ServerWatch, and even many blog sites -- have a main index page, which everybody
accesses and which contains the links to the main stories, plus one or more
'story' pages that a user visits when a story seems interesting enough to read in full.
In general, this results in a fairly steady stream of people hitting the main page and a variable number hitting specific other pages. If a site publishes an RSS/RDF feed it will also see a fair number of accesses directly to story pages without first viewing the home page. Nearly all such sites use dynamic elements, so Flood is also a good way to test the dynamic performance of your site, especially if you can compare it with a raw HTML-based response.
You can simulate news-style requests through Flood by defining several sets of farmers, each with its own URL list: one set requesting the home page plus three stories, one requesting three story pages, and one requesting the home page only. A further set requesting stories only simulates RSS readers who come straight to the story pages.
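One hedged way to express separate farmer sets in the single farm, assuming each farmer has been defined elsewhere with its own profile and URL list (the names and counts here are illustrative only):

```xml
<farm>
  <name>Bingo</name>
  <!-- Each usefarmer entry creates a set of identical farmers -->
  <usefarmer count="10">HomeAndStories</usefarmer>
  <usefarmer count="5">StoriesOnly</usefarmer>
  <usefarmer count="8">HomeOnly</usefarmer>
</farm>
```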
Testing Shopping Sites
Shopping sites, online product catalogs, and other more interactive sites have a different usage profile. Although some people will hit your home page, some will come in directly to another page within your site. Most users will also spend more time browsing around the site -- they'll look at a product page for a number of products, perhaps do some searches, and even click through to other related or similar products.
Thus, you should test with a higher number of URLs in the list, larger repeat counts (to simulate a larger number of users), and lower simultaneous access than with a news site.
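Under the same schema assumptions as the earlier sketches, a shopping-site test might pair a long URL list with a high repeat count and a modest farmer count (all names and numbers here are placeholders):

```xml
<farmer>
  <name>Shopper</name>
  <!-- High repeat count: each farmer behaves like many browsing sessions -->
  <count>50</count>
  <useprofile>CatalogBrowse</useprofile>
</farmer>
<farm>
  <name>Bingo</name>
  <!-- Fewer simultaneous users than a news-site test -->
  <usefarmer count="3">Shopper</usefarmer>
</farm>
```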
Testing the "Slashdot" Effect
Occasionally, a situation will arise where you must check how your system copes
with thousands of users trying to access the site at the same time. Many sites have
already experienced the "slashdot" effect, whereby a mention of a particular page on the
Slashdot Web site (www.slashdot.org) results in a huge number of simultaneous requests.
Typically these requests are for only one page, and we can test for that with Flood by creating hundreds, or thousands, of farmers simultaneously accessing the server for just that one page. To simulate the rapid sequential access by a number of readers over a period of time, set a high repeat count and use the delay system to ramp up the requests to their high point.
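The ramp-up described above can be sketched as a farm definition (the attribute names follow the bundled examples, and the numbers are illustrative; a real test should stay within the client machine's thread and socket limits):

```xml
<farm>
  <name>Bingo</name>
  <!-- Start 10 farmers, add one every second, peaking at 1000
       simultaneous requests for the single hot page -->
  <usefarmer count="1000" startcount="10" startdelay="1">SlashdotReader</usefarmer>
</farm>
```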
Tips for Testing
For any of these tests to work properly, you must keep a few things in mind:
- First and foremost, don't run Flood from the same machine as your HTTP server.
If you do this you're only testing a single machine's capability to open network
sockets and communicate with itself. Test from another machine on the network.
- Keep in mind the technical limits of the machine you test from. Your machine is
capable of only so many threads and network sockets; trying to create too many
farmers may result in inefficient and misleading testing.
- Flood is a client-side solution, so nothing is stopping you from executing the
tests simultaneously on multiple machines. In fact, we recommend it, as that's the only
way to reliably flood test a server to the point of saturation before reaching the
clients' own limits.
- Flood relies on the client host for performance. If the client test machine is
busy, Flood will face just as many hurdles as any other application in terms
of processing requests. In dedicated Web farm environments I've set up a single
machine whose sole responsibility is testing. If you are using a machine normally
devoted to other tasks, shut those tasks down before starting Flood.
Future articles will examine ways to summarize the report information and how to test
more complex sites and servers.