Apache 2.0 Beta Delayed

Wednesday Dec 20th 2000 by Jeremy Reed

At the end of last month's article, I announced that the Apache Group was ready to release the first Beta of Apache 2.0. If you have been watching the news coming from the Apache Group, you have probably noticed that the beta has not been released yet. This month, I will discuss why the first beta was not released, what has changed since last month, and our current plans for the first beta.

What went wrong

All of the Apache 2.0 alphas have suffered from one problem: mod_include didn't work. Obviously this is a major problem, because many web sites rely on Server Side Includes to create dynamic web pages. In Apache 1.3 this module was implemented as a handler, meaning that SSIs had to be stored on disk. With Apache 2.0, it makes more sense to implement mod_include as a filter, which allows SSIs to come from files on disk, from the output of CGI scripts, or from the output of any other module. However, re-writing mod_include as a filter turned out to be a non-trivial problem. After weeks of work, mod_include has been completed, and it now works as a filter.

While waiting for mod_include, we also discovered that piped logs, notably reliable piped logs, had a major problem. We were checking to determine if the pipe between Apache and the logging program was still writable. If not, Apache assumed the logging process had died, and restarted it. The problem is that operating systems will report that the pipe is unwritable in multiple situations, and only one of those is that the process has actually stopped reading. For example, we were seeing Apache stop and restart the logging process if the pipe was full. This means that Apache was stopping and restarting the logging process exactly when that process was at its busiest.
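The ambiguity is easy to reproduce outside of Apache. The following is a minimal Python sketch, not Apache's C code, and it assumes a POSIX system; the helper names (is_writable, fill_pipe, drain_pipe, demo) are made up for illustration. It fills a pipe whose reader is alive but momentarily not reading, and shows that the pipe is reported as unwritable until the reader catches up.

```python
import os
import select


def is_writable(fd):
    """Return True if select() reports fd as writable right now (zero timeout)."""
    _, writable, _ = select.select([], [fd], [], 0)
    return bool(writable)


def fill_pipe(write_fd, chunk=b"x" * 4096):
    """Write to a non-blocking pipe until the kernel refuses more data."""
    total = 0
    while True:
        try:
            total += os.write(write_fd, chunk)
        except BlockingIOError:
            return total


def drain_pipe(read_fd):
    """Read everything currently buffered in a non-blocking pipe."""
    total = 0
    while True:
        try:
            data = os.read(read_fd, 65536)
        except BlockingIOError:
            return total
        if not data:
            return total
        total += len(data)


def demo():
    """Return writability of the pipe when empty, when full, and after draining."""
    read_fd, write_fd = os.pipe()
    try:
        os.set_blocking(read_fd, False)
        os.set_blocking(write_fd, False)
        when_empty = is_writable(write_fd)
        fill_pipe(write_fd)
        # The read end is still open; the reader simply has not caught up yet.
        when_full = is_writable(write_fd)
        drain_pipe(read_fd)
        after_drain = is_writable(write_fd)
        return when_empty, when_full, after_drain
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(demo())
```

In this sketch the read end is never closed, so "unwritable" here only means "full". A check based purely on writability cannot tell that case apart from a reader that has stopped for good.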

What has changed

mod_include

Because the Apache Group had to wait to release the first beta, we decided to make some major improvements in the interim. The first major improvement was to mod_include. In Apache 1.3, anybody who wanted to add an SSI tag had to modify mod_include directly. This was most obvious for two tags in mod_include: the PERL tag and the CGI tag. The PERL tag was a hack to allow Perl code inside of Server Side Includes. The biggest problem with this hack is that it was only available if you also had mod_perl installed. This meant that the features available in mod_include relied on mod_perl, but that link was not obvious. The CGI tag is meant to allow a Server Side Include to specify a CGI script to run. The problem with this is that it requires duplicating a lot of logic between mod_include and mod_cgi.

For example, all of the SuExec logic was duplicated between both modules, as was the RLIMIT logic. This problem became even bigger in Apache 2.0, because there are two CGI modules, mod_cgi and mod_cgid. This means that the same logic is in three places instead of just two. This also creates a configuration problem. Many web sites refuse to allow CGI scripts, because they pose a potential security hole. The same site may want to provide SSIs for simple dynamic pages. Those sites obviously do not include mod_cgi in their installation, but by adding mod_include it is possible to execute CGI scripts if the configuration isn't correct.

This problem is solved by having mod_include implement only the bare minimum of SSI tags. Mod_include also implements a hook to allow other modules to extend its abilities. For example, in Apache 2.0 mod_include does not implement the perl tag. If mod_perl wishes to implement this tag, then it is free to do so. In the near future, mod_include will also stop implementing the CGI tag, as that tag will move to mod_cgi and mod_cgid. This will remove the possibility that a site that doesn't want CGI scripts will accidentally enable them with Server Side Includes. If mod_cgi is not included in the server, then all CGI abilities will be removed.

Modules are able to extend mod_include's abilities by calling ap_register_include_handler. This function accepts a character string, which represents the tag that mod_include is looking for, and a function that implements the tag. Mod_include reads the input passed to it, and parses it searching for the "<!--#" sequence that indicates the beginning of an SSI element. It then parses that SSI element to find the actual SSI tag. Once that tag is found, it calls the function that has been registered for that tag. This allows mod_include to implement only the bare minimum, while allowing for a very powerful parsing language. This also allows the Apache developers to combine all of the common code into a single location, making Apache easier to maintain in the future.
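The registration idea can be modeled in a few lines. The sketch below is a simplified Python illustration, not Apache's C API: the registry, the regular expressions, the echo_handler function, and the example.org value are all hypothetical, and it assumes SSI elements of the form <!--#tag attr="value" -->. It is only meant to show how a parser that knows nothing about individual tags can dispatch to whatever handlers other modules have registered.

```python
import re

# Hypothetical registry: SSI tag name -> function implementing that tag.
_handlers = {}


def register_include_handler(tag, func):
    """Model of the registration call: associate a tag name with a handler."""
    _handlers[tag] = func


# Assumed element syntax: <!--#tag attr="value" ... -->
_ELEMENT = re.compile(r'<!--#(\w+)(.*?)-->', re.DOTALL)
_ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')

# Placeholder text emitted for tags nobody has registered (an assumption here).
ERROR_TEXT = "[an error occurred while processing this directive]"


def process(text):
    """Replace every SSI element in text with the output of its handler."""
    def dispatch(match):
        tag = match.group(1)
        attrs = dict(_ATTRIBUTE.findall(match.group(2)))
        handler = _handlers.get(tag)
        if handler is None:
            # No module registered this tag, so the feature is simply absent.
            return ERROR_TEXT
        return handler(attrs)

    return _ELEMENT.sub(dispatch, text)


def echo_handler(attrs):
    """Illustrative handler: look up a variable named by the 'var' attribute."""
    variables = {"SERVER_NAME": "example.org"}  # made-up data for the sketch
    return variables.get(attrs.get("var", ""), "(none)")


register_include_handler("echo", echo_handler)

if __name__ == "__main__":
    print(process('Served by <!--#echo var="SERVER_NAME" -->'))
```

Because the parser only consults the registry, a tag whose module is not loaded has no handler at all, which mirrors the point that CGI abilities disappear when mod_cgi is left out of the server.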

Reliable Piped logs

The second major change was to fix the reliable piped log problem. In the past, if Apache determined that the pipe was unwritable, then the process was killed and re-spawned. This was meant to ensure that the program was always available to log the output from Apache. The problem is that at times the pipe is reported as being unwritable when it is just full. If the pipe is full, then the logging program should be using all of its resources to read from the pipe and write the logs to disk. If Apache kills the logging program and restarts it, then some of the logs will be lost, and even worse, the logging program will be busy restarting instead of continuing to process the information it has.

This problem has been solved by not restarting the logging process if the pipe becomes unwritable. The Apache developers decided that if the logging program actually stops responding, then the administrator will need to be responsible for restarting the server, and thus the logging program. The reliable piped logs will still restart the logging process if the program dies for some unexpected reason. The only change to reliable piped logs is that we no longer try to detect a logging program that has just stopped responding.
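The revised policy amounts to "restart only when the child has actually exited". A rough Python sketch of that rule follows; it is not how Apache implements it, and the PipedLogger class, its method names, and the restarts counter are hypothetical names chosen for illustration.

```python
import subprocess
import sys


class PipedLogger:
    """Hypothetical supervisor that feeds a logging child through a pipe."""

    def __init__(self, command):
        self.command = command
        self.restarts = 0
        self.proc = self._spawn()

    def _spawn(self):
        # The child reads log lines from its standard input.
        return subprocess.Popen(self.command, stdin=subprocess.PIPE)

    def ensure_running(self):
        """Respawn the child only if it has exited; ignore pipe writability."""
        # poll() returns None while the child is alive, its exit code otherwise.
        if self.proc.poll() is not None:
            try:
                self.proc.stdin.close()
            except OSError:
                pass  # the old pipe may already be broken
            self.proc = self._spawn()
            self.restarts += 1
        return self.proc

    def close(self):
        """Close the pipe so the child sees end-of-file, then wait for it."""
        self.proc.stdin.close()
        self.proc.wait()


if __name__ == "__main__":
    # Stand-in logging program: read standard input until end-of-file.
    logger = PipedLogger([sys.executable, "-c", "import sys; sys.stdin.read()"])
    logger.ensure_running()  # child is alive, so nothing happens
    logger.close()
```

A child that is alive but slow is left alone under this rule, so a full pipe never triggers a restart. The cost, as noted above, is that a child that hangs without exiting is not detected.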

Source Code Re-Organized

The final major change is that the source code repository has been completely revamped. In the past, the Apache developers have tried to provide a single source tree that was also used for binary installations. Apache 2.0's source tree was laid out exactly like the 1.3 tree, but that layout no longer makes as much sense. In Apache 1.3 the source tree was geared towards building a web server, but Apache 2.0 is a server framework with an HTTP module. The new source tree emphasizes the new focus of the Apache developers. Although the protocol independence is not perfect yet, the new layout forces the developers to think about that abstraction.

The new layout also introduces a new Apache Portable Run-time project, apr-util. This is a set of utilities that the Apache developers believe are useful outside of a web server. These functions are portable to every platform that Apache 2.0 runs on, but they do not exist to solve portability problems. Functions in this library do not abstract out operating system issues; rather, they implement useful functions that many different types of programs may find useful. For example, the buckets functions that help to implement filtering are in this library.

Re-organizing the source tree has required the Apache developers to delay the beta for a few weeks, so that all of the problems could be worked out of this tree before we tried to release a working server based on it. One alpha release, alpha 9, has already been released based on this tree, which has allowed the developers to find most, if not all, of the issues related to the new source tree. The current Apache 2.0 development tree is no longer stored in the apache-2.0 CVS repository. That repository holds the old 2.0 tree; the new tree is stored in httpd-2.0. The instructions for checking out the current Apache 2.0 development tree can be found at the Apache development site.

Where are we going from here

The Apache Group has decided to try for the first Beta on December 22nd. There is no guarantee that this will actually happen, however, because there are still many changes being made to the Apache 2.0 tree. One of the changes that the developers are still working on is adding IPv6 support to Apache 2.0 for platforms that support it. The developers have made great progress on this front, and ApacheBench, a simple benchmarking utility that is distributed with Apache, currently supports IPv6. In fact, IPv6 support is being added to APR, so that any APR program will be capable of supporting IPv6 easily.

There are a couple of other issues that the Apache developers will need to resolve before Apache 2.0 can be released as a beta, but the developers are confident that those can be resolved before the 22nd. If the beta does not roll on the 22nd, then you can be sure that the developers will be working hard to get it out the door as soon as they can. However, the Apache developers are very strong believers that we cannot release Apache 2.0 before it is ready. Doing so would be a disservice to the people who entrust their web sites to Apache.
