Filtering I/O in Apache 2.0: Part 2

Monday Oct 23rd 2000 by Ryan Bloom

Ryan Bloom continues his exploration of filtered I/O by passing along a simple filter to add a header and/or footer to every page that the server sends.

Last month, I reviewed some of the basic concepts surrounding filtering in Apache 2.0. This month, I will continue to explore filtered I/O by writing a simple filter to add a header and/or footer to every page that the server sends. This filter will not explore all of the power of the filtered design in 2.0, but it is a good start.

Rather than just review the filtering code, I am going to go over the entire module, so that we can all see how everything fits together. The idea behind this filter is that many sites want to add a small piece of text to the top or bottom of every page that is served. With Apache 1.3, this is very difficult to do, but with filtering it becomes very simple. This module offers two methods for adding text to the top and bottom of each page. The first is to use a file from the disk, and the second is to specify a text string that will be inserted directly into the page.

This module starts with a module-specific data structure that stores the names of the header and footer files, along with the header and footer text:

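typedef struct header_footer_rec {
    const char *headerf;    /* name of the header file */
    const char *headert;    /* header text string */
    const char *footerf;    /* name of the footer file */
    const char *footert;    /* footer text string */
} header_footer_rec;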

This structure stores the names of the files and the text that are to be added to each page. For simplicity, this module will allow people to add both text and a file to the same page. The next step in creating this module is to allow administrators to define the files and strings to be added. This is done with a command table and four functions, as follows:

static const char *add_header_file(cmd_parms *cmd, void *dummy, const char *arg)
{
    header_footer_rec *d = dummy;
 
    d->headerf = apr_pstrdup(cmd->pool, arg);
    return NULL;
}
 
static const char *add_footer_file(cmd_parms *cmd, void *dummy, const char *arg)
{
    header_footer_rec *d = dummy;
 
    d->footerf = apr_pstrdup(cmd->pool, arg);
    return NULL;
}
 
static const char *add_header_text(cmd_parms *cmd, void *dummy, const char *arg)
{
    header_footer_rec *d = dummy;
 
    d->headert = apr_pstrdup(cmd->pool, arg);
    return NULL;
}
 
static const char *add_footer_text(cmd_parms *cmd, void *dummy, const char *arg)
{
    header_footer_rec *d = dummy;
 
    d->footert = apr_pstrdup(cmd->pool, arg);
    return NULL;
}
 
static const command_rec dir_cmds[] =
{
    /* The allowed contexts are bit flags, so they are combined with
     * bitwise OR (|), not logical OR (||). */
    AP_INIT_TAKE1("FooterFile", add_footer_file, NULL, ACCESS_CONF | OR_FILEINFO,
                  "a file name"),
    AP_INIT_TAKE1("HeaderFile", add_header_file, NULL, ACCESS_CONF | OR_FILEINFO,
                  "a file name"),
    AP_INIT_TAKE1("FooterText", add_footer_text, NULL, ACCESS_CONF | OR_FILEINFO,
                  "a text string"),
    AP_INIT_TAKE1("HeaderText", add_header_text, NULL, ACCESS_CONF | OR_FILEINFO,
                  "a text string"),
    {NULL}
};
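
The fourth argument of each table entry is a bit mask that says where the directive may appear. As a stand-alone illustration of how such masks behave, here is a minimal sketch. The flag values and the toy_ names are invented for this example and are not Apache's definitions. It shows that bitwise OR keeps both flags, while logical OR collapses them to the single value 1.

```c
#include <assert.h>

/* Hypothetical flag values for illustration only; Apache defines its own. */
enum {
    TOY_ACCESS_CONF = 0x40,
    TOY_OR_FILEINFO = 0x02
};

/* Bitwise OR: every bit set in either operand is set in the result. */
static unsigned toy_bitwise_or(unsigned a, unsigned b)
{
    return a | b;
}

/* Logical OR: the result is 1 if either operand is nonzero, else 0. */
static unsigned toy_logical_or(unsigned a, unsigned b)
{
    return a || b;
}

/* Returns nonzero when every bit in `required` is set in `mask`. */
static int toy_allows(unsigned mask, unsigned required)
{
    assert(required != 0);
    return (mask & required) == required;
}
```

With these toy values, the bitwise combination is 0x42 and still reports both flags as allowed, while the logical combination is 1 and reports neither.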

That's all of the bookkeeping we need to do in order to make the module work. Now, we can get down to the filter itself. The first step in creating the filter is to register the function. This is done in the register_hook phase, where we announce to Apache that there is a filter known as HEADERFOOTER that is a content filter and is implemented by the hf_filter function:

static void hf_register_hook(void)
{
    ap_register_output_filter("HEADERFOOTER", hf_filter, AP_FTYPE_CONTENT);
}

Now, we have to actually write the filter function. This is done in a few steps. The first part of the filter is the filter context structure. This structure will be stored in the ctx pointer in the ap_filter_t structure. In our module, this is:

typedef struct hf_struct {
    int state;
} hf_struct; 

The only field in this structure is a state field. This tells our filter whether it has already sent the configured headers. This filter is likely to be called multiple times for a single request, so we want to be sure that we don't send headers on any call other than the first. The footers are managed completely differently, but we will see that later. Most filters will also need a bucket brigade in this structure if they want to save any of the data they are passed for use in later calls. In our case, this filter is a simple pass-through. Any data we are passed is passed on to the next filter without any modifications.

The next step declares the filter and some variables that we are going to need:


static int hf_filter(ap_filter_t *f, ap_bucket_brigade *bb)
{
    hf_struct *ctx = f->ctx;
    header_footer_rec *conf;
    ap_bucket *e;
    conf = (header_footer_rec *) ap_get_module_config(f->r->per_dir_config,
                                                      &hf_module);
 
    if (ctx == NULL) {
        f->ctx = ctx = apr_pcalloc(f->r->pool, sizeof(*ctx));
    }                                                                              

The first variable is an instance of the filter's ctx structure that we just defined. This value is always stored in f->ctx, so we start by retrieving the ctx pointer left by the previous call to this filter. If we have never been called before, this field will be NULL and we will have to allocate memory for the structure. This is what we are doing in the last three lines of this section. The conf variable is the module-specific configuration for this request. This is where we have stored the names of the header and footer files as well as the configured header and footer text. We will use this structure when we determine what to send. Finally, the variable e is a pointer to a bucket. Remember from last month that all of the data is stored in buckets. We will use this variable to create the text buckets to pass to the next filter, and to examine the data that we are sent.
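
To see the context pattern in isolation, here is a minimal sketch that does not use Apache at all. The toy_ types and functions are invented for this example, and it uses calloc and free where the module relies on the request pool. It shows a context that is created lazily on the first call, and a state field that ensures the header is emitted only once across several calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for the filter and its context; these are not Apache types. */
typedef struct toy_ctx {
    int state;              /* 0 = header not yet sent, 1 = header sent */
} toy_ctx;

typedef struct toy_filter {
    void *ctx;              /* NULL until the first call, like f->ctx */
} toy_filter;

/*
 * Appends `chunk` to `out`, preceded by `header` on the first call only.
 * `out` must hold a NUL-terminated string in a buffer of `outsz` bytes.
 * Returns 0 on success, -1 on allocation failure or insufficient room.
 */
static int toy_filter_call(toy_filter *f, const char *header,
                           const char *chunk, char *out, size_t outsz)
{
    toy_ctx *ctx;
    size_t need;

    assert(f != NULL && header != NULL && chunk != NULL && out != NULL);

    ctx = f->ctx;
    if (ctx == NULL) {
        /* First call: create a zeroed context and remember it. */
        f->ctx = ctx = calloc(1, sizeof(*ctx));
        if (ctx == NULL) {
            return -1;
        }
    }

    need = strlen(out) + strlen(chunk) + 1;
    if (ctx->state == 0) {
        need += strlen(header);
    }
    if (need > outsz) {
        return -1;
    }

    if (ctx->state == 0) {
        ctx->state = 1;
        strcat(out, header);
    }
    strcat(out, chunk);
    return 0;
}

/* Releases the context; in the real module the request pool handles this. */
static void toy_filter_done(toy_filter *f)
{
    assert(f != NULL);
    free(f->ctx);
    f->ctx = NULL;
}
```

Calling toy_filter_call twice on the same toy_filter, starting from an empty buffer, produces the header followed by both chunks, with the header appearing only once.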

The next step is to send the header information on the first call to this filter:

    if (ctx->state == 0) {
        ctx->state = 1;
        if (conf->headerf) {
            request_rec *rr;
 
            rr = ap_sub_req_lookup_uri(conf->headerf, f->r);
            if (rr->status == HTTP_OK) {
                ap_run_sub_req(rr);
            }
        }
        if (conf->headert) {
            e = ap_bucket_create_immortal(conf->headert, strlen(conf->headert));
            AP_BRIGADE_INSERT_HEAD(bb, e);
        }
    }                                                                              

The first if statement checks to ensure that we haven't already sent the headers. If we haven't already sent them, then we need to find the information to send and send it. If it is a file, then we want to use a sub-request. This takes advantage of Apache's request processing to send the file for us. This file will also be filtered if any filters are configured for it. The best way to think of a sub-request in Apache is that it is the same as requesting the file through a browser. The ap_run_sub_req function will actually send the file to the client, so once we have called this function we do not need to do anything else to finish this portion of the request.

If we are using a configured string as the header, then we need to create a bucket to store the string before we can pass it down to the next filter. In this case, we are going to use an immortal bucket because the memory for the string was allocated earlier, and we don't want to copy it again if we can help it. Once we have created the bucket, we have to actually put it in the brigade. If we don't put it in the brigade then none of the later filters will get to see the data. Since the final filter is the thing that actually sends the data to the client, we need to be sure that the data is sent to the next filter. Actually sending the data is done later in the function, as we will see below. It is also important that this bucket goes at the beginning of this brigade. Putting this bucket at the start of the brigade is accomplished with AP_BRIGADE_INSERT_HEAD.
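
The two ideas in this step, a bucket that points at existing memory without copying it and an insert at the head of the brigade, can be modeled with a small doubly linked list. This is only a toy sketch under that assumption. The toy_ names are invented here, and Apache's real buckets and brigades are considerably richer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy bucket: refers to text it does not own, like an immortal bucket. */
typedef struct toy_bucket {
    const char *data;
    size_t len;
    struct toy_bucket *prev, *next;
} toy_bucket;

typedef struct toy_brigade {
    toy_bucket *head, *tail;
} toy_brigade;

/* Creates a bucket that points at `text` without copying it. */
static toy_bucket *toy_bucket_immortal(const char *text, size_t len)
{
    toy_bucket *b = calloc(1, sizeof(*b));

    if (b != NULL) {
        b->data = text;
        b->len = len;
    }
    return b;
}

/* Links `b` in as the last bucket of `bb`. */
static void toy_insert_tail(toy_brigade *bb, toy_bucket *b)
{
    assert(bb != NULL && b != NULL);
    b->next = NULL;
    b->prev = bb->tail;
    if (bb->tail != NULL) {
        bb->tail->next = b;
    } else {
        bb->head = b;
    }
    bb->tail = b;
}

/* Links `b` in as the first bucket of `bb`. */
static void toy_insert_head(toy_brigade *bb, toy_bucket *b)
{
    assert(bb != NULL && b != NULL);
    b->prev = NULL;
    b->next = bb->head;
    if (bb->head != NULL) {
        bb->head->prev = b;
    } else {
        bb->tail = b;
    }
    bb->head = b;
}

/* Copies the bucket contents, in order, into `out` for inspection. */
static void toy_flatten(const toy_brigade *bb, char *out, size_t outsz)
{
    const toy_bucket *b;
    size_t used = 0;

    assert(bb != NULL && out != NULL && outsz > 0);
    for (b = bb->head; b != NULL; b = b->next) {
        assert(used + b->len < outsz);
        memcpy(out + used, b->data, b->len);
        used += b->len;
    }
    out[used] = '\0';
}

/* Frees the buckets but not the text they point at. */
static void toy_brigade_free(toy_brigade *bb)
{
    toy_bucket *b = bb->head;

    while (b != NULL) {
        toy_bucket *next = b->next;
        free(b);
        b = next;
    }
    bb->head = bb->tail = NULL;
}
```

Because the header bucket only stores a pointer, inserting it costs the same no matter how long the configured string is. The trade-off is that the string must outlive the bucket, which holds for configuration data in the module.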

The final step in this filter is to deal with the footers:


    e = AP_BRIGADE_LAST(bb);
    if (AP_BUCKET_IS_EOS(e)) {
        ap_bucket_brigade *end;
 
        end = ap_brigade_split(bb, e);
        if (!AP_BRIGADE_EMPTY(bb)) {
            ap_pass_brigade(f->next, bb);
        }
        ap_brigade_destroy(bb);
        bb = end;
        if (conf->footerf) {
            request_rec *rr;
 
            rr = ap_sub_req_lookup_uri(conf->footerf, f->r);
            if (rr->status == HTTP_OK) {
                ap_run_sub_req(rr);
            }
        }
        if (conf->footert) {
            ap_bucket *foot;
 
            foot = ap_bucket_create_immortal(conf->footert, strlen(conf->footert));            
            AP_BUCKET_INSERT_BEFORE(e, foot);
        }
    }
    ap_pass_brigade(f->next, bb);
    return APR_SUCCESS;
}                                                                                  

As I mentioned above, determining when to send the footers is done differently than the headers. In this case, we only want to send the footers when we know that we are at the end of the response. Filters know when they are seeing the end of a response, because they are given an EOS bucket. An EOS bucket is a special bucket created by Apache to signify the "end of stream". Another feature of EOS buckets is that they are always at the end of a brigade, so to determine if we are at the end, we get the last bucket in the brigade and check if it is an EOS bucket. If it isn't, then we just pass the data we have just been given to the next filter with the ap_pass_brigade function.
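
The end-of-stream check can be sketched on its own. In this toy model a brigade is just an array of buckets, and the toy_ names are invented for the example. It shows that when a response arrives over several calls, only the call whose last bucket is EOS triggers the footer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy bucket kinds; Apache's real buckets carry far more than this. */
typedef enum { TOY_DATA, TOY_EOS } toy_kind;

typedef struct toy_bucket {
    toy_kind kind;
    const char *data;       /* unused for TOY_EOS */
} toy_bucket;

/* Returns 1 when the brigade's last bucket is EOS, otherwise 0. */
static int toy_ends_stream(const toy_bucket *brigade, size_t count)
{
    assert(brigade != NULL || count == 0);
    return count > 0 && brigade[count - 1].kind == TOY_EOS;
}

/*
 * Counts how many of `ncalls` brigades would trigger the footer. For a
 * well-formed response this is exactly one: the call that carries EOS.
 */
static int toy_footer_calls(const toy_bucket **brigades,
                            const size_t *counts, size_t ncalls)
{
    int footers = 0;
    size_t i;

    assert(brigades != NULL && counts != NULL);
    for (i = 0; i < ncalls; i++) {
        if (toy_ends_stream(brigades[i], counts[i])) {
            footers++;
        }
    }
    return footers;
}
```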

If this brigade does have the end of the request, then we need to send out the footers for this page. To make processing easier, we treat both files and strings the same. In this case, that means sending the data we have before we try to send the footer data. If we didn't do this, and we tried to send a file using a sub-request, then the file would be delivered before the data we have just been passed. To send the data we have, we must split the brigade just before the EOS bucket. This is done with ap_brigade_split. If the original brigade, which now holds everything that came before the EOS bucket, is not empty, then we need to send it to the next filter. Once we have either sent the brigade to the next filter or done nothing with it, we must destroy the brigade, or we will leak memory.
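
Splitting a brigade at a given bucket is easy to picture with a toy doubly linked list. The sketch below is not Apache's implementation, and the toy_ names are invented for the example. It moves the chosen bucket and everything after it into a second brigade and leaves the earlier buckets where they were.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct toy_bucket {
    int is_eos;
    struct toy_bucket *prev, *next;
} toy_bucket;

typedef struct toy_brigade {
    toy_bucket *head, *tail;
} toy_brigade;

/* Appends a newly allocated bucket; returns it, or NULL on failure. */
static toy_bucket *toy_append(toy_brigade *bb, int is_eos)
{
    toy_bucket *b = calloc(1, sizeof(*b));

    assert(bb != NULL);
    if (b == NULL) {
        return NULL;
    }
    b->is_eos = is_eos;
    b->prev = bb->tail;
    if (bb->tail != NULL) {
        bb->tail->next = b;
    } else {
        bb->head = b;
    }
    bb->tail = b;
    return b;
}

/* Moves `e` and every bucket after it from `bb` into the empty `rest`. */
static void toy_split(toy_brigade *bb, toy_bucket *e, toy_brigade *rest)
{
    assert(bb != NULL && e != NULL && rest != NULL && rest->head == NULL);
    rest->head = e;
    rest->tail = bb->tail;
    bb->tail = e->prev;
    if (e->prev != NULL) {
        e->prev->next = NULL;
    } else {
        bb->head = NULL;    /* e was the first bucket, so bb is now empty */
    }
    e->prev = NULL;
}

/* Counts the buckets in `bb`. */
static size_t toy_length(const toy_brigade *bb)
{
    const toy_bucket *b;
    size_t n = 0;

    for (b = bb->head; b != NULL; b = b->next) {
        n++;
    }
    return n;
}

/* Frees every bucket in `bb` and leaves it empty. */
static void toy_free(toy_brigade *bb)
{
    toy_bucket *b = bb->head;

    while (b != NULL) {
        toy_bucket *next = b->next;
        free(b);
        b = next;
    }
    bb->head = bb->tail = NULL;
}
```

When the EOS bucket is the only bucket, the first brigade ends up empty after the split, which is the case the filter guards against before passing it on.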

At this point, sending the footer file is the same as sending the header file, so we won't go into any detail about that. Sending the footer string starts by creating the bucket, the same as for the header string. This time, instead of inserting the bucket at the start of the brigade, we want to insert it just before the EOS bucket. We know that e is pointing to the EOS bucket, because we set that up earlier in the function, so we use AP_BUCKET_INSERT_BEFORE to make sure that the footer bucket is inserted in the correct place.
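
Inserting before a known bucket can also be sketched with a toy doubly linked list. The toy_ names below are invented for the example, and a NULL data pointer stands in for the EOS bucket. The point is that the new bucket takes the position just ahead of the marker, and becomes the head of the brigade when the marker was first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct toy_bucket {
    const char *data;       /* NULL marks the EOS bucket in this toy model */
    struct toy_bucket *prev, *next;
} toy_bucket;

typedef struct toy_brigade {
    toy_bucket *head, *tail;
} toy_brigade;

/* Allocates an unlinked bucket that points at `data`; NULL on failure. */
static toy_bucket *toy_new(const char *data)
{
    toy_bucket *b = calloc(1, sizeof(*b));

    if (b != NULL) {
        b->data = data;
    }
    return b;
}

/* Links `b` in as the last bucket of `bb`. */
static void toy_append(toy_brigade *bb, toy_bucket *b)
{
    assert(bb != NULL && b != NULL);
    b->next = NULL;
    b->prev = bb->tail;
    if (bb->tail != NULL) {
        bb->tail->next = b;
    } else {
        bb->head = b;
    }
    bb->tail = b;
}

/* Links `b` into `bb` immediately before `pos`, which must be in `bb`. */
static void toy_insert_before(toy_brigade *bb, toy_bucket *pos, toy_bucket *b)
{
    assert(bb != NULL && pos != NULL && b != NULL);
    b->next = pos;
    b->prev = pos->prev;
    if (pos->prev != NULL) {
        pos->prev->next = b;
    } else {
        bb->head = b;       /* pos was first, so b becomes the new head */
    }
    pos->prev = b;
}

/* Frees every bucket in `bb` and leaves it empty. */
static void toy_free(toy_brigade *bb)
{
    toy_bucket *b = bb->head;

    while (b != NULL) {
        toy_bucket *next = b->next;
        free(b);
        b = next;
    }
    bb->head = bb->tail = NULL;
}
```

In the filter, the brigade at this point holds only the EOS bucket, so the footer bucket ends up first and EOS stays last.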

All we have left to do now is recompile the server with our new module and set it up to use the header and footer module. In this case, adding the filter to the filter chain is best done in the config file with the AddOutputFilter directive, which is defined by the core. Edit your config file and find the Directory stanza for the directory you wish to add a header or footer to. The AddOutputFilter directive takes the name of the filter to add as its argument. In this case it is HEADERFOOTER. As long as we are here, let's also add lines to configure a header and footer for the files in this directory. Be careful when adding a header and footer to a page. If you add a header file to a whole directory, and the header file is in that directory, then your server will recursively send the header file forever. In the configuration below, we are only adding the header and footer for the index page.

<Directory "/home/rbb/apachebin1/htdocs">
    AddOutputFilter HEADERFOOTER
    
        HeaderFile header
        HeaderText "This is a header text"
        FooterFile footer
        FooterText "This is a footer text"
                                                                           
</Directory>

That was a very basic introduction to output filtering. Next month, we continue looking into Apache 2.0's filtering with input filtering. This will allow your module to modify the data that a browser sends on a POST or PUT request. I should also mention here that filtering is still evolving. This module works with Apache as it is right now. This module has not been tested with any of the official alpha releases of 2.0, but with slight modifications, it should work with any of the alphas after alpha 7.

You can download the mod_hf.c file here.
