Analyzing IIS6 Web Logs With AWStats

Wednesday Dec 14th 2005 by Bruno Zambetti

Need to know what users are visiting on your Web sites? Try this free, open source Web log analyzer, which you can use with IIS6-hosted Web sites and applications.

Web server logs are a powerful resource for extracting information about Web sites and applications. When logging is enabled, Web servers log information about each request. Analyzing the logs reveals information about which resources are popular, what kinds of browsers people are using, how much bandwidth a Web site consumes, the request trend during a given time span, and so on.

Because log analysis is so important, many commercial software applications create useful and good-looking reports. Unfortunately, most are also very expensive. In addition, some log analyzers have a performance impact on the server itself. For example, recent versions of WebTrends (one of the more popular commercial packages) require considerable RAM and processor time. If you can't afford to devote an entire server to Web log analysis and report creation, then you need an inexpensive and fast solution that isn't terribly resource-intensive.

Fortunately, such a solution exists, and its name is AWStats.

AWStats

AWStats is a free, open source log analyzer. It can analyze a wide variety of logs, and it creates good reports. They are not the full-featured, interactive, special-effect-filled reports that costly commercial software generates, but they are still useful, with all the important data presented tastefully. It is also very fast, and it doesn't consume many resources. In fact, in offline mode AWStats consumes no resources at all most of the time, because it runs only when you generate reports.

AWStats is a Perl application with a simple structure. It requires only a Perl interpreter and has a small footprint on the server. The Perl source is neither very clear nor particularly readable, so it's not easy to change the behavior of the software, but you can use it "as is" for most common needs.

AWStats works in two modes. The first is interactive, or "online," mode. When used in online mode, AWStats updates its reports on request. The second is "offline" mode, in which AWStats analyzes data and creates static reports as HTML pages publishable via any standard Web server.

This article discusses how to install and use AWStats only in offline mode, which is preferable because it minimizes security risks and resource usage and so does not undermine Web server performance. For example, in offline mode you can schedule report generation to run during off-peak hours.

This article also looks at how to install AWStats, analyze logs, and publish generated reports from IIS 6 log files. Code referenced in this article is available for download.

Installing AWStats

To install AWStats on Windows, download a Perl interpreter (if you don't already have one) and the AWStats software (scripts). I recommend ActivePerl 5.8, which you can download from http://www.activestate.com/ for free. ActivePerl 5.8 installs using a standard MSI installer. During the installation process, you have the option of installing an ISAPI Perl extension, but I recommend not installing it, as you don't need it to run AWStats in offline mode. However, if you do want to run AWStats (or other Perl scripts) online, go ahead and install it.

Next, download AWStats, which as of press time is in version 6.4.

Author's Note: Be aware that earlier AWStats versions have known security problems, so if you're already running AWStats and haven't updated to the latest version, do so immediately. If you're planning to install AWStats, make sure you get the latest version before performing the install.

The download is a .zip file that you can expand wherever you want. The AWStats .zip file contains three folders: docs, tools, and wwwroot. I suggest deleting the docs folder. You can find documentation online, or you can copy it to your workstation, but you don't need it on your production server. Next, create a new folder where you will copy only the files you really need to run AWStats, both to minimize the installation complexity and to create a "personal distribution" of the software. For example, you can create an E:\myApps\awstats-6.4\bin folder. This is the folder I've used for the sample application; I'll refer to it from here on out as simply the bin folder.

Here's the rest of the procedure:

  1. Copy to bin the folders named css, icon, lang, lib, and plugin from the wwwroot\cgi-bin folder created when you unzipped the AWStats download.
  2. From that same folder, copy the file awstats.pl.
  3. Next, copy the awstats_buildstaticpages.pl file from the tools folder.
  4. Finally, create a bin\dirdata folder that you'll use to manage AWStats' databases.

That's it. You don't need any of the other files and folders from the download to run AWStats in offline mode.
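If you'd rather script the copy than do it by hand, the steps above can be sketched in Python along the following lines. This is only an illustration: the function name and its two arguments are hypothetical, and the folder names are taken directly from the steps above, so adjust them if your AWStats release lays out its files differently.

```python
import shutil
from pathlib import Path

# Folders the steps above copy from wwwroot\cgi-bin into the bin folder.
CGI_BIN_FOLDERS = ["css", "icon", "lang", "lib", "plugin"]


def build_bin_folder(unzipped_root: Path, bin_dir: Path) -> None:
    """Copy only the files needed for offline mode into bin_dir."""
    cgi_bin = unzipped_root / "wwwroot" / "cgi-bin"
    tools = unzipped_root / "tools"

    bin_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: the resource folders.
    for name in CGI_BIN_FOLDERS:
        shutil.copytree(cgi_bin / name, bin_dir / name, dirs_exist_ok=True)

    # Steps 2 and 3: the two scripts.
    shutil.copy2(cgi_bin / "awstats.pl", bin_dir / "awstats.pl")
    shutil.copy2(
        tools / "awstats_buildstaticpages.pl",
        bin_dir / "awstats_buildstaticpages.pl",
    )

    # Step 4: the working folder for the AWStats databases.
    (bin_dir / "dirdata").mkdir(exist_ok=True)
```

You would call it with the folder where you expanded the .zip file and the bin folder you chose, for example build_bin_folder(Path(r"E:\temp\awstats-6.4"), Path(r"E:\myApps\awstats-6.4\bin")), where the first path is just a placeholder.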

You'll also use the bin folder to hold the configuration files defined later in this article as well as batch scripts to automatically run the application.

This article was originally published on DevX.com.

AWStats' Structure

You control AWStats by creating or (more often) altering configuration files. A configuration file is a text file with a .conf extension that contains information about a Web site's logs. You must create a configuration file for each Web site you want to analyze.

You should name your configuration files using the naming pattern awstats.CONFIGNAME.conf, where "CONFIGNAME" is variable and is the name of the configuration you want to create. You can later refer to specific configuration files using that variable name. For example, when you instruct AWStats to apply a "CONFIGNAME" configuration, it will look for a file named awstats.CONFIGNAME.conf in the bin folder.

A configuration file can "include" options defined in another file, so it's best to create a base/general configuration file that contains the standard options you want to apply to all reports, and then define a specific file for each Web site. The site-specific file contains information such as where the logs to analyze are, how to process them for this site, and so on. Extracting the standard options into a separate general file and then including that in the site-specific files simplifies managing a hosting environment where you want to analyze data for many Web sites.

AWStats comes with a template configuration called awstats.model.conf that you can use to create new files (you can find it in the wwwroot\cgi-bin folder from the installation).

Put all .conf files in the bin folder you created earlier.
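To make the naming pattern concrete, here is a small Python sketch that maps a configuration name to its file name and lists the configuration names found in a folder. The helper names are hypothetical and are not part of AWStats; they only mirror the awstats.CONFIGNAME.conf convention described above.

```python
from pathlib import Path

PREFIX = "awstats."
SUFFIX = ".conf"


def config_file_name(config_name: str) -> str:
    """Return the file name AWStats looks for when given -config=config_name."""
    return f"{PREFIX}{config_name}{SUFFIX}"


def list_config_names(bin_dir: Path) -> list[str]:
    """Return the CONFIGNAME part of every awstats.*.conf file in bin_dir."""
    names = []
    for path in sorted(bin_dir.glob("awstats.*.conf")):
        # Strip the "awstats." prefix and the ".conf" suffix.
        names.append(path.name[len(PREFIX):-len(SUFFIX)])
    return names
```

Note that the listing also returns shared files such as a base configuration, so filter those out before using the names to drive report generation.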

IIS Logs

IIS 6 can record data in either log files or database tables; the precise fields it records are configurable. To use IIS6 with AWStats, you need to make sure that IIS6 logging is turned on and that it's saving data to log files, and you need to specify which data fields to collect and how to write them. You can control these options from the IIS MMC snap-in management application, editing the log properties for a specific Web site or for all Web sites simultaneously. You can launch the IIS MMC through Control Panel by double-clicking the Administrative Tools item, and then Internet Information Services.

After launching the IIS MMC:

  • Enable extended logging capabilities by checking the "Enable logging" option.
  • Accept the default "W3C Extended Log File Format" option.
  • Next, click the Properties button to edit the logging options.
  • Set up a log schedule (I suggest monthly, because that schedule makes it easy to manage through AWStats configuration). Check the "Use local time for file naming and rollover" option; however, note that doing so doesn't change the way IIS records time values in the log, only how it names and rolls over the files.
  • Select a log file folder (I suggest moving the logs to a different one than the default, so you can have a simple log storage path for all your Web sites).
  • Set the fields that IIS will log. To do that, select the Advanced tab, and check only these options: date, time, client IP, user name, method, URI stem, URI query, protocol status, bytes sent, protocol version, user agent, and referrer. Check your selections, because AWStats log analysis depends on having the correct format.

Finally, if there's an existing and active log file for your Web sites, delete (or rename) it, which will force IIS to create a new one and begin filling it using the new format. If you cannot access this file because IIS is using it, restart IIS by entering iisreset.exe at a command prompt.
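W3C extended log files begin with directive lines, including a #Fields: line that lists the recorded fields in order. A quick way to confirm that a new log file has the format you intended is to compare that line with the list you expect. The Python sketch below does this; the function names are hypothetical, and the expected list uses the W3C identifiers for the options selected above, in the order of the LogFormat value given later in this article.

```python
# W3C identifiers for the fields selected in the IIS MMC, in the order
# used by the LogFormat setting shown later in this article.
EXPECTED_FIELDS = [
    "date", "time", "cs-method", "cs-uri-stem", "cs-uri-query",
    "cs-username", "c-ip", "cs-version", "cs(User-Agent)",
    "cs(Referer)", "sc-status", "sc-bytes",
]


def read_fields(lines):
    """Return the field list from the last #Fields: directive, or None."""
    fields = None
    for line in lines:
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
    return fields


def fields_match(lines, expected=EXPECTED_FIELDS):
    """True when the log declares exactly the expected fields, in order."""
    return read_fields(lines) == list(expected)
```

To check a real file, open it and pass the file object as lines, for example fields_match(open(path, encoding="latin-1")). If the result is False, review the field selections before you point AWStats at the log.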

Configuration Profiles

It's best to edit AWStats configuration files using WordPad (not Notepad or another Windows text editor), because the provided templates are in Unix file format, which ends each line with a single line feed (LF) character rather than the carriage return/line feed (CR/LF) pair that Windows uses, so they are not easy to manage with Notepad. A typical configuration file for a Web site might look like the following (note that I've removed all of the AWStats template notes, but you can find them included in the downloadable code).

   Include "awstats.common.conf"
   LogFile="E:\Logs\Web\W3SVC529796009\ex%YY-0%MM-0.log"
   SiteDomain="www.website.com"
   HostAliases="www.website.com"

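These settings tell AWStats to look for IIS log files in a specific directory. With a monthly log schedule, IIS creates file names dynamically using an exYYMM.log pattern. For example, for November 2005, IIS would use a file named ex0511.log. The %YY-0 and %MM-0 tags in the LogFile value stand for the current two-digit year and month, so the setting above resolves to those file names. You can refer to the AWStats documentation to check supported patterns.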
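To see which file name a LogFile pattern resolves to on a given date, you can mimic the tag substitution in a few lines of Python. This is a rough sketch and not AWStats' own code: the function name is hypothetical, it handles only the year, month, day, and hour tags, and it assumes that the number after the hyphen is an offset in hours to subtract from the current time, which is how I read the AWStats documentation.

```python
import re
from datetime import datetime, timedelta

# Longer tags come first so that %YYYY is not matched as %YY.
_TAG = re.compile(r"%(YYYY|YY|MM|DD|HH)-(\d+)")
_STRFTIME = {"YYYY": "%Y", "YY": "%y", "MM": "%m", "DD": "%d", "HH": "%H"}


def expand_logfile(pattern: str, now: datetime) -> str:
    """Replace %TAG-n placeholders using the time n hours before `now`."""

    def replace(match):
        tag = match.group(1)
        hours_back = int(match.group(2))
        moment = now - timedelta(hours=hours_back)
        return moment.strftime(_STRFTIME[tag])

    return _TAG.sub(replace, pattern)
```

For example, expanding r"E:\Logs\ex%YY-0%MM-0.log" (a shortened, made-up path) for any moment in November 2005 gives a name ending in ex0511.log, which matches the monthly file IIS writes.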

The included file awstats.common.conf is the "base" file discussed above that you can use to manage all shared configuration options. You'll "include" this file with more specific configurations. Create the file as a copy of the awstats.model.conf template provided with the AWStats installation. You can leave all options as defined in this template, remove all "include" commands, and overwrite the following entries:

   LogFormat="date time cs-method cs-uri-stem cs-uri-query cs-username c-ip cs-version cs(User-Agent) cs(Referer) sc-status sc-bytes"
   
   DirData="E:\myApps\awstats-6.4\bin\dirdata"
   
   DirIcons="icon"
   
   Logo="logo_huge.gif"
   LogoLink="http://www.huge.it"
   
   LoadPlugin="timezone +1"

In the preceding code, LogFormat is one of the most important options. LogFormat=2, the predefined option for IIS logs, does not work on IIS 6 because IIS 6 records the fields in an order that AWStats doesn't recognize. Instead, you must explicitly list the fields in the order shown above. The pattern I provided works well on IIS 6, so you can simply copy and paste it.
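The reason the field order matters is that each log line is just a row of space-separated values, and the LogFormat string is what gives every position its meaning. The Python sketch below illustrates the idea; it is not how AWStats itself is implemented, the function name is hypothetical, and the sample line in the usage note is made up.

```python
LOG_FORMAT = (
    "date time cs-method cs-uri-stem cs-uri-query cs-username c-ip "
    "cs-version cs(User-Agent) cs(Referer) sc-status sc-bytes"
)


def parse_line(line, log_format=LOG_FORMAT):
    """Map one W3C log line onto the field names listed in log_format.

    Returns None for directive lines (starting with '#') and for lines
    whose number of values doesn't match the number of field names.
    """
    if line.startswith("#"):
        return None
    names = log_format.split()
    values = line.split()
    if len(values) != len(names):
        return None
    return dict(zip(names, values))
```

Given a line such as "2005-11-14 10:15:32 GET /default.htm - - 192.168.0.10 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0) - 200 5120", the result maps c-ip to 192.168.0.10 and sc-status to 200. If the fields were listed in a different order, the same values would land under the wrong names, which is the kind of mismatch that makes AWStats reject rows.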

DirData is the working folder for AWStats. AWStats saves its working files (its statistics databases) in this folder. Although you can change it to whatever you like, the example uses the bin\dirdata folder you created during the installation.

DirIcons is the folder that contains all graphic files used to create reports.

Logo defines the file name of the logo to publish on the reports (typically your company logo), and LogoLink defines the URL that the logo links to. In this case, you may put the logo file in the bin\icon\other folder. You can leave the standard configuration values for experimentation purposes, but you should be aware that you can create reports with a custom logo.

The LoadPlugin timezone option tells AWStats to "correct" time values found in log files. IIS records data using Greenwich Mean Time (GMT), so, if you want your reports to be based on your local time zone, you have to correct the values from the logs. For example, "timezone +1" adds 1 hour to time values recorded by IIS (+1 is Italy's time zone, so with this setting I can read reports with results based on my local time). While AWStats manages time zone adjustment using this setting, it doesn't correct for daylight saving time, so you must manually update this value when daylight saving time starts and ends.
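The adjustment itself is simple arithmetic on the date and time columns of each log row. As a minimal Python illustration (the function name is hypothetical, and this is not the plugin's actual code):

```python
from datetime import datetime, timedelta


def to_local(date_text: str, time_text: str, offset_hours: int) -> datetime:
    """Shift a GMT date/time pair from an IIS log by a fixed number of hours."""
    gmt = datetime.strptime(f"{date_text} {time_text}", "%Y-%m-%d %H:%M:%S")
    return gmt + timedelta(hours=offset_hours)
```

With an offset of +1, a request logged at 23:30:00 on 2005-11-14 is counted at 00:30:00 on 2005-11-15, so the offset can move hits from one day (or one month) to the next in your reports.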

Running AWStats

Running AWStats is quite simple: You need to run only awstats.pl and awstats_buildstaticpages.pl; however, you do need to provide some parameters. The main parameter is -config, which defines the configuration file used by the program to analyze data and create reports.

Eventually, you'll schedule report generation automatically, but at first it's useful to run AWStats directly from the command line, both to see how it works and to test that your configuration files are OK. First, create an awstats.www.companysite.com.conf file in your bin folder as described in the previous section. You will need to have an IIS log file to analyze, so you can point your .conf file at the log file you specified using the IIS MMC.

Now select Start, Run, and enter the command cmd.exe /f:on. In the command window that opens, change directories to the bin folder that contains all the AWStats resources.

To run your first analysis, enter the command:

   awstats.pl -config=www.companysite.com

That command causes AWStats to open your log file, analyze it, show the results, and then end. If everything worked, you'll find a new file in your bin\dirdata folder. The actual file name depends on the configuration file name, the analyzed month, and so on. If any problems occur (you can check this from the results report), make sure that your Perl interpreter is installed and working, that your .conf files are defined correctly, and that your IIS log file is in the correct format and exists in the location where AWStats expects to find it. If you find that AWStats failed to decode only some rows (the results will indicate such problems), don't panic. If most rows are decoded, that's OK; otherwise you probably have problems with your IIS log format.

Run the same command again. AWStats now runs very fast, skipping all the records it's already analyzed. That's because it builds on old data, checking and analyzing only new records in the IIS log.

After analyzing the log data, you can make AWStats create a report. Create a folder to contain the report (for this example, I used E:\Reports), and copy the bin\icon folder to it. Also create a www.companysite.com directory in this folder, to archive all report files for this particular configuration/Web site (you can create a different folder for each configuration/Web site you manage). Now, from the same command window you used previously, run this command:

   awstats_buildstaticpages.pl -update -config=www.companysite.com -dir=E:\Reports\www.companysite.com -diricons=../icon

Make sure to change the path to the report path you created. The preceding command causes AWStats to create an HTML-formatted report in the specified folder. You can open the HTML file with a browser to see the analyzed data.

As you can see, after setting up the correct folders and configuration, analyzing log data and creating reports requires running only two scripts. In fact, you actually need to run only awstats_buildstaticpages.pl, because with the -update option it runs awstats.pl for you. Remember, you launched awstats.pl manually only to check the output and verify that the environment is OK.

So, scheduling report generation is simple; however, you still need to keep security in mind, a topic I'll discuss a little later.

Some Tips for Running AWStats

AWStats is powerful and simple to use, but it does have some problems running on Windows. To save you time, I've created a list of tips that you can use to run AWStats with fewer problems.

  • AWStats creates a cache of data it analyzed for each Web site (one cache for each .conf file created). Every time you run it, it checks to see if the cache contains previously analyzed data; if so, it uses those to avoid reanalyzing the entire log file. Instead, it starts reading the log immediately after the last line read during the previous execution. So, if you need to clear all the data and reanalyze your logs, you must delete the cache files. You can find them in the dirdata folder you created during your installation.
  • AWStats analyzes data in strict sequence. So, for example, if you have already analyzed the October log file, you can't later analyze the September log. If you must work out of sequence, first delete the cache files from your dirdata folder, and then analyze the September log before the October one.
  • AWStats skips log files with incorrect formatting. If that happens, stop IIS 6, rename (or delete) the current log file, make sure the log file options are correct, and then restart the Web server. IIS 6 will then create a new log file with the correct format, which you can analyze with AWStats.
  • AWStats can do DNS lookups of IP addresses that it finds in log files. This can be a nice feature because it gives you more information about where the requests are coming from, but it also requires a lot of time because AWStats must query the DNS server for each IP. So, despite the advantages, it's usually better not to enable this feature.
  • By default, AWStats focuses on "monthly" reports, analyzing and creating reports with a month-centric view (in fact, it focuses the report on the current month by default). If you want different reports, you can specify a particular month or date range.
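The caching behavior described in the first two tips is easier to reason about with a toy model. The Python sketch below is not AWStats' actual mechanism (its cache files hold far more than a position); it is a hypothetical stand-in that remembers how far into a log it has read, returns only newer lines on the next run, and starts over when the state file is deleted.

```python
import json
from pathlib import Path


def read_new_lines(log_path: Path, state_path: Path) -> list[str]:
    """Return complete log lines added since the previous call."""
    offset = 0
    if state_path.exists():
        offset = json.loads(state_path.read_text())["offset"]

    with log_path.open("rb") as log:
        log.seek(offset)
        data = log.read()

    # Consume only up to the last newline, so that a line the Web server
    # is still writing is picked up on the next run instead.
    end = data.rfind(b"\n") + 1
    state_path.write_text(json.dumps({"offset": offset + end}))
    return data[:end].decode("latin-1").splitlines()
```

Calling it twice in a row returns nothing the second time, which is why a repeated AWStats run finishes so quickly. Deleting the state file makes the next call return every line again, which corresponds to deleting the cache files in dirdata.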

Scheduling AWStats

Even though you're using AWStats in offline mode, you probably want to create updated reports automatically. The simplest way to do this is to use the Windows scheduler. First, create a new standard user account on the Web server (or in Active Directory) with no extended rights (you can create the account in the Users group). It's a good idea to assign a strong password to this account. You'll use the account to create a scheduled task and nothing else.

Next, create a batch file to launch data analysis for each Web site for which you want to create reports. Here's a typical batch file that analyzes three separate Web sites on the server:

   start /low /wait awstats_buildstaticpages.pl -update -config=www.companysite.com -dir=E:\Logs\Reports\www.companysite.com -diricons=../icon

   start /low /wait awstats_buildstaticpages.pl -update -config=www.companysite2.com -dir=E:\Logs\Reports\www.companysite2.com -diricons=../icon

   start /low /wait awstats_buildstaticpages.pl -update -config=www.huge.it -dir=E:\Logs\Reports\www.huge.it -diricons=../icon
 

Save the file with a .bat extension, keeping each command on a single line. Note that the batch file uses the start command rather than running the Perl scripts directly, because you cannot otherwise define a task priority or a maximum CPU usage value when you run a Perl program. Running the commands with start and passing the /low parameter runs Perl in low-priority mode, which lets the Windows process scheduler assign more CPU time to other programs and reduces the log analyzer's impact on the overall system. The /wait option causes start to wait until program execution completes before running the next command. If you omit the /wait option, the batch file will launch all the defined AWStats processes (three in this case) at the same time, which will consume too many server resources.
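If you manage many Web sites, you may prefer to generate the batch file rather than maintain it by hand. The Python sketch below builds the same start /low /wait lines from a list of configuration names. The function name and arguments are hypothetical; the command line itself is copied from the batch file above.

```python
def build_batch(config_names, report_root):
    """Return batch file text with one report command per configuration."""
    lines = []
    for name in config_names:
        lines.append(
            "start /low /wait awstats_buildstaticpages.pl -update "
            f"-config={name} -dir={report_root}\\{name} -diricons=../icon"
        )
    # Batch files conventionally use Windows (CR/LF) line endings.
    return "\r\n".join(lines) + "\r\n"
```

For example, build_batch(["www.companysite.com", "www.huge.it"], r"E:\Logs\Reports") returns two command lines. When you save the text, open the file in binary mode or with newline="" so that Python doesn't translate the line endings a second time.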

Using a similar batch file and the user account you created, you can set up a Windows scheduled task to update your log reports at off-peak times and at convenient intervals.

Security and NTFS settings

When you schedule (or run) a program, it's best to restrict its permissions as much as possible. To create reports, you can assign NTFS permissions to the user account you created to run the scheduled task. You'll have to assign these permissions:

  • Execute, on c:\program files\perl
  • List folder contents only (or the less restrictive read option) on the root folder where you put the AWStats files (the root of the disk containing the \bin folder)
  • Execute, on the \bin folder
  • Modify, on the \bin\dirdata folder
  • Read, on the folder containing the IIS logs
  • Modify, on the folder where you want AWStats to create the report files

Setting only these permissions restricts the file and folder access of the user account running AWStats, which is a common "best practice" for every server and application.

AWStats has many additional options and features beyond the ones mentioned here. You can find the complete documentation online, so you can easily find and test additional features yourself. The documentation is quite Linux-centric, but after you get AWStats working on Windows as described in this article, you'll find that you can refer to the documentation with few problems.
