Getting the Most From Your Log Files

by Charlie Morris

Every Web site has a different set of goals, but there's one thing we all have in common: We want more traffic! Although a sure-fire way to build Web site traffic quickly remains as elusive as a sure-fire way to predict stock prices, there are some tried-and-true methods that can help you build your Web site traffic slowly but surely.

The ambitious site owner will use various promotional tactics on an ongoing basis, but this article is not about any one traffic-building technique. It's about using your Web server log files to direct your efforts and measure your success. You don't have time to do everything, so you need to figure out what works, and spend your time accordingly. Careful analysis of the information in your log files can give you lots of promising traffic-building ideas, and also help you measure which ones live up to their promise.

There are as many different reasons for having a Web site as there are businesses or individuals, and every site has a different set of goals. There's one thing we all have in common, however: We want more traffic! If you're selling products online, then more traffic means more potential customers. If you're advertising a product or service, more traffic means more people get your message. If you're running an ad-supported site, then more traffic means more page impressions, and more money in your pocket.

If I had some sure-fire way to build Web site traffic quickly, I'd be fabulously well-to-do, just as I would if I had a sure-fire way to predict stock prices. There are some tried-and-true methods, however, and while they can't create instant success, if applied diligently over time, they'll help you build your Web site traffic slowly but surely.

  • Submit your site to the search engines and directories, re-submitting periodically.
  • Study how the search engines work, and tweak your site to try to maximize your search rankings.
  • Solicit links from related sites.
  • Run ads on appropriate Web sites and mailing lists. These may be paid ads, or banner swaps with related sites. Automated banner exchange programs like LinkExchange and HyperBanner are also useful.
  • Build a database of press contacts, and send out press releases about newsworthy events concerning your company or your site.
  • Participate in discussion groups and mailing lists that are relevant to your business, and discreetly plug your site and yourself.
  • Constantly develop good new content. This is the only sure way to grow traffic in the long run, but it also happens to be a lot of work.

The ambitious site owner will use all of these tactics on an ongoing basis, but this article is not about any one traffic-building technique. It's about using your Web server log files to direct your efforts and measure your success. A good marketer can always think of lots of things to do to promote a site, far more than could ever be accomplished with the time and money available. You can't do everything, so you need to figure out exactly what works, and direct your efforts to the most effective tactics. Careful analysis of the information in your log files can give you lots of promising traffic-building ideas, and also help you measure which ones live up to their promise.



, an internet.com Web site.

How Log Files Work

Every time a file is retrieved from a Web site, the server software keeps a record of it (assuming that logging is turned on). The server stores this information in text files (usually with a .txt or .log extension) called the Access Log, Error Log and Referrer Log. The log files contain not only a record of which pages were requested at which times, but a good bit of information about the people (or other entities) that requested them.

As you can imagine, log files can get huge very quickly, and take up an enormous amount of expensive hard drive space at your hosting service. Therefore, most Web servers are set up to "rotate" or "cycle" the log files in some way, to make sure that all the files get saved, but that they don't hang around on the server. A simple way to do this is to have the server automatically email a copy of the log files to somebody periodically. This lucky individual transfers them to some permanent storage location, and the server automatically purges the original log files after a certain amount of time.

If you want to have decent stats for your site, be careful about keeping your log files organized. It's a pain in the neck, but worth it - any gap in your data can screw up your reports, and once it's lost it's lost.
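One way to catch a gap before it ruins a report is to list the days your archive covers and look for holes. Here is a minimal sketch, assuming you can already produce one date per archived log file; the function name and the sample dates are made up for illustration.

```python
from datetime import date, timedelta

def missing_days(covered, start, end):
    """Return every date from start to end (inclusive) that is not in covered."""
    have = set(covered)
    gaps = []
    day = start
    while day <= end:
        if day not in have:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps

# Hypothetical archive: one log per day, with 20 July missing.
archived = [date(1999, 7, 18), date(1999, 7, 19), date(1999, 7, 21)]
for gap in missing_days(archived, date(1999, 7, 18), date(1999, 7, 21)):
    print("no log archived for", gap.isoformat())
```

Running something like this whenever the logs are filed away makes a lost day visible while there may still be a chance to recover it from the server.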

The wealth of data in the log files is not readily mined with the naked eye. A raw log file entry looks something like this:

- - [19/Jul/1999:00:00:04 -0600] "GET /studio/drives.html HTTP/1.1" 200 20607 "http://www.webdevelopersjournal.com/studio/hard.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"

As you can see, this entry shows what page was requested, when it was requested, where the visitor came from, and even what browser and OS they were running. As I'm sure you can also see, you won't learn much of interest just by looking at the raw log files. There's page after page of this stuff.
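To show how an analysis tool gets at those fields, here is a rough sketch of a parser for one entry. It assumes the common "combined" log layout (host, two identity fields, timestamp, request, status, size, referrer, browser). The host 192.0.2.1 in the sample is a placeholder I added so the line is complete, and the function and field names are my own.

```python
import re
from datetime import datetime

# One combined-format entry: host ident user [time] "request" status size "referrer" "agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def parse_line(line):
    """Return a dict of fields for one log entry, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # The server writes "-" when no bytes were sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    parts = entry["request"].split()
    entry["method"] = parts[0] if parts else ""
    entry["path"] = parts[1] if len(parts) > 1 else ""
    return entry

# Placeholder host; the rest is the entry shown above.
sample = (
    '192.0.2.1 - - [19/Jul/1999:00:00:04 -0600] '
    '"GET /studio/drives.html HTTP/1.1" 200 20607 '
    '"http://www.webdevelopersjournal.com/studio/hard.html" '
    '"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"'
)
entry = parse_line(sample)
print(entry["path"], entry["status"], entry["referrer"])
```

Real logs contain oddities this pattern does not handle (quotes inside the browser string, for one), so treat it as a starting point rather than a replacement for a proper analysis package.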

To get the most out of the data, you need to be able to see totals for the whole site, and compare the figures over time. That's where a log analysis software package comes in. These handy tools range from Getstats (a free Unix program that can run on your Web server) to various cheap shareware options, to industrial-strength packages like Marketwave Hit List Pro 4.0 ($395 list) or WebTrends Log Analyzer 4.52 ($399 list).

Basic tools like Getstats can give you almost as much information as the pricey packages, but customization options are limited, and results are presented in plain text format. If you want pretty pictures and graphs for the marketing department, you'll need something like Hit List or WebTrends. For a comparative review of these two packages, see a review from Web Developers Journal.


Mining that Data

Let's assume that you've generated a nice long report, using the log analysis tool of your choice. We'll go through each section of a typical traffic report, and see how many traffic-building brainstorms we can come up with.

Most Requested Pages

This section is a gold mine. Here you can see which of your pages are bringing in the most hits. Most sites will expect to see their home page at the top, followed by the home or "hub" pages of the major sections of your site.

Most log analysis programs can filter out user-specified file types for this report, and many people set things up to show only .html or .htm files, to avoid cluttering the report up with image files, scripts and so forth. Sometimes it's useful to see what's going on with those other types of files, however. For example, a WebTrends report can include a list of the most submitted forms and scripts. Looking at this info for my site, I realized that there was an old CGI script, which we thought we had retired, still linked somewhere and churning out a few impressions a day.
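The filtering described above is simple enough to sketch. This assumes you already have the requested paths pulled out of the access log; the function name, the choice to count directory requests such as "/" as pages, and the sample paths are all my own assumptions.

```python
import posixpath
from collections import Counter
from urllib.parse import urlsplit

PAGE_EXTENSIONS = {".html", ".htm"}

def top_pages(paths, limit=10, extensions=PAGE_EXTENSIONS):
    """Count requests per page, ignoring images, scripts and other file types."""
    counts = Counter()
    for raw in paths:
        path = urlsplit(raw).path  # drop any ?query string
        ext = posixpath.splitext(path)[1].lower()
        # Directory requests like "/" or "/studio/" are served as pages too.
        if ext in extensions or path.endswith("/"):
            counts[path] += 1
    return counts.most_common(limit)

# Made-up requests for illustration.
requests = [
    "/",
    "/studio/drives.html",
    "/studio/drives.html",
    "/images/logo.gif",
    "/cgi-bin/old.cgi",
    "/studio/drives.html?print=1",
]
for path, hits in top_pages(requests):
    print(hits, path)
```

Passing a different set of extensions (say, {".cgi"}) turns the same function into the "most submitted scripts" view that exposed my forgotten CGI script.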

Looking at the relative amounts of traffic that certain pages (or directories) get, you can get an idea of what people are using your site for. Are most of your visitors coming for product support? For product information? Which products are they asking for information about? It's tempting to conclude that pages or sections that get more traffic are the ones that visitors like more, but things aren't that simple.

The amount of traffic a particular page or section gets may have more to do with the number and prominence of links that lead to it than with how much visitors like it. For example, looking at the report for my site, The Web Developer's Journal, I can see that the top five pages each generate a big percentage of the total impressions. Now, I happen to know that all five of these pages are ones that have earned a listing in Yahoo. That's why they get lots of hits - there are plenty of other pages that readers would probably like just as much if they were listed, too.

There's one pearl of wisdom for you, in case you didn't already know: the big directories generate lots of traffic, and Yahoo is the biggest of the big. If you don't already have a listing in Yahoo, it's well worth your time to try to get listed.

The prominence of links on your site obviously has a lot to do with the amount of traffic an interior page will get. A page that's linked from your navbar on every page, or that has a big prominent link on the home page, will get a lot more action than one that's just referenced by a hot link in some text somewhere.

Keep in mind that the popularity of an individual page can be interpreted in two opposite ways. On the one hand, if a certain type of content is generating more hits than other types, then it makes sense to go with a winner - push that type of content even more, and create new stuff in the same vein. On the other hand, if a section isn't getting much traffic, and you think it should, it might be time to push it a little more, by giving it more prominent internal links, and perhaps submitting it individually to directories and such.

Some programs can report not only the most popular pages, but things like:

Least popular pages - Remember, this doesn't mean people didn't like those pages, it means few people saw them in the first place. If this list contains pages of limited importance, that you wouldn't expect many people to visit anyway, then you're looking good. If you see important content pages on this list, you may want to take steps to steer more people toward them.

Top entry pages - Useful for finding out which pages are linked from other sites, search engines, etc.

Top exit pages - Keeping visitors on your site as long as possible is a worthy goal, especially if you're an ad-supported site. Analyzing pages that tend to make people want to leave may help you figure out how to make them a little stickier.

Single access pages - Pages that visitors access and exit without viewing any other page. These pages are entry pages to your site, but for some reason they don't entice very many people to visit other sections. If you can figure out why not, you may be able to boost your overall traffic by a hair.
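Entry, exit and single access pages all depend on grouping hits into visits, which the raw log doesn't do for you. Here is a rough sketch of one common approach: treat hits from the same host as one visit until there has been a 30-minute pause. The timeout, the function names and the sample hits are assumptions of mine, and a host is only a crude stand-in for a person, since many people can share one proxy or cache.

```python
from collections import Counter
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed idle time that ends a visit

def split_visits(hits, gap=SESSION_GAP):
    """hits: iterable of (host, time, path). Returns a list of visits (lists of paths)."""
    by_host = {}
    for host, when, path in sorted(hits, key=lambda h: (h[0], h[1])):
        visits = by_host.setdefault(host, [])
        if visits and when - visits[-1]["last"] <= gap:
            visits[-1]["paths"].append(path)   # same visit continues
        else:
            visits.append({"paths": [path]})   # first hit, or a long pause
        visits[-1]["last"] = when
    return [v["paths"] for host_visits in by_host.values() for v in host_visits]

def visit_report(visits):
    """Return Counters of entry pages, exit pages and single access pages."""
    entry = Counter(v[0] for v in visits)
    exit_pages = Counter(v[-1] for v in visits)
    single = Counter(v[0] for v in visits if len(v) == 1)
    return entry, exit_pages, single

# Made-up hits for illustration.
t0 = datetime(1999, 7, 19, 0, 0, 0)
hits = [
    ("a.example.com", t0, "/"),
    ("a.example.com", t0 + timedelta(minutes=5), "/studio/drives.html"),
    ("b.example.com", t0, "/studio/hard.html"),
    ("a.example.com", t0 + timedelta(hours=2), "/studio/hard.html"),
]
entry, exit_pages, single = visit_report(split_visits(hits))
print(entry.most_common(), exit_pages.most_common(), single.most_common())
```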

The most useful insights come when you start combining the individual page traffic with other data. For example, let's say you're selling products on your site. Your traffic report tells you that your home page is getting healthy traffic, but your ordering page is getting only a tiny percentage of that. You replace the small text link to your ordering page with a big bright yellow image that flashes on and off, and says "Click here NOW! Or else!" A month later, you run another report, and find that traffic to your ordering page has skyrocketed.

When you compare the traffic data for that page with your actual orders, however, you find that only a tiny fraction of the people who visited the order page actually placed an order. That is, your "conversion rate" (the percentage of visitors converted into customers) is low. This indicates that you should direct your efforts toward making the order page better before you spend time trying to build more traffic to it.
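The arithmetic is trivial, but it is worth writing down because the two numbers come from different places: page visits from the log report, orders from your order system. The figures below are invented purely to illustrate the flashing-button scenario.

```python
def conversion_rate(orders, order_page_visits):
    """Percentage of order-page visits that ended in an order."""
    if order_page_visits == 0:
        return 0.0
    return 100.0 * orders / order_page_visits

# Invented figures: before and after the big yellow button.
before = conversion_rate(orders=10, order_page_visits=200)
after = conversion_rate(orders=20, order_page_visits=2000)
print(f"before: {before:.1f}%  after: {after:.1f}%")
```

In this made-up case traffic to the order page went up tenfold while orders only doubled, so the conversion rate fell from 5% to 1%. That is the signal to work on the order page itself.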

Often tiny changes in page layout or wording can make a big difference in visitor behavior. Keep careful track of any modifications you make to your pages, then compare the dates of those changes to the dates of any increases or decreases in traffic, as revealed by your log reports.


Insights from your Error Logs

Web site errors mean lost traffic. Fixing dead links and malfunctioning scripts may not sound like marketing work, but every visitor who fails to find what they're looking for, or leaves in disgust because your whizzy features don't work right, is a penny out of your pocket. A log analysis program can give you a breakdown of the different types of errors that occur.

To err is human, and humans built the Web, so even the most tightly-run Web site will log a few errors. If the percentage of requests resulting in errors seems unusually high, however, there may be problems that need to be fixed.

If you're getting a lot of 404s (page not found errors), then you may have some bad links on your site. You can find and fix them with a program like LinkBot, or with the Link Analysis cartridge which is included with the more expensive versions of WebTrends. Lots of script errors? Perhaps you have scripts somewhere that are malfunctioning. If so, track them down and fix them to convert those errors into impressions. A lot of script errors may also indicate that many of your visitors have older browsers that can't handle the newfangled stuff. Either figure out how to hide the scripts from the older browsers, or get rid of 'em.
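If your analysis tool doesn't break errors down this way, the idea can be sketched in a few lines. This assumes each log entry has already been reduced to a status code, a requested path and a referrer. The function name and sample entries are hypothetical, and I am counting every 4xx and 5xx status as an error, which is my own simplification.

```python
from collections import Counter, defaultdict

def error_summary(entries):
    """entries: iterable of (status, path, referrer).

    Returns (error percentage, errors per status code,
    and for each missing page a count of the pages that linked to it).
    """
    total = 0
    by_status = Counter()
    broken = defaultdict(Counter)
    for status, path, referrer in entries:
        total += 1
        if status >= 400:
            by_status[status] += 1
        if status == 404:
            broken[path][referrer or "-"] += 1
    error_pct = 100.0 * sum(by_status.values()) / total if total else 0.0
    return error_pct, by_status, broken

# Made-up entries for illustration.
entries = [
    (200, "/", "-"),
    (200, "/studio/drives.html", "http://www.example.com/"),
    (404, "/old.html", "http://www.example.com/links.html"),
    (404, "/old.html", "http://www.example.com/links.html"),
    (500, "/cgi-bin/form.cgi", "-"),
]
error_pct, by_status, broken = error_summary(entries)
for path, sources in broken.items():
    for source, count in sources.most_common():
        print(f"{count} x 404 for {path}, linked from {source}")
```

The referrer is the useful part: it tells you whether the dead link is on your own site, where you can fix it, or on somebody else's, where you can ask them to.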

Most Active Countries

If your Web site is aimed solely at US visitors, then you don't want traffic from other countries - it just ties up your server with visits from people who won't be buying anything. If your Web site has international appeal, however, then the more the merrier.

For US sites, the US will represent the lion's share of your traffic, with the UK, Canada, and Australia following. The next ones will usually be Germany and Sweden, followed by the other affluent Anglophiles of Western Europe.

The relative amounts of traffic from each country can tell you several things. If the percentage of your traffic that is from outside the US is tiny, then you may be able to realize a substantial overall boost by increasing your international traffic. Submit your site to international search engines and directories - all the major search engines have regional versions that you can submit to, and there are also various specialized search engines for particular regions, such as Euroseek.

If a particular country or region places high in the list of most active countries (out of the usual order listed above), then your content would seem to appeal to visitors from that region. You might try to capitalize on that by adding more regional content, or even translating some of your pages into another language. For example, the Latin American countries usually place far below the major English-speaking and Western European countries. If Mexico and Venezuela are placing just as high as Sweden and Holland, then it would appear that your site has South-of-the-border appeal. You might try to build on that by blitzing the Spanish-language search engines, or even adding a special Latin American page. On the other hand, as discussed above, you might take that as evidence that you have the Latin world covered, and concentrate on the Germanic world instead.

Where are they coming from?


For most sites, the largest single source of traffic will be AOL. For the Web Developer's Journal (which is aimed at intermediate-to-expert computer users), it's about 4%. Since AOL has a consumer slant, one usually assumes that visitors coming from AOL are more likely to be computer beginners than experts. The more your site appeals to the less computer-savvy, the higher a percentage of AOL hits you should see. If the thought of all those beginning Web surfers makes you see dollar signs, but your AOL traffic is in the single digits, then here's an area to work on. Make sure you're listed in AOL's directory, AOL Netfind, and consider adding content that will make AOLers feel welcome (a big flashing logo that says "Welcome AOL users?" - I don't know).

Behind AOL you'll see some of the other mega-ISPs, such as Uunet, Mindspring and Netcom. Here's a tip for you: Our traffic from Time Warner's Road Runner service has gone from zero to about 1.2% in a year or so. I don't know what that means for building traffic, but it sounds like a stock to buy!

Top Users

If your site traffic is high, then any entity that makes it onto the list of top users is unlikely to be a human. Most of these will be either a spider or a cache. A spider is an automated program that visits your site for the purpose of indexing it for a search engine. Obviously, spiders are welcome, but they won't be buying anything (a bit like journalists, actually), so it's interesting to get a rough idea of what percentage of your traffic is being "lost" to spiders.
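For a rough idea of that percentage, you can count hits per host and flag the ones whose browser string looks like a robot. The sketch below is only a heuristic: the list of telltale substrings is my own guess, spiders are free to identify themselves however they like, and the sample hosts and browser strings are invented.

```python
from collections import Counter

ROBOT_HINTS = ("bot", "crawler", "spider", "slurp")  # assumed telltale substrings

def looks_like_robot(agent):
    """Guess whether a browser string belongs to a spider."""
    agent = agent.lower()
    return any(hint in agent for hint in ROBOT_HINTS)

def top_users(hits, limit=10):
    """hits: iterable of (host, user_agent).

    Returns (most active hosts, percentage of hits that look like robots).
    """
    per_host = Counter()
    robot_hits = 0
    total = 0
    for host, agent in hits:
        total += 1
        per_host[host] += 1
        if looks_like_robot(agent):
            robot_hits += 1
    robot_share = 100.0 * robot_hits / total if total else 0.0
    return per_host.most_common(limit), robot_share

# Invented hits for illustration.
hits = [("crawler.example.net", "ExampleBot/1.0")] * 3 + [
    ("dialup1.example.com", "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"),
]
busiest, robot_share = top_users(hits)
print(busiest, f"{robot_share:.0f}% of hits look like spiders")
```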

Some large ISPs "cache" frequently-requested pages (that is, store them on their local machines instead of retrieving them from the Web each time they're requested), in order to save bandwidth. Caching is an important issue if you run an advertising-supported site, because page impressions delivered from a server cache will not be counted by your ad rotation software, and thus you can't bill for them. (Actually, some of the higher-end packages try to compensate for this in various ways. See my comparative review of ad-management packages.) It's important to compare the traffic stats that your ad-rotation package generates to the stats from the server log files. The former should be lower by approximately the amount of impressions being cached. If the discrepancy is substantially larger than this, there may be some technical problem that you'll want to find and fix.

Visitor Browsers and Operating Systems

The more advanced analysis programs can give you a breakdown of your visitors by browser version and/or OS. Some Web servers keep this information in a separate log file, called an "agent log," just as referring URLs may be kept in a separate "referrer log" (traditionally misspelled "referer").

Looking at your visitors' browser and OS versions can give you a rough idea of how computer-savvy they are. Advanced users tend to have the latest browsers, and are more likely to use NT or Unix. If a large percentage of your visitors are using old browsers, and a large percentage are coming from AOL (see above), then you may assume that many of your visitors are newbies. Of course, this is not absolutely true, as many employees of large companies have no choice but to use whatever browser and OS version their IT department has decided to "standardize on."

As a rough guideline, however, this has two implications. First, if a large percentage of your visitors are using older browsers, then take it easy with advanced design techniques like Style Sheets, Java and JavaScript. Browser versions earlier than 4.0 often choke on the latest doodads, even if you have a script to detect the browser type. Second, if you can get an idea of how tech-savvy your audience is, you'll know what kind of content to concentrate on.
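To put a number on "a large percentage," you can sort the browser strings from the log into rough families and count how many fall below version 4. The rules below are crude guesses of mine that suit browser strings of this era only (Internet Explorer announces itself with "MSIE", Netscape with a bare "Mozilla/" version); the function names and sample strings are illustrative.

```python
import re
from collections import Counter

def classify_agent(agent):
    """Return (browser family, major version) using crude pattern matching."""
    match = re.search(r"MSIE (\d+)", agent)
    if match:
        return ("MSIE", int(match.group(1)))
    match = re.match(r"Mozilla/(\d+)", agent)
    if match and "compatible" not in agent:
        return ("Netscape", int(match.group(1)))
    return ("Other", 0)

def share_below(agents, major=4):
    """Return (counts per browser/version, percentage of hits below `major`)."""
    counts = Counter(classify_agent(a) for a in agents)
    total = sum(counts.values())
    old = sum(n for (name, version), n in counts.items()
              if name != "Other" and version < major)
    return counts, (100.0 * old / total if total else 0.0)

# Sample browser strings for illustration.
agents = [
    "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)",
    "Mozilla/3.01 (Win95; I)",
    "Mozilla/4.61 [en] (WinNT; I)",
    "Lynx/2.8.1",
]
counts, old_share = share_below(agents)
print(counts.most_common(), f"{old_share:.0f}% older than version 4")
```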

Who's sending them?


This is one of the most valuable sections of all. The first step in increasing your traffic is knowing where your existing traffic comes from. The two biggest sources of traffic for most sites are first, the search engines and directories, and second, links to your site from other related sites.
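The raw material for this section is the referrer field of each log entry. A minimal sketch of the tally, assuming you have the referrer URLs in a list and know your own host name; the function name and sample URLs are made up, and links from within your own site are skipped so they don't drown out the outside sources.

```python
from collections import Counter
from urllib.parse import urlsplit

def top_referrers(referrers, own_host, limit=10):
    """Count hits per referring host, ignoring blanks and internal links."""
    counts = Counter()
    for ref in referrers:
        host = (urlsplit(ref).hostname or "").lower()
        if not host or host == own_host.lower():
            continue  # "-" means no referrer; own_host means an internal link
        counts[host] += 1
    return counts.most_common(limit)

# Made-up referrers for illustration.
referrers = [
    "http://www.yahoo.com/Computers/",
    "http://www.yahoo.com/Computers/",
    "http://search.example.com/find?q=web+logs",
    "http://www.webdevelopersjournal.com/studio/hard.html",
    "-",
]
for host, hits in top_referrers(referrers, "www.webdevelopersjournal.com"):
    print(hits, host)
```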

Those sites fortunate enough to be listed in Yahoo will usually find that it is the top source of hits. If you ain't listed, get on the ball. Yahoo is selective about what sites they list, and carefully and patiently presenting your case to them is much more likely to yield results than a massive Spam assault. There are some very good tips on getting listed at SelfPromotion.com, and also in the little-known section of Yahoo called "How to Suggest Your Site". If you are listed in Yahoo, but are getting only a small percentage of your traffic from it, then you may want to try to modify your listing, perhaps moving it to a better category, or making the site description more enticing.

After Yahoo, you should see the top search engines listed (Excite, Altavista, Lycos, Infoseek). There are a couple of places on the Web where you can find the latest stats on the relative amount of traffic each of these has (see Search Engine Sizes). The order in which they appear on your traffic report should roughly correspond to this. If not, then a problem with one or more of the search engines may be indicated. For example, if the word on the street is that Altavista is currently the most popular engine, but it's sending you less traffic than Lycos and Excite, then your rankings on Altavista show room for improvement.

Getting listed on the various search engines is only half the battle. You also want to try to make your site come up as high as possible on the list of results for your favorite search terms. Everyone knows to use plenty of keywords in your page titles, META tags and body text (and even filenames and ALT attributes), but every engine uses slightly different criteria for ranking sites.

The art of maximizing rankings in the various engines is an arcane one. Some folks go so far as to build individual "gateway" pages, each one optimized for a particular engine. Others are parted from their money by sharpies who claim to have "secret formulas" for getting top billing. At the WDJ, we type the URL of the desired engine on rice paper, using a manual typewriter. Then we burn the paper, while our staff all link hands and dance in a ring around it, chanting and drinking a lot of wine. Snake oil and false prophets abound, so think twice before committing money or time to any scheme for improving your rankings. You'll find some common sense ideas for bettering your rankings at Search Engine Watch.

The advanced log analysis tools can tell you not only which search engines are sending you traffic, but also what keywords people are searching on to find you. This section can offer rich pickings. Are all these keywords included in your META tags? Are there certain keywords that you think should be yielding lots of hits, but aren't? What might you do about it? Also, by comparing the same keyword across different search engines, you can get some ideas about the differences in their ranking algorithms.
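The keywords come from the referrer too: when a visitor clicks through from a results page, the search phrase is usually sitting in that page's URL. Here is a sketch of how a tool might dig it out. The parameter names it looks for are guesses of mine, since each engine uses its own, and the sample URLs and host names are invented.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

QUERY_PARAMS = ("q", "p", "query", "search")  # assumed parameter names

def search_terms(referrers):
    """Count (search host, phrase) pairs found in referrer URLs."""
    counts = Counter()
    for ref in referrers:
        parts = urlsplit(ref)
        params = parse_qs(parts.query)  # decodes "+" and %20 into spaces
        for name in QUERY_PARAMS:
            if name in params:
                phrase = params[name][0].strip().lower()
                if phrase:
                    counts[(parts.hostname or "", phrase)] += 1
                break
    return counts

# Invented referrers for illustration.
referrers = [
    "http://search-one.example.com/find?q=Web+logs",
    "http://search-two.example.com/results?query=log%20analysis",
    "http://search-one.example.com/find?q=web+logs",
    "http://www.example.org/links.html",
]
for (host, phrase), hits in search_terms(referrers).most_common():
    print(hits, host, repr(phrase))
```

Keeping the host alongside the phrase is what lets you compare the same keyword across different engines.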

After the search engines, you'll find listings for the various sites that have links to yours (You have been politely asking for links from related sites, haven't you?). The sites near the top of your list are your buddies, so treat them well. Make sure they always have the latest info about your site, and do them favors if you can.

As I've emphasized throughout this article, a high or a low ranking here can be interpreted in either a positive or a negative way. Not only should you try to reward those sites that send you lots of hits, you should turn your attention to sites that seem as if they should be sending you more than they do.

For example, let's say you sell a software package, which is reviewed in two online magazines. One of them sends you lots of traffic, the other a mere trickle. Why? Could you bribe the Webmaster of the second magazine to give you a more prominent link? Could you offer to break his or her leg if they don't give you a more positive review? Could it be that their magazine just doesn't get much traffic anyway, so you needn't waste your time with them?

By following links to pages that have links to you, you can often turn up scads of other promising places to solicit links. If a "links page" has a link to me, they may also have links to other links pages that should also link to me, and those links pages probably have links to other links pages... and on and on until the sun comes up.

This leads to one of the most important points of this article. There are a trillion things you could do to improve your traffic, but you can't possibly do them all. Careful analysis of your log files over time can tell you what yields results and what doesn't. Use your imagination, but above all, use your judgment. Spend the bulk of your time on areas that yield the bulk of your hits.

So, happy hit-hunting, Webmasters! If you've found another way to glean wisdom from your log files, drop me a line and let me know. Better yet, visit my Web site, and rack up a few page impressions while you search for my email address!


This article was originally published on Monday Aug 2nd 1999