The Search Marketing Advisor Newsletter Article: October 2004, Volume 3, Issue 6
Website Traffic Statistics Part 2: Defining Industry Terminology
by Dave Greten, Algorithmic Search Analyst, iProspect
In last month’s edition of the Search Marketing Advisor we defined the commonly used statistics (hit, Web client, page view, visitor, and cookies) on which websites depend. Now that we have the terminology defined, we can discuss the problem of measuring Web traffic with 100% accuracy.
There are two principal ways to record traffic: server side tracking and client side tracking. Server side tracking uses data files recorded by the computer that hosts the Web pages. Every time a request for a page comes in to that computer from a visitor, it is recorded in a server side log.
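To make the idea of a server side log concrete, here is a minimal sketch of how page views might be counted from one. It assumes the server writes entries in the widely used Common Log Format; the sample entries, the `count_page_views` name, and the rule that only paths ending in ".html" or "/" count as pages are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical sample entries in Common Log Format (invented for illustration).
SAMPLE_LOG = """\
192.0.2.1 - - [01/Oct/2004:10:00:00 -0400] "GET /index.html HTTP/1.1" 200 5120
192.0.2.1 - - [01/Oct/2004:10:00:01 -0400] "GET /logo.gif HTTP/1.1" 200 2048
192.0.2.7 - - [01/Oct/2004:10:05:30 -0400] "GET /products.html HTTP/1.1" 200 7300
192.0.2.1 - - [01/Oct/2004:10:06:12 -0400] "GET /products.html HTTP/1.1" 200 7300
"""

# host, two ignored fields, [timestamp], "METHOD path protocol", status, size
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def count_page_views(log_text, page_suffixes=(".html", "/")):
    """Count requests per page, skipping images and other non-page files."""
    views = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like log entries
        path = match.group("path")
        if path.endswith(page_suffixes):
            views[path] += 1
    return views

page_views = count_page_views(SAMPLE_LOG)
```

In this sample, all four lines are hits, but only three of them are page views; the request for the logo image is recorded by the server without counting as a page.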
Client side tracking, also known as “page tagging,” uses an image requested from a tracking computer to identify unique visitors. Popular Web analytics packages, such as WebTrends On Demand, use client side tracking. With client side tracking, when a visitor comes to a Web page, a piece of JavaScript embedded on the page requests an image from a tracking computer, “tagging” the user as a “unique visitor” and tracking their activity as they navigate the site. JavaScript is a scripting language designed to allow interactivity within static HTML pages.
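The tracking computer's half of page tagging can be sketched roughly as follows. This is not how any particular analytics package works; it simply assumes the JavaScript tag requests an image whose address carries a visitor ID and the page being viewed. The address `tracker.example.com/tag.gif`, the parameter names `visitor` and `page`, and the `record_tag` function are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

def record_tag(image_url, pages_by_visitor):
    """Read the visitor ID and page out of a tracking-image request."""
    query = parse_qs(urlsplit(image_url).query)
    # parse_qs returns a list of values per parameter; take the first one.
    visitor = query.get("visitor", ["unknown"])[0]
    page = query.get("page", ["unknown"])[0]
    pages_by_visitor.setdefault(visitor, []).append(page)
    return visitor, page

pages_by_visitor = {}
tagged = record_tag(
    "http://tracker.example.com/tag.gif?visitor=abc123&page=%2Findex.html",
    pages_by_visitor,
)
```

Each image request adds one page to that visitor's trail, which is how a tagging system can follow a visitor's path through the site.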
Both server side tracking and client side tracking are valid techniques for recording website traffic, but both have drawbacks.
Whenever a user types in a URL or follows a link, a page request is sent by the browser to the Web server for that specific Web page. Because server side tracking requires a page request to be sent to the server, navigation done solely within the browser’s navigational system (e.g., “back” or “forward”) is not recorded. Another potential problem with server side tracking is that the system records the activities of automated search engine spiders, bots, robots, and crawlers – all of which visit sites – as visitors.
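One common way to reduce the spider problem is to look at the "user agent" string that each client sends with its request and set aside the ones that identify themselves as robots. The sketch below assumes the server log records that string; the short `BOT_MARKERS` list and the sample requests are illustrative only, and a real list would be much longer and would still miss robots that do not identify themselves.

```python
# Illustrative substrings only; not a complete list of robots.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

def is_probable_bot(user_agent):
    """Guess whether a request came from an automated crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

# Hypothetical (address, user agent) pairs pulled from a server side log.
requests = [
    ("192.0.2.1", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
    ("192.0.2.50", "Googlebot/2.1 (+http://www.google.com/bot.html)"),
    ("192.0.2.51", "Mozilla/5.0 (compatible; Yahoo! Slurp)"),
]

human_requests = [r for r in requests if not is_probable_bot(r[1])]
```

Of the three sample requests, only the first would be kept as a likely human visitor.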
Yet another problem with server side tracking is the use of page caching. Page caching is the storage of pages on the visitor’s local computer for a faster user experience. When a page is requested, the client first looks to see if the page exists in the cache. If it does, that page is served to the user and the page request is never sent. While this results in a faster downloading experience for Web surfers, the page view is never recorded on the server side log since a page request was never received.
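The effect of caching on the server side log can be shown with a toy simulation. It assumes, for simplicity, a cache that never expires and never checks back with the server, which is stricter than real browsers behave; the `browse` function and the page names are made up for the example.

```python
def browse(requested_pages):
    """Return the requests that actually reach the server."""
    cache = set()
    server_log = []
    for page in requested_pages:
        if page in cache:
            continue  # served from the local cache; the server sees nothing
        server_log.append(page)  # first request for this page reaches the server
        cache.add(page)
    return server_log

viewed = ["/index.html", "/products.html", "/index.html"]
server_log = browse(viewed)
```

The visitor viewed three pages, but the server side log contains only two entries, because the second view of the home page came from the cache.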
Client side tracking has its own set of problems as well. Because the system is dependent on images and JavaScript to track viewers, JavaScript must be enabled on the client’s browser and the user must be able to accept images for the system to track the user. If JavaScript is not enabled on the user’s browser, his/her visit is never recorded.
Because of these factors, both methods of tracking tend to undercount the number of website visitors. Each system records visitors the other cannot. While client side tracking is more useful to people who are interested in their users’ behavior, server side tracking is more useful to people who are more interested in system performance and making sure their servers can handle the total amount of traffic.
A problem common to both systems is the control that site visitors have in clearing their cookies (see last issue) within the Web client. When a visitor manually clears his/her cookies and revisits a site, that visitor is recorded as a new unique visitor, so one person is now counted twice. And if cookies aren’t manually cleared, they are still scheduled to expire after a certain length of time. This becomes troublesome when someone comes to a site, navigates for a bit, leaves and returns. Should this visitor be counted as two or one? After what length of time? Ten minutes? A day? A month? The current standard is to treat a visit as expired after thirty minutes of inactivity.
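The thirty-minute rule can be sketched as a simple count: walk through one visitor's page requests in time order and start a new visit whenever the gap since the previous request exceeds the timeout. The `count_visits` function and the sample times are assumptions for illustration, not a description of how any specific analytics package implements it.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)

def count_visits(timestamps, timeout=SESSION_TIMEOUT):
    """Count visits for one visitor, splitting on gaps longer than the timeout."""
    visits = 0
    last_seen = None
    for ts in sorted(timestamps):
        if last_seen is None or ts - last_seen > timeout:
            visits += 1  # first request, or the previous visit has expired
        last_seen = ts
    return visits

# Hypothetical page-request times for a single visitor.
request_times = [
    datetime(2004, 10, 1, 10, 0),
    datetime(2004, 10, 1, 10, 10),
    datetime(2004, 10, 1, 10, 50),  # 40 minutes after the previous request
    datetime(2004, 10, 1, 11, 5),
]

visits = count_visits(request_times)
```

With a thirty-minute timeout these four requests make two visits. Shortening the timeout to five minutes would turn the same activity into four visits, which is why the choice of timeout changes the reported numbers.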
As you can see, there are many factors that can complicate the measurement of Web traffic and make reconciling those measurements back to an “absolute” number of visitors to your site an inexact science. Although both methods of tracking are completely valid, outside factors will always contribute to making both of them equally imperfect – but both equally accurate at approximating your actual website traffic.