Hits or misses
You know that computers run the Internet. You know computers are efficient and flawless counting machines. Conclusion: Internet websites must know with exacting accuracy who is visiting them. Right?
Wrong.
Surprising as it may seem, most websites have no idea how many people view their content. This inherent fuzziness is causing problems for commercial websites, especially online publications desperate to make money from Internet advertising.
The challenge: How can you charge for ads when it's nearly impossible to tell advertisers how many people will see them?
Newspaper sites in particular need to know who's visiting and how often. With few able to charge for online subscriptions and still keep their audience, advertising revenue becomes crucial to making the sites financially viable.
In the nation's 13th-largest market, the online version of the Tampa (Fla.) Tribune and WFLA-TV, called TBO.com, uses no fewer than six different methods - from outside surveys to in-house counts - to try to measure its audience. "Each measurement has its frustrations," says Kirk Read, general manager of TBO.com. "No method seems to be exact, so we use a number of different resources.... We're still trying to get our arms around it."
Smaller newspapers with less money to spend on research are even more in the dark.
"You'd think that the Web would be the most easily measured medium because all the interaction is electronic, and we can track the interaction. But it turns out to be not so simple," says Greg Harmon, author of a white paper on counting traffic at newspaper websites for Belden Associates of Dallas, a newspaper research and consulting firm. One of the "holy grails" that websites seek is a way to monitor their audience and understand its demographics without having users register.
How fuzzy are the numbers? Consider the widely used gauge called "unique visitors." That's commonly regarded as the number of different computers that visit a website as measured by that website's log or counting software. But actually, it measures the number of Web browsers that access a site. So if someone uses Explorer to reach a site, then accesses it again from the same computer using Netscape, the website logs two unique visitors.
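To make that concrete, here is a minimal sketch (in Python, with invented log entries) of how a counter keyed on browser identity tallies the Explorer-then-Netscape reader above twice. The field names and log format are assumptions for illustration, not any particular site's counting software.

```python
# Hypothetical sketch: a log-based "unique visitor" count really tallies
# browser identities, not people. Two browsers on one machine look like
# two visitors. All log entries here are invented for illustration.

access_log = [
    {"ip": "203.0.113.7", "user_agent": "Mozilla/4.0 (MSIE 6.0)"},    # Explorer
    {"ip": "203.0.113.7", "user_agent": "Mozilla/5.0 (Netscape 7)"},  # Netscape, same computer
]

# Count each distinct (address, browser) pair as a "unique visitor".
unique_visitors = {(hit["ip"], hit["user_agent"]) for hit in access_log}
print(len(unique_visitors))  # prints 2 -- one person, one computer, counted twice
```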
In any case, websites don't want to know how many computers visited them. They want to know how many people did. Therein lies the rub. For example, if an Internet user visits, say, this newspaper's website (www.csmonitor.com) at 10 a.m. from her work computer, then checks in again at 8 p.m. from home, she'd be counted as two unique visitors. On the basis of its own surveys, Belden estimates that half of the daily users of a newspaper website access it from more than one computer. "That's a lot of double counting," says Mr. Harmon in a phone interview.
That's not the only problem. Computerized programs, often called "spiders" or "webbots," drop by to map the website on behalf of search engines, such as Google. "Spiders can lie about who they are" and when they visit, says Joel Abrams, who's been studying the visitor-counting problem for csmonitor.com. They can get mixed in with real human users and increase unique visitor counts by 10 to 15 percent, Harmon adds.
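A rough sketch of the usual workaround, and its limits: log-analysis tools commonly filter out spiders by matching known robot names in the user-agent string, but a spider that lies about who it is sails through and gets counted as a person. The token list and log lines below are invented for illustration, not a real site's filter.

```python
# Hypothetical sketch: crude spider filtering by user-agent string. Well-behaved
# bots announce themselves ("Googlebot"); a spider that disguises itself as an
# ordinary browser slips through and inflates the "unique visitor" count.
KNOWN_BOT_TOKENS = ("googlebot", "slurp", "spider", "crawler")  # illustrative list

def looks_like_bot(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(token in ua for token in KNOWN_BOT_TOKENS)

hits = [
    "Mozilla/5.0 (Windows NT 5.1)",                     # a real reader
    "Googlebot/2.1 (+http://www.google.com/bot.html)",  # honest spider, filtered out
    "Mozilla/4.0 (compatible; MSIE 6.0)",               # a spider posing as a browser
]

human_hits = [ua for ua in hits if not looks_like_bot(ua)]
print(len(human_hits))  # prints 2 -- the disguised spider still counts as a person
```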
Overall, the Belden study concludes that the number of different people visiting a site may be as little as one-fifth of the number reported in visitor logs.
"The scale of the audience is dramatically smaller," Belden's Harmon says, but that news needn't be seen as all bad. It also means that the "unique visitors" represent a much smaller but also much more loyal group coming to a website much more often than the data originally suggested.
This newspaper recently decided to count as visitors only those who allow their computers to store cookies - small bits of identifying data on their browsers. But even counting cookies presents problems - the most obvious being that, again, computers, not people, get tabulated. And, for privacy reasons, many Web surfers today are either setting their browsers to refuse cookies or are periodically deleting them.
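For illustration, a minimal sketch of cookie-based counting under the assumptions described above: the site hands each new browser a random ID cookie and counts distinct IDs, so a reader who refuses or deletes the cookie is simply issued a fresh one and counted again. The function and identifiers here are hypothetical.

```python
# Hypothetical sketch of cookie-based visitor counting: each new browser gets a
# random ID cookie, and distinct IDs are tallied as "unique visitors". A reader
# who blocks or deletes the cookie gets a fresh ID -- and is counted again.
import uuid

seen_ids = set()

def record_visit(cookie_id=None):
    """Register a visit and return the cookie ID the site would set."""
    if cookie_id is None:             # no cookie sent: new browser, or cookies cleared
        cookie_id = uuid.uuid4().hex  # issue a fresh identifier
    seen_ids.add(cookie_id)
    return cookie_id

first = record_visit()   # first visit: cookie issued, count = 1
record_visit(first)      # return visit with cookie intact: still 1
record_visit()           # same reader after deleting cookies: count = 2
print(len(seen_ids))     # prints 2
```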
Belden's data suggest that 30 to 40 percent of Web surfers delete their cookies as often as once a week. "We're encountering a good deal of disbelief on this" in the industry, Harmon says, adding that the subject of cookie-deleting needs more research.
High-traffic Internet sites often employ outside survey firms, such as Nielsen/NetRatings or comScore Media Metrix, to measure site traffic. Both use huge "panels" (comScore claims to have 1 million panelists) of computer users who, in exchange for incentives, permit the companies to track which websites they visit.
But because many companies won't allow their workplace computers to be used for such purposes, "these panels clearly undersample at-work users," says Richard Gordon, a professor of online journalism at Northwestern University in Evanston, Ill.
Survey firms claim they do accurately assess the at-work market. Either way, observers note, more and more sites are requiring visitors to register before they can view content. Even though visitors don't always give accurate information (one popular answer for ZIP Code is "90210"), registration produces the most accurate aggregate visitor numbers. "All of the major news sites" are moving in that direction, Professor Gordon says.
Prominent newspaper sites that already require registration include the New York Times, the Washington Post, and the Los Angeles Times. TBO.com is also in the process of moving to sitewide registration. Any short-term drop in unique visitors or page views will be compensated for by gaining a clearer picture of who the loyal readers are, Mr. Read says.
Not every potential reader wants to bother with filling out a registration form, which typically asks for the visitor's name, age, gender, address, and other personal information. And the Web's early tradition of allowing visitors to view sites in nearly total anonymity is hard to change. The website www.bugmenot.com, for example, has sprung up to maintain that privacy. It offers a group of communal passwords that visitors can use to access newspaper sites without registering for themselves - or, for that matter, being counted.