Do RSS and AJAX Make Pageviews Obsolete?

Website Metrics Reconsidered

Remember when web site traffic was talked about in terms of "hits"? You'd read about how many millions of hits Netscape got per month and other sites bragged about getting 30,000 hits a day.

Eventually, we moved away from the term "hit" because everyone realized it was pretty meaningless. You see, a hit was often counted (depending on who was doing the counting) not just for a page load, but for every element (e.g., graphic) included on the page, as well. One visit to this page, for example, would be worth about 40 hits (if the browser had images turned on). But a site that was less graphical and had equal usage would register half the hits.
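
To make the difference concrete, here is a minimal Python sketch that counts hits and pageviews from a list of requested paths. The sample paths, and the rule that anything ending in an image, stylesheet, or script extension is a page element rather than a page, are assumptions for illustration; they are not how any particular log analyzer worked.

```python
from pathlib import PurePosixPath

# Hypothetical set of extensions treated as page elements rather than pages.
ASSET_EXTENSIONS = {".gif", ".jpg", ".png", ".css", ".js"}

def count_hits_and_pageviews(paths):
    """Return (hits, pageviews) for a list of requested URL paths."""
    hits = len(paths)  # every request counts as a hit
    pageviews = sum(
        1 for p in paths
        if PurePosixPath(p).suffix.lower() not in ASSET_EXTENSIONS
    )
    return hits, pageviews

# One visit to a graphics-heavy page: the page itself plus 39 images.
graphical_visit = ["/article.html"] + [f"/img/{i}.gif" for i in range(39)]
# One visit to a lighter page with equal usage: the page plus 19 images.
plain_visit = ["/article.html"] + [f"/img/{i}.gif" for i in range(19)]

print(count_hits_and_pageviews(graphical_visit))  # (40, 1)
print(count_hits_and_pageviews(plain_visit))      # (20, 1)
```

Both visits are one pageview each, but the hit counts differ by a factor of two purely because of page design.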

Pageviews replaced hits as the primary traffic metric - not just because they're more meaningful, but because they also determined how many ads could be served. Ads were sold primarily on a CPM basis, so multiply your CPM by every 1,000 pageviews you got, and that's your dot-com revenue.
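
The CPM arithmetic can be sketched in a few lines of Python. The traffic figure, the $5 CPM, and the `ads_per_page` parameter are hypothetical values chosen for illustration, not numbers from any real site.

```python
def cpm_revenue(pageviews, cpm_dollars, ads_per_page=1):
    """Ad revenue when ads are sold per thousand impressions (CPM).

    ads_per_page is an assumed simplification: every pageview serves
    the same number of ads, all sold at the same CPM.
    """
    impressions = pageviews * ads_per_page
    return impressions / 1000 * cpm_dollars

# Hypothetical figures: 3 million pageviews a month at a $5 CPM.
print(cpm_revenue(3_000_000, 5.0))  # 15000.0
```

Under this model, anything that inflates pageviews inflates revenue, whether or not actual usage changed.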

Reach (number of unique visitors) is also important, of course. comScore/Media Metrix uses uniques as its primary metric, because mainstream advertisers want to reach a lot of people, not just the same people over and over. You can also get pageviews, time spent, and several other data points from Media Metrix, but if you're the number one site on MM, it's because you have the most unique visitors for the month. Of course, if uniques were all that mattered, Blogger.com would be considered as big as MySpace by some accounts:

Whereas, if you look at pageviews, MySpace dominates:

That's why Alexa Rank is a combination of Reach + Pageviews, so you get something like this:

But it's this pageviews part that I think needs to be more seriously questioned. (This is not an argument that Blogger is as popular as MySpace—it's not.) Pageview counts are as susceptible as hit counts to site design decisions that have nothing to do with actual usage. As Mike Davidson brilliantly analyzed in April, part of the reason MySpace drives such an amazing number of pageviews is because their site design is so terrible.

As Mike writes: "Here's a sobering thought: If the operators of MySpace cleaned up the site and followed modern interface and web application principles tomorrow, here's what the graph would look like:"

Mike assumes a certain amount of Ajax would be involved in this more-modern MySpace interface, which is part of the reason for the pageview drop. And, as the Kiko guys wrote in their eBay posting, their pageview numbers were misleading because the site was built with Ajax. (Note: It's really easy to track Ajax actions in Google Analytics for your own edification.)

But Ajax is only part of the reason pageviews are obsolete. Another one is RSS. About half the readers of this blog read it via RSS. I can know how many subscribers I have to my feed, thanks to Feedburner. And I can know how many times my feed is downloaded, if I want to dig into my server logs. But I don't get to count pageviews for every view in Google Reader or Bloglines or LiveJournal or anywhere else I'm syndicated.

Another reason: Widgets. The web is becoming increasingly widgetized - little bits of functionality from one site are displayed on many others. The purveyors of a widget can track how many times their JavaScript or Flash file is loaded elsewhere - but what does that mean? If you get a widget loaded in a sidebar of a blog without anyone paying attention to it, that's not worth anything. But if you're YouTube, and someone's watching a whole video and perhaps even an ad you're getting paid for, that's something else entirely. But is it a pageview?

Pageviews were never a great measure of popularity. A simple JavaScript form validation can easily cut down on pageviews (and save users time), while a useless frameset can pump up your numbers. But with the proliferation of Ajax, RSS, and widgets, pageviews are even more silly to pay much attention to—even as we're all obsessed with them.

It's about Time

So what's a better measurement? Good question. Like many good questions, the answer is "it depends." If you're talking about what's important to pay attention to on your own site, you have to determine what your primary success criteria are and measure that as best you can. For some sites, that could be subscribers, or paying users, or revenue, or widgets deployed, or files uploaded, or what have you. It may even be pageviews.

At Blogger, we determined that our most critical metric was number of posts. An increase in posts meant that people were not just creating blogs, but updating them, and more posts would drive more readership, which would drive more users, which would drive more posts. Of course, posts weren't our only measure, because someone could have written an automated posting script to fill up our database (which some did), and by that metric, we'd have been happy about it. So we paid attention to pageviews and posts per user and user drop-off, and other things.

Of course, we all want to know how we're doing compared to other people/sites/companies, so internal metrics aren't enough. And investors, advertisers, acquirers, and the press pay attention to things like Media Metrix and Alexa. So some apples-to-apples comparison is useful. If I had to pick one, in addition to unique visitors, I'd say time spent would be much more useful than pageviews.

After all, everyone's competing for a bigger share of the one scarce resource, which is people's attention (although it is a growing resource, because people keep making babies, and those babies keep getting Internet connections). More or less, you need people's attention before you can meet whatever goals you have.

Time spent interacting with a site is a much better basis on which to compare sites' relative ability to capture attention/value than pageviews are. When it comes to media like audio or video, an increasing percentage of web consumption, time obviously means a great deal more than a pageview.

However, time is a bit harder to measure. HTTP, being stateless, doesn't actually have a concept of time spent. If you read this whole post and then click off to another site, my web server won't know whether you were here for five minutes or five seconds. However, most web analytics packages do estimate time spent, as does Media Metrix. (The Alexa toolbar could actually measure it even better.)
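
One common way log-based tools approximate time spent, sketched here in Python under stated assumptions: sort a visitor's request timestamps, sum the gaps between consecutive requests, and treat any gap longer than a timeout as the start of a new visit. The 30-minute timeout and the sample timestamps are illustrative assumptions, not the method of any specific analytics package.

```python
SESSION_TIMEOUT = 30 * 60  # seconds; an assumed cutoff between visits

def estimate_time_spent(timestamps, timeout=SESSION_TIMEOUT):
    """Estimate total seconds on site from one visitor's request times.

    Gaps between consecutive requests count as time spent, unless a gap
    exceeds the timeout, in which case a new visit is assumed to start.
    The last request of each visit adds nothing, because the server
    never sees when the visitor leaves.
    """
    ordered = sorted(timestamps)
    total = 0
    for previous, current in zip(ordered, ordered[1:]):
        gap = current - previous
        if gap <= timeout:
            total += gap
    return total

# Hypothetical visitor: three pageviews in one visit, then one much later.
requests = [0, 120, 420, 10_000]
print(estimate_time_spent(requests))  # 420

# A single pageview, whether it was read for five seconds or five minutes:
print(estimate_time_spent([0]))  # 0
```

The second call shows the blind spot described above: a visitor who reads one page and leaves registers no measurable time at all.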

Widgets are still a bit tricky, because a user may or may not be paying any attention to a widget that's on a page they're viewing. If you could measure time spent interacting with a widget (or media being streamed through the widget), that would be ideal. RSS consumption is harder to measure by time, but there are other efforts to measure attention in that realm.

Finally, there's a big argument against time as a measure: People don't spend much time on Google search, because it gives them what they want so fast, and they go away. Which is obviously good for Google and for users. Of course, Google doesn't drive many pageviews per visit either, but it's so good people return again and again. So aggregate time is probably still high. But just as pageviews can be gamed, you can slow your users down unnecessarily (or accidentally because your servers are too slow) and increase time spent. In the long run, this is going to be bad for you, but it would screw up a market that paid too much attention to time spent, just as much as BS pageview counts do now.

In summary, there's no easy solution. There's a big opportunity (though very tough job) for someone to come up with a meaningful metric that weighs a bunch of factors. But no matter what, there will come a time when no one who wants to be taken seriously will talk about their web traffic in terms of "pageviews" any more than one would brag about their "hits."

This essay appeared first August 25, 2006, on Evan Williams' own website, www.evhead.com, which has been up since 1996 and in blog form since 1999. It appears in the "i-Technology Viewpoint" series with the author's permission. Copyright © Evan Williams, 2006  [The SYS-CON Media copyright tag on this page (below) does not apply and is invalid in this case, appearing merely as a default part of the frame of this page. SYS-CON Editorial.]

Evan Williams is CEO of Odeo, Inc., a startup based in San Francisco, where he has lived since 1998. Previously, he was co-founder and CEO of Pyra Labs, makers of Blogger, now part of Google, where he worked most of 2003-04. Originally from the cornfields of Nebraska, he currently lives in San Francisco with his fiancée (and their cat). You can find more about him on the web - but, says Ev, "don't believe everything you read!" His blog can be found at www.evhead.com.

Most Recent Comments
Greg Deckler 09/19/06 01:01:13 PM EDT

The ad and unique users are what is important, not the pageview or any other metric. Hits are still useful to administrators for determining and scaling server load. Use the webit instead.

n d 09/07/06 08:36:00 PM EDT

Pageview counts are as susceptible as hit counts to site design decisions that have nothing to do with actual usage. That, argues Evan Williams, is part of the reason MySpace drives such an amazing number of pageviews: it's because their site design is so terrible. So what's a better measurement?

Don Babcock 09/06/06 09:45:21 AM EDT

The article defeats its own title. It seems to clearly demonstrate that pageviews are a poor general metric regardless of RSS and/or AJAX.

But it seems also to set up a straw man of sorts. It suggests that there is yet some "holy grail" in the form of an abstract metric when the very idea of an abstract metric is a contradiction in terms. Supposedly, you could derive some magical "score" from a formula that combined a number of different factors and coefficients in such a way that sites could then be compared in terms of the metric. This is needless complexity in a quest for bragging rights in the sense of "mine's better than yours because it scores higher on the so-and-so scale." Managers continue to fall for such nonsense measures (like Microsoft Certified Professional, et al.). The more perfect a measure is in the general case, the less usable it is in the specific case. But it is almost always the specific case in which the real questions are framed. So if I'm interested in reach for questions of advertising efficacy there are ways to measure that. If I'm interested in site "participation" there are ways to measure that. But let's abandon the idea that there is some sort of all around metric that we can derive that will be of any real use. It comes down to the same kinds of questions you face in hiring. Is this particular candidate suitable for the job at hand? In most cases some general metric like passing an arbitrary exam won't tell you anything. Witness the ill-conceived idea of teacher testing.

I build websites for use in clinical research data gathering. They are very different from "typical" web fare. You have to evaluate each site in light of the purpose for which it was created. Evaluating the worth of a particular site isn't hard. Evaluating the fitness of a particular candidate for a particular job isn't hard either. You just have to do the work. It's when you try to develop "shortcuts" in the form of general metrics that you make the straightforward something complex. It's nothing more than the "silver-bullet" anti-pattern in management in yet another form.

MattHawk 09/01/06 02:55:21 AM EDT

Google's design is lightweight. MySpace does not even pretend to suffer from this convenience. Google might very well account for more unique visitors, but MySpace makes up for the visitors by having each page view result in a significantly greater amount of bandwidth usage.

Not to mention, if Google is working in its optimum capacity, it minimizes page views: if you only load the front page, and then find what you're looking for in the first page of search results, it doesn't generate many page hits.

Matthew Rothenberg 08/31/06 09:55:33 PM EDT

Pageviews aren't just a poor technical solution to measuring engagement due to rich interfaces - they're also a fundamentally flawed way of looking at user engagement on participatory media / social software websites.

What's the comparative value of a user who spends two hours a day on your site passively browsing various media content, versus a user who spends twenty minutes a week on the site, but spends that time uploading interesting content and/or actively commenting on other people's work? I'd argue the second user is not only more engaged but also more valuable to the website (and their advertisers) in the long-run.

Explication Please 08/31/06 09:15:36 PM EDT

So why is it again that MSN's Pageviews outstrip Google's by a fair margin?
