Does RSS and AJAX Make Pageviews Obsolete?

Website Metrics Reconsidered

Remember when web site traffic was talked about in terms of "hits"? You'd read about how many millions of hits Netscape got per month and other sites bragged about getting 30,000 hits a day.

Eventually, we moved away from the term "hit" because everyone realized it was pretty meaningless. You see, a hit was often counted (depending on who was counting them) not just for a page load, but for every element (e.g., graphic) included on the page, as well. One visit of this page, for example, would be worth about 40 hits (if the browser had images turned on). But a site that was less graphical and had equal usage would register half the hits.
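To make the difference concrete, here is a minimal sketch of how the same visit can be counted as hits versus pageviews. The function name, the extension list, and the request paths are all made up for illustration; real log analyzers classify requests with longer, configurable rules.

```python
from urllib.parse import urlparse

# File extensions treated as page elements rather than pages. This list is an
# assumption for the sketch, not what any particular log analyzer uses.
ASSET_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def count_hits_and_pageviews(request_paths):
    """Return (hits, pageviews) for a list of requested URL paths."""
    hits = 0
    pageviews = 0
    for path in request_paths:
        hits += 1  # every request is a hit, whatever it fetched
        if not urlparse(path).path.lower().endswith(ASSET_EXTENSIONS):
            pageviews += 1  # only requests for pages count as pageviews
    return hits, pageviews


# One visit to a graphics-heavy page: the page itself plus 39 images.
graphical_visit = ["/post.html"] + [f"/img/{i}.gif" for i in range(39)]
# One visit to a lighter page with equal usage: the page plus 19 images.
plain_visit = ["/post.html"] + [f"/img/{i}.gif" for i in range(19)]

graphical = count_hits_and_pageviews(graphical_visit)  # (40, 1)
plain = count_hits_and_pageviews(plain_visit)          # (20, 1)
```

Both visits are one pageview, but the hit counts differ by a factor of two purely because of page design.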

Pageviews replaced hits as the primary traffic metric - not just because they're more meaningful, but because they also determined how many ads could be served. Ads were sold primarily on a CPM basis, so multiply your CPM by every 1,000 pageviews you got, and that's your dot-com revenue.
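The arithmetic can be sketched in a few lines. The traffic and CPM figures below are hypothetical, and the one-ad-per-page default is an assumption of the sketch.

```python
def cpm_revenue(pageviews, cpm_dollars, ads_per_page=1):
    """Revenue when ads are sold per thousand impressions (CPM)."""
    impressions = pageviews * ads_per_page
    return impressions / 1000 * cpm_dollars


# Hypothetical site: 3 million pageviews a month at a $2.00 CPM.
monthly_revenue = cpm_revenue(pageviews=3_000_000, cpm_dollars=2.00)  # 6000.0
```

Anything that inflates pageviews inflates this number directly, which is why the metric is so tempting to pump up.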

Reach (number of unique visitors) is also important, of course. comScore/Media Metrix uses uniques as its primary metric, because mainstream advertisers want to reach a lot of people, not just the same people over and over. You can also get pageviews, time spent, and several other data points from Media Metrix, but if you're the number one site on MM, it's because you have the most unique visitors for the month. Of course, if uniques were all that mattered, Blogger.com would be considered as big as MySpace by some accounts:

Whereas, if you look at pageviews, MySpace dominates:

That's why Alexa Rank is a combination of Reach + Pageviews, so you get something like this:

But it's this pageviews part that I think needs to be more seriously questioned. (This is not an argument that Blogger is as popular as MySpace—it's not.) Pageview counts are as susceptible as hit counts to site design decisions that have nothing to do with actual usage. As Mike Davidson brilliantly analyzed in April, part of the reason MySpace drives such an amazing number of pageviews is because their site design is so terrible.

As Mike writes: "Here's a sobering thought: If the operators of MySpace cleaned up the site and followed modern interface and web application principles tomorrow, here's what the graph would look like:"

Mike assumes a certain amount of Ajax would be involved in this more-modern MySpace interface, which is part of the reason for the pageview drop. And, as the Kiko guys wrote in their eBay posting, their pageview numbers were misleading because the site was built with Ajax. (Note: It's really easy to track Ajax actions in Google Analytics for your own edification.)

But Ajax is only part of the reason pageviews are obsolete. Another one is RSS. About half the readers of this blog do so via RSS. I can know how many subscribers I have to my feed, thanks to Feedburner. And I can know how many times my feed is downloaded, if I wanted to dig into my server logs. But I don't get to count pageviews for every view in Google Reader or Bloglines or LiveJournal or anywhere else I'm syndicated.

Another reason: Widgets. The web is becoming increasingly widgetized - little bits of functionality from one site are displayed on many others. The purveyors of a widget can track how many times their JavaScript or Flash file is loaded elsewhere - but what does that mean? If you get a widget loaded in a sidebar of a blog without anyone paying attention to it, that's not worth anything. But if you're YouTube, and someone's watching a whole video and perhaps even an ad you're getting paid for, that's something else entirely. But is it a pageview?

Pageviews were never a great measure of popularity. A simple JavaScript form validation can easily cut down on pageviews (and save users time), while a useless frameset can pump up your numbers. But with the proliferation of Ajax, RSS, and widgets, pageviews are even more silly to pay much attention to—even as we're all obsessed with them.

It's about Time

So what's a better measurement? Good question. Like many good questions, the answer is "it depends." If you're talking about what's important to pay attention to on your own site, you have to determine what your primary success criteria are and measure that as best you can. For some sites, that could be subscribers, or paying users, or revenue, or widgets deployed, or files uploaded, or what have you. It may even be pageviews.

At Blogger, we determined that our most critical metric was number of posts. An increase in posts meant that people were not just creating blogs, but updating them, and more posts would drive more readership, which would drive more users, which would drive more posts. Of course, posts alone weren't our only measure, because someone could have written an automated posting script to fill up our database (which some did), and by that metric, we would have been happy about it. So we paid attention to pageviews and posts per user and user drop off, and other things.

Of course, we all want to know how we're doing compared to other people/sites/companies, so internal metrics aren't enough. And things like Media Metrix and Alexa are paid attention to by investors, and advertisers, and acquirers, and the press. So some apples-to-apples comparison is useful. If I had to pick one, in addition to unique visitors, I'd say time spent would be much more useful than pageviews.

After all, everyone's competing for a bigger share of the one scarce resource, which is people's attention (although it is a growing resource, because people keep making babies, and those babies keep getting Internet connections). More or less, what you need is people's attention before you can meet whatever goals you have.

Time spent interacting with a site is a much better basis on which to compare sites' relative ability to capture attention/value than pageviews are. When it comes to media like audio or video, an increasing percentage of web consumption, time obviously means a great deal more than a pageview.

However, time is a bit harder to measure. HTTP, being stateless, doesn't actually have a concept of time spent. If you read this whole post and then click off to another site, my web server won't know whether you were here for five minutes or five seconds. However, most web analytics packages do estimate time spent, as does Media Metrix. (The Alexa toolbar could actually measure it even better.)
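One common way to estimate time spent from a stateless log is to add up the gaps between a visitor's consecutive requests. The sketch below assumes that approach; the 30-minute session cutoff and all names are illustrative choices, not a description of how Media Metrix or any specific analytics package works.

```python
from datetime import datetime, timedelta

# Gaps longer than this are treated as the visitor having left and come back.
# The 30-minute value is an assumption for this sketch.
SESSION_TIMEOUT = timedelta(minutes=30)


def estimate_time_spent(timestamps):
    """Estimate total time on site from one visitor's request timestamps."""
    total = timedelta()
    ordered = sorted(timestamps)
    # Each gap between consecutive requests counts as time on the earlier page.
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        if gap <= SESSION_TIMEOUT:
            total += gap
    # The last request of a session has no following request, so the time
    # spent on that final page is unknown and contributes nothing.
    return total


# Hypothetical visitor: three pages in the morning, one more in the afternoon.
visit = [
    datetime(2006, 8, 25, 10, 0),
    datetime(2006, 8, 25, 10, 2),
    datetime(2006, 8, 25, 10, 7),
    datetime(2006, 8, 25, 15, 0),
]
estimated = estimate_time_spent(visit)  # 7 minutes

# A single-page visit always estimates to zero, whether the reader stayed
# five seconds or five minutes.
single_page = estimate_time_spent([datetime(2006, 8, 25, 10, 0)])
```

The zero for a single-page visit is exactly the blind spot described above: the server never sees the moment you leave.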

Widgets are still a bit tricky, because a user may or may not be paying any attention to a widget that's on a page they're viewing. If you could measure time spent interacting with a widget (or media being streamed through the widget), that would be ideal. RSS consumption is harder to measure by time, but there are other efforts to measure attention in that realm.

Finally, there's a big argument against time as a measure: People don't spend much time on Google search, because it gives them what they want so fast, and they go away. Which is obviously good for them and for users. Of course, Google doesn't drive many pageviews per visit either, but it's so good people return again and again. So aggregate time is probably still high. But just as pageviews can be gamed, you can slow your users down unnecessarily (or accidentally because your servers are too slow) and increase time spent. In the long run, this is going to be bad for you, but it would screw up a market that paid too much attention to time spent, just as much as BS pageview counts do now.

In summary, there's no easy solution. There's a big opportunity (though very tough job) for someone to come up with a meaningful metric that weighs a bunch of factors. But no matter what, there will come a time when no one who wants to be taken seriously will talk about their web traffic in terms of "pageviews" any more than one would brag about their "hits."

This essay appeared first August 25, 2006, on Evan Williams' own website, www.evhead.com, which has been up since 1996 and in blog form since 1999. It appears in the "i-Technology Viewpoint" series with the author's permission. Copyright © Evan Williams, 2006  [The SYS-CON Media copyright tag on this page (below) does not apply and is invalid in this case, appearing merely as a default part of the frame of this page. SYS-CON Editorial.]

Evan Williams is CEO of Odeo, Inc., a startup based in San Francisco, where he has lived since 1998. Previously, he was co-founder and CEO of Pyra Labs, makers of Blogger, now part of Google, where he worked most of 2003-04. Originally from the cornfields of Nebraska, he currently lives in San Francisco with his fiancée (and their cat). You can find more about him on the web - but, says Ev, "don't believe everything you read!" His blog can be found at www.evhead.com.

Most Recent Comments
Greg Deckler 09/19/06 01:01:13 PM EDT

The ad and unique users is what is important, not the pageview or any other metric. Hits are still useful to administrators for determining and scaling server load. Use the webit instead.

n d 09/07/06 08:36:00 PM EDT

Pageview counts are as susceptible as hit counts to site design decisions that have nothing to do with actual usage. That, argues Evan Williams, is part of the reason MySpace drives such an amazing number of pageviews: it's because their site design is so terrible. So what's a better measurement?

Don Babcock 09/06/06 09:45:21 AM EDT

The article defeats its own title. It seems to clearly demonstrate that pageviews are a poor general metric regardless of RSS and/or AJAX.

But it seems also to set up a straw man of sorts. It suggests that there is yet some "holy grail" in the form of an abstract metric when the very idea of an abstract metric is a contradiction in terms. Supposedly, you could derive some magical "score" from a formula that combined a number of different factors and coefficients in such a way that sites could then be compared in terms of the metric. This is needless complexity in a quest for bragging rights in the sense of "mine's better than yours because it scores higher on the so-and-so scale." Managers continue to fall for such nonsense measures (like Microsoft Certified Professional, et al.). The more perfect a measure is in the general case, the less usable it is in the specific case. But it is almost always the specific case in which the real questions are framed. So if I'm interested in reach for questions of advertising efficacy there are ways to measure that. If I'm interested in site "participation" there are ways to measure that. But let's abandon the idea that there is some sort of all around metric that we can derive that will be of any real use. It comes down to the same kinds of questions you face in hiring. Is this particular candidate suitable for the job at hand? In most cases some general metric like passing an arbitrary exam won't tell you anything. Witness the ill-conceived idea of teacher testing.

I build websites for use in clinical research data gathering. They are very different from "typical" web fare. You have to evaluate each site in light of the purpose for which it was created. Evaluating the worth of a particular site isn't hard. Evaluating the fitness of a particular candidate for a particular job isn't hard either. You just have to do the work. It's when you try to develop "shortcuts" in the form of general metrics that you make the straightforward something complex. It's nothing more than the "silver-bullet" anti-pattern in management in yet another form.

MattHawk 09/01/06 02:55:21 AM EDT

Google's design is lightweight. MySpace does not even pretend to suffer from this convenience. Google might very well account for more unique visitors, but MySpace makes up for the visitors by having each page view result in a significantly greater amount of bandwidth usage.

Not to mention, if Google is working in its optimum capacity, it minimizes page views: if you only load the front page, and then find what you're looking for in the first page of search results, it doesn't generate many page hits.

Matthew Rothenberg 08/31/06 09:55:33 PM EDT

Pageviews aren't just a poor technical solution to measuring engagement due to rich interfaces - they're also a fundamentally flawed way of looking at user engagement on participatory media / social software websites.

What's the comparative value of a user who spends two hours a day on your site passively browsing various media content, versus a user who spends twenty minutes a week on the site, but spends that time uploading interesting content and/or actively commenting on other people's work? I'd argue the second user is not only more engaged but also more valuable to the website (and their advertisers) in the long-run.

Explication Please 08/31/06 09:15:36 PM EDT

So why is it again that MSN's Pageviews outstrip Google's by a fair margin?
