Weblog Entry

Browser Stats

December 10, 2003

Where do you go for accurate, up-to-date numbers for current browser share?

There aren’t many centralized resources aggregating this sort of thing on a wide scale anymore. Thanks to lazy developers who have written bad detection scripts over the past five years, plenty of browsers identify themselves as IE anyway, which makes tracking more of an art than a science.

One of my old monthly visits was thecounter.com’s browser stats, which stopped being updated in May. Due to the distributed nature of their code, I always considered this a fairly accurate cross-section of the web.

Chuck Upsdell is still going strong after just about 300 weekly editions. However, as always, his stats are culled from 4 different sources with completely different target audiences, and while they may be enough to spot trends, they aren’t as representative as they need to be for generalizing.

There are a handful of others that pop up after a quick Google, the most notable being an older page of Christina Wodtke’s. But nothing screams definitive to me.

Aside from sifting through logs on servers under your control, are there any other resources out there worth a bookmark?


Reader Comments

Dave S. says:
December 10, 01h

Does DAS offer aggregate statistics? I know it’ll track users for your own site, but I’m thinking more along the lines of displaying web-wide browser trends.

Ethan says:
December 10, 01h

Whoa, am I late to the party. TheCounter stopped updating?!

Well, crap.

December 10, 01h

I keep an eye on the Google Zeitgeist, though their monthly updates are often delayed (November 2003 statistics aren’t up yet), and they only provide a rough graph:

http://www.google.com/press/zeitgeist.html

Also, I watch the stats for the sites that my company designs and hosts.

Eric says:
December 10, 01h

One thing I always tell clients is to weight their own traffic more heavily than general statistics. They generally tend to be similar but if for some reason 10% of your userbase still supports a legacy browser like NN4 or a niche one like a Linux browser, the site should be designed with that in mind.

Erik S says:
December 10, 01h

I would think Google to be a decent source of stats considering how many people use it. I just wish that little browser graph was a little bigger!

Charles Roper says:
December 10, 01h

Dave, on the Reinvigorate site, the column on the right shows aggregate stats. I imagine you’d want something more in-depth than this though, no? Or have I missed the point and you’re after something totally different?

Dave S. says:
December 10, 02h

Actually, now that I’ve visited the site, those ReInvigorate graphs look pretty decent. 3-5 million hits a day works out to 100-150 mil per month, which is about a third of what thecounter.com was running at its peak. If their reporting is roughly similar to how thecounter.com did it, that’s a reasonably good substitute.

Eric - excellent policy. Designing for your own audience is, of course, the best practice. Generalizing helps predict trends, but at the end of the day your site needs to cater to your user base.

Geof says:
December 10, 02h

The thought occurs that it would be interesting to just open up that part of your stats log publicly, and then let someone with some aggregation-fu put it all together.

jgraham says:
December 10, 02h

I seem to say this a lot in this type of discussion but, despite its huge popularity, I would expect google to be a worse source of stats than many other places. True, the huge visitor numbers mean that the random error should be low, but the systematic errors for google may be very large. Also, google is very much a ‘black box’; it’s not clear exactly how they are collecting and filtering their results. See these links, for example, for some further discussion of why I think that trusting google stats is likely to lead to incorrect conclusions:

http://golem.ph.utexas.edu/~distler/blog/archives/000233.html#c000228
http://forums.mozillazine.org/viewtopic.php?t=29690

I suppose if I had one of these newfangled weblog things (as opposed to a web page that occasionally gets new content in a reverse-chronological order), I could write that up and link to it. Hmm.

December 10, 04h

Back in Poland they have a wonderful resource called http://www.ranking.pl/ (browser stats are on http://www.ranking.pl/rank.php?stat=browAL ). It is very reliable, gathers data from six million sites, which is pretty big for Poland, and their customer service is actually very friendly (you can ask them for stats on a specific subject – e.g. only from Linux – and they’ll generate them and send them to you).

Now, I don’t think many here will find it useful, but I just wanted to show it off :). And besides the page shows Opera 7 clocking in @ 1.8% ;)

December 10, 04h

the main problem i see with stats like thecounter is that - in my humblest of opinions - any self-respecting site with huge traffic will NOT be using some noob-ish off-site counter service…they’ll be running their own stats packages on their servers. so i’m always a bit concerned that this skews the results simply due to the fact that it’s mainly smaller sites that use those counters…
but yes, the most valuable stats: your own. sure, this creates a chicken/egg problem (e.g. my site is geared towards IE because my stats show low hits from other browsers, so i don’t need to change my code/css…or is the low number of hits from mozilla/firebird/opera/co. due to my crappy site only being coded for IE ?)

Justin French says:
December 10, 04h

I really like the idea Geof touched on, where thousands of site admins could upload their logs each month, and some bad-ass parsing could be done on the whole lot, giving a good indication of what’s going on with some real, live sites.

That’d be some serious processing and bandwidth (uploading) going on though. Seems to me that it couldn’t possibly be a free service.

Perhaps the processing could be done on the individual servers (get a common batch of parser scripts for Apache, IIS, etc.), then only the *stats* could be uploaded in XML or CSV.

Then only a nightly batch process of all the stats would be needed. Keeping it all confidential would be nice too.

If anyone is keen, I’d be interested in helping out…

Brad says:
December 10, 05h

Did anyone else notice the number 3 browser at Re_Invigorate? Netscape 5?

Given that NS 5 never existed, does it perhaps refer to Mozilla/Gecko? It’s unclear.

Either way I’m uninclined to trust that source. Not that I’m offering anything better, mind you! I usually try and wring some sense out of Google’s Zeitgeist.

ste says:
December 10, 06h

Just a thought, but wouldn’t it be nice if Google shared their browser stats? I would suspect that being the search engine of choice would mean a fairly broad spectrum of users … anyone know if they provide such a service? Anyone got the connections to convince them to? ;)

Steve says:
December 10, 07h

Brad:

NS5 does indeed represent Gecko/Moz. Typing the following into my Mozilla 1.5 and NS7.02 address bar returns that info.

javascript:alert(navigator.appName + parseInt(navigator.appVersion));

Assuming that’s how they determine it, of course.
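
To see why that one-liner yields "Netscape5", here is a minimal sketch that runs the same expression against plain objects instead of a live browser. The helper name `appNameLabel` and the sample `appName`/`appVersion` values are assumptions for illustration; whether Reinvigorate actually labels browsers this way is not confirmed anywhere in this thread.

```javascript
// Hypothetical helper mirroring the address-bar one-liner above.
// `nav` is any object shaped like the browser's `navigator`.
function appNameLabel(nav) {
  // parseInt stops at the first non-digit, so "5.0 (Windows; ...)" becomes 5.
  return nav.appName + parseInt(nav.appVersion, 10);
}

// Assumed sample values of the kind a Gecko browser and IE6 report.
const geckoLike = { appName: "Netscape", appVersion: "5.0 (Windows; en-US)" };
const ie6Like = {
  appName: "Microsoft Internet Explorer",
  appVersion: "4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
};

console.log(appNameLabel(geckoLike)); // "Netscape5"
console.log(appNameLabel(ie6Like));   // "Microsoft Internet Explorer4"
```

The label reflects only the leading number in `appVersion`, not the product version, which is why several unrelated browsers can collapse into the same bucket.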

Dave S. says:
December 10, 09h

Steve - Safari v1.0 also reports ‘Netscape5’ by that method. Interesting.

Niket says:
December 10, 10h

A large percentage of traffic to my site (10%) comes from NS4.7, largely due to buddies in my school who use the unix boxes.

On a project I just did (in Feb 03, stats as of Sept 03) for a Non-Profit, about 41% of traffic comes from IE5.x and 57% from IE6. This is because most of the people accessing this site are volunteers. Except for myself (Opera identified as IE6) and another guy (Moz on Linux), all others are most likely using IE.

“Trust your site hits more than anything else”, and “Nothing is constant” is what I’ve learnt.

December 10, 10h

One of the things I don’t like about many stat “programs” is that they do their stats through JavaScript includes. This automatically leaves out any devices that don’t support JS or have it disabled.

Then again, if I was to use a JS-based method, I’d use it to my advantage to accurately identify browsers. Instead of relying on the oft-tweaked user agent, you could feature-check things specific to certain browsers. Just a thought.

Daniel says:
December 10, 11h

Do browser stats still matter? If you were to code a site as this one has been, it will work great in standards compliant browsers and degrade nicely in others.

What about reliable ‘browser window’ size stats? We all know there are screen resolution stats, but not everyone will view a web site with their browser maximized.

People might have the history side bar or whatever else is available open. Some people just use a smaller window.

Any suggestions or help here?

Josh S. says:
December 10, 12h

There is also:
http://www.reinvigorate.net/system/

It has some good stats, but I’m not sure how accurate they are because I think only geek-minded people run it. But it may have become more distributed since I used it…

December 11, 02h

Does anyone know if it’s possible to detect an Opera-browser which wants to be identified as IE? Otherwise Opera-stats are not to be trusted. (the number should be higher).

ppk says:
December 11, 02h

It is always possible to detect Opera, since its identification string always has ‘Opera’ in it.

Of course most writers of browser detect scripts don’t know that, because they don’t know how to read a browser string.

The real problem I have with any browser detect is that it’s impossible to know its methodology. And yes, many systems use a JavaScript browser detect, and a bad one, at that.

I was interested to see the reinvigorate link, I didn’t know it yet. Unfortunately the site doesn’t seem to discuss its methodology, either.

I wrote a bit about the basics of JavaScript browser detection at http://www.quirksmode.org/js/detect.html#string , and I suppose these points are also valid for browser detection in another language.

Some conclusions:

1) Use *only* navigator.userAgent, all other properties are untrustworthy. navigator.appName, in particular, should NEVER be used.

2) Even navigator.userAgent can be untrustworthy. Safari can spoof itself completely. As far as I know no other major browser can (Opera is always identifiable, if you know what you’re doing).

3) In any script, check the smaller browsers first, and IE last, after you’ve made sure that the browser isn’t any other one. This offers the best hope for reliable statistics.

4) navigator.userAgent is usually, but not always, the same as HTTP_USER_AGENT. AOL browsers in particular can have problems in this area, because in JavaScript they may announce themselves as IE plain and simple.
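
Points 1 and 3 above can be sketched roughly as follows: look only at the user-agent string, test for the smaller browsers first, and test for IE only after the others have been ruled out. The function name `detectBrowser`, the labels it returns, and the sample strings are assumptions for illustration, not ppk's actual script. The generic "mozilla" check goes at the very end because nearly every user-agent string begins with that token.

```javascript
// Hypothetical classifier: inspects only the user-agent string.
// Order matters: an Opera string can also contain "MSIE", and a
// Safari string contains "like Gecko", so the smaller browsers are
// matched before the ones they imitate.
function detectBrowser(userAgent) {
  const ua = String(userAgent || "").toLowerCase();
  const checks = [
    ["Opera", "opera"],
    ["Safari", "safari"],
    ["Konqueror", "konqueror"],
    ["Gecko", "gecko"],
    ["IE", "msie"],
    ["Other Mozilla-compatible", "mozilla"]
  ];
  for (const [label, token] of checks) {
    if (ua.indexOf(token) !== -1) return label;
  }
  return "Unknown";
}

// Assumed sample string: Opera 7 set to identify itself as IE 6.
console.log(
  detectBrowser("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) Opera 7.23  [en]")
); // "Opera"
```

A browser that spoofs its string completely (point 2) will still be misclassified; no ordering of checks can fix that.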

Smiler says:
December 11, 02h

Justin,

Whoever ran the service could write a script which the ‘clients’ downloaded. This script could parse the various servers logs and produce one small file with just the key numbers…

Then the host wouldn’t need to do anything other than read the browsers / hit numbers into a db. Et Voila… Little bandwidth used, all the processing done by the clients.
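
A rough sketch of that split, assuming logs where the user agent is the last quoted field on each line (as in Apache's combined format): each site reduces its own log to a handful of counts, and the central host only adds the small summaries together. The names `classify`, `summarizeLog` and `mergeSummaries` are hypothetical, and the classifier is deliberately crude.

```javascript
// Hypothetical, deliberately crude classifier for the sketch.
function classify(userAgent) {
  const ua = userAgent.toLowerCase();
  if (ua.includes("opera")) return "Opera";
  if (ua.includes("safari")) return "Safari";
  if (ua.includes("gecko")) return "Gecko";
  if (ua.includes("msie")) return "IE";
  return "Other";
}

// Runs on each participating site: turn raw log lines into counts.
// Assumes the user agent is the last double-quoted field on the line;
// lines that do not end with a quoted field are skipped.
function summarizeLog(lines) {
  const counts = {};
  for (const line of lines) {
    const match = /"([^"]*)"\s*$/.exec(line);
    if (!match) continue;
    const name = classify(match[1]);
    counts[name] = (counts[name] || 0) + 1;
  }
  return counts;
}

// Runs on the central host: add the per-site summaries together.
function mergeSummaries(summaries) {
  const total = {};
  for (const summary of summaries) {
    for (const name of Object.keys(summary)) {
      total[name] = (total[name] || 0) + summary[name];
    }
  }
  return total;
}
```

Only the summary objects would need to travel over the wire, for example serialized as the XML or CSV suggested earlier in the thread; the raw logs stay on each site's server.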

Chris W says:
December 11, 05h

Had a few headaches trying to set up my own stats system on my server using JS and PHP but found this site to be pretty useful.

http://tech.ratmachines.com/downloads/downloads.php

In response to PPK they seem to recommend sniffing for mozilla and NN4 last rather than IE.

December 11, 06h

Somebody mentioned ideal page widths. I always refer to this article http://hotwired.lycos.com/webmonkey/99/41/index3a_page2.html , it has an excellent breakdown by browser and takes into consideration the default chrome of the browser and any scroll-bars that may result from page content exceeding the current window size. Choice.

steve says:
December 11, 07h

navigator.appVersion is particularly misleading as it refers to the version of the rendering engine (correct me if i’m wrong) used by the browser and not the version of the browser itself. MSIE refers to itself as Mozilla/4.0, as does Opera - so as ppk said, don’t trust anything other than userAgent (but even that can be misleading)

A more reliable way to detect your browsers would be a combination of the user-agent string and DOM support - so, if the browser supports document.all, it’s going to be either MSIE or Opera, check the user-agent string to differentiate.

if !document.all && document.getElementById then it’s Moz/Gecko/Safari/Konqueror/etc, check additional DOM support and UA strings to be sure which is which, i.e. !document.getElementsByTagName("*") would be safari.

I’m only guessing that safari bit would work as I don’t have access to one, but you get the idea - I’m getting off topic anyway and probably not telling anyone anything they didn’t already know.
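
The combination described here might look something like the sketch below. It takes a document-like object and a user-agent string as arguments so it can be tried outside a browser; the name `detectByFeatures` and its labels are assumptions. The unverified `getElementsByTagName("*")` test for Safari is left out, and the user-agent string is used to tell the non-`document.all` browsers apart instead.

```javascript
// Hypothetical sketch: DOM feature checks first, user-agent string second.
// `doc` is any object shaped like the browser's `document`.
function detectByFeatures(doc, userAgent) {
  const ua = String(userAgent || "").toLowerCase();

  // document.all: treated here as "MSIE or Opera"; the UA string decides.
  if (doc.all) {
    return ua.indexOf("opera") !== -1 ? "Opera" : "IE";
  }

  // No document.all but getElementById: Gecko, Safari, Konqueror, etc.
  if (doc.getElementById) {
    if (ua.indexOf("safari") !== -1) return "Safari";
    if (ua.indexOf("konqueror") !== -1) return "Konqueror";
    return "Gecko";
  }

  return "Unknown";
}

// Stand-in objects for trying the function without a real browser.
const ieLikeDoc = { all: [], getElementById: function () {} };
const w3cLikeDoc = { getElementById: function () {} };

console.log(detectByFeatures(ieLikeDoc, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)")); // "IE"
console.log(detectByFeatures(w3cLikeDoc, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007")); // "Gecko"
```

In a page, the call would be `detectByFeatures(document, navigator.userAgent)`. The feature-to-browser mapping is the one given in this comment and applies to the browsers under discussion here.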

December 11, 12h

PPK: reinvigorate uses an almost unaltered version of your browser detection script ( see http://www.reinvigorate.net/archive/app.bin/jsinclude.php?-1 ). They give no credit for that part in the JS file itself, though I don’t know whether they do elsewhere.

Anyway, reinvigorate seems to be the most reliable source now considering its number of hits and the fact that we know how the browser/OS/screen/.. detection are done.

December 12, 02h

I’m a producer for http://www.multimap.com

We get nearly 4 million page views per day and collect the useragent for each. As yet, we’ve not done much with that particular data but your post, Dave, is the catalyst. I’ll try and pull out what I can from the data and present it publicly.

The bulk of our traffic comes during working hours and mostly from the UK, but should give a reasonable indication of browser usage. I’ll see if I can take two samples - one during office hours and another during our evening peak (when people get home and go online). Could be interesting to see if there’s much of a difference.

ppk says:
December 12, 12h

Funny. Yes, you’re right, that’s my script.

Well, we can trust their methodology, then. <g>

Chris W, you’re right, we should check Netscape/Mozilla last. First the small fry, then IE, then Netscape.

padawan says:
December 14, 03h

Statmarket used to publish figures but now they seem to have put everything under a price tag. Like Reinvigorate, those stats are gathered through web beacons (from clients of WebSideStory in this case), not server logs, and are IMHO more accurate. My little finger tells me that Yahoo or Overture may soon be publishing figures as well, as they have started to tell clients of their hosted stat service KeyLime that they would aggregate their stats, most probably for the same reason.

In any case, as said before, always check the site’s own stats for specific populations. I’ve never seen a site perfectly match such or such aggregated figures.

dan says:
December 16, 03h

I understand everyone’s concern about browser stats. I have thought about it long and hard. My server comes with a wicked stats browser. But then I read…

“You cannot - as a web developer - rely only on statistics. Statistics can often be misleading.

Global averages may not always be relevant to your web site. Different sites attract different audiences. Some web sites attract professional developers using professional hardware, other sites attract hobbyists using older low spec computers.

Also be aware that many stats may have an incomplete or faulty browser detection. It is quite common by many web stats report programs, not to detect new browsers like Opera and Netscape 6 or 7 from the web log.

” at w3schools.com

in other words, well for me, forget about it.

Make it work in ie 5+, netscape 5+ and you are pretty safe. NN4…Just remember we are the masters… I’m not trying to be cocky but we are… Why did we stop using BETA and switch to VHS and now to DVD? Because the suppliers took control! You can still use your beta if you want but you won’t find the movies you want, and that’s no sweat off the movie industry’s back. Let’s make it happen for the web; besides, most of the nn4 stats probably come from “worried developers” testing out the site … :)

and that’s the last word…..lol

Bernard Farrell says:
December 26, 10h

What about Browser News at http://www.upsdell.com/BrowserNews/stat.htm ? At least they’re careful to point out the danger of stats.

I am surprised that W3C hasn’t bothered to publish stats of its own. Their http://www.w3.org/WAI/GL/2001/01/22-stats.html page seems to be _very_ dated.