It seems that Scott is surprised by the number of unique user agents his server sees. So I decided to check mine:
mysql> select count(distinct(agent)) from access_jeremy_zawodny_com; +------------------------+ | count(distinct(agent)) | +------------------------+ | 15366 | +------------------------+ 1 row in set (30.01 sec)
Impressive. Roughly three times as many. I wonder which are most popular? Maybe the top 20?
mysql> select agent, count(*) as cnt from access_jeremy_zawodny_com -> group by agent order by cnt desc limit 20; ... Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705) Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) Mozilla/4.0 (compatible; MSIE 6.0; Windows 98) Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) Googlebot/2.1 (+http://www.googlebot.com/bot.html) Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705) Radio UserLand/8.0.8 (WinNT) Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; http://www.inktomi.com/slurp.html) Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461) Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461) Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt) Mozilla/4.0 (compatible; MSIE 5.5; Windows 98) NetNewsWire Lite/1.0.2 (Mac OS X) Mozilla/3.0 (compatible) Mozilla/3.01 (compatible;) Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90) Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; T312461)
Image if I ran that on the logs at Yahoo. Hmm. Maybe I should, just for a day. (No, not all the logs. Just a few servers.)
BTW, I love logging apache traffic directly into MySQL. It means I can do all sorts of cool stuff.
Posted by jzawodn at December 31, 2002 11:49 AM
Jeremy, what do you use to send your logs directly into MySQL? This would be a HUGE value to me.
Thanks!
Andy
mysql> select count(distinct(agent)) from access_log;
+------------------------+
| count(distinct(agent)) |
+------------------------+
| 4932 |
+------------------------+
1 row in set (1 min 28.32 sec)
Damn, my server is slow..
How about this.... I'll just blog the solution for you. :-)