Someone just asked how I log my web traffic into MySQL. The timing couldn't be better, because the article I wrote for Linux Magazine is now on-line so I don't need to explain it again: Getting a Handle on Traffic
For the impatient, go get mod_log_sql and have fun.:-)
Update: Cool. It seems that RootPrompt picked it up too.
Posted by jzawodn at December 31, 2002 02:43 PM
another alternative is to use apache's reliable piped logs feature (http://httpd.apache.org/docs/logs.html#piped) to pipe the log files through a process that inserts them into your database. you could probably even create a clever log format definition that just generated INSERT queries to feed directly into the mysql command line client, although there might be an issue with escaping problematic characters.
one advantage is that you then end up with only one process holding open a mysql connection, instead of every apache child.
Sheesh... And I thought talking on a celphone while driving was bad... :p ;)
Drat... That comment was supposed to be for the entry above... DOH
Heh - I pounded my head against the wall for a while by explicitly following the LinuxMag article. You use MySQLLoginInfo whereas I had to use LogSQLLoginInfo. I didn't realize it until I went back to the directives on grubbybaby.com and saw that it was different. Is it just because you're using an older version?
Yeah, a new version came out after I wrote that article. A lot has changed. :-(
The best thing to do is probably to point folks at the Changelog:
Since I'm not terribly keen on re-writing the article.
Ooops, one further step and the changelog exceeds your article ;-)
Cool article. One question though: what do you do if you get millions of hits per day?
I can think of two solutions, both involving logging data to multiple tables:
1)use the MySQL MERGE table type. On the surface, this looks like the cleanest way to go, but after doing a bit of research, I found that you need to specify all of the data tables explicitly when creating the MERGE table. It seems that this is mainly for creating a single fact table, and might not be the best for logging serial data.
2)Do the data partitioning on the application side, as refered to in this post:
Here, the tables are named serially or by a date, and you have a description table that has a record for each of these tables. This requires extra logic in the application to query each of these tables, but you won't have to run ALTER commands all of the time on the description table: you'd just insert a new record.
It doesn't look like mod_log_sql has support for this, but it looks like something that should be somewhat easy to do.
What are your thoughts on this? Your book's TOC shows that you will be writing 2 pages on this subject. Hopefully there'll be more? :)