Wow, it's been a busy week. I was totally swamped for several days dealing with the MySQL servers and related stuff. And then I used a day or two to recover (sleep, shower, etc.).

Anyway, I made some interesting discoveries along the way. The most surprising one had to do with thread caching on Linux when you have a busy MySQL server--busy in a particular way, mind you.

You see, we had a single master server which all the web servers could connect to (using PHP) whenever someone made a change. That includes creating a tile (there were several hundred thousand tiles created), approving a tile, marking one as "cool", and so on. All told, the master was quite busy.

Because there were between 20 and 45 front-end web servers during that time, and each could have had up to 70 Apache processes that might have needed to connect, we faced a problem: the master needed to handle up to 3,150 connections in the worst case (that's 45 x 70). Most of the PHP code used mysql_pconnect() to hold persistent connections.

Rather than worry about how to do that, I made sure that the wait_timeout was set to a very low value: 15 seconds. That means MySQL would close any connection that was idle for more than 15 seconds. But I didn't realize the extent of the problem until I started getting reports from the web servers that the master was refusing connections. Why? Because I had set the maximum number of connections to a reasonable value in the master's my.cnf file:

set-variable = max_connections=180
set-variable = max_user_connections=140

And at that time, the wait_timeout was set to 600 seconds (10 minutes). Clearly that was a problem. There were a lot of idle clients holding connections and blocking out new clients from connecting and getting real work done.
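The mismatch is easy to quantify. A back-of-the-envelope sketch (the figures are the ones quoted above; nothing here is newly measured):

```python
# Back-of-the-envelope: potential persistent connections vs. the configured limit.
web_servers = 45        # worst case mentioned above
apache_procs = 70       # Apache processes per web server
max_connections = 180   # from the master's my.cnf

potential = web_servers * apache_procs
print(potential)                    # 3150 potential persistent connections
print(potential / max_connections)  # 17.5, i.e. 17.5x oversubscribed
```

With a 600-second wait_timeout, idle clients could sit on those 180 slots for ten minutes at a time, so the oversubscription bites hard.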

What to do?

We could have stopped using mysql_pconnect(), but as you'll see, that wouldn't have solved the underlying problem.

I needed to adjust the settings, but I wasn't sure what values to use. And I really didn't want to keep stopping and starting the master. That would just suck. Then I remembered that we were running MySQL 4.0.4. It has a new feature that allows you to change most of the server settings on the fly without restarting! Read about it in the on-line manual.


All I needed to do was execute a few variations on this command:

SET GLOBAL wait_timeout=60;

(with different values in place of "60") to try and strike a balance between letting new clients in and not kicking out already-connected users too quickly.

Ultimately, I settled on a timeout of 15 seconds.

But that had an interesting and unanticipated side-effect. It meant that the Linux server was having to create new threads (MySQL is a multi-threaded server) at a very high rate. That sucks up a measurable amount of CPU time.

How much CPU time? By the time I got around to looking at the output of SHOW STATUS and seeing this:

| Threads_cached           | 0          |
| Threads_created          | 270194     |
| Threads_connected        | 46         |
| Threads_running          | 28         |

Things were pretty bad. The machine had very little idle CPU time--probably 5-10% at the most. But it really wasn't doing that much work--maybe 40 queries per second. I was a bit puzzled. But that Threads_created number jumped out at me. It was high and increasing rapidly.
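One way to see how fast that counter is moving is to diff two SHOW STATUS samples taken a known interval apart. A hypothetical helper (Threads_created is a real MySQL status counter; the function and sample values are made up for illustration):

```python
# Estimate the thread-creation rate from two SHOW STATUS samples
# taken `seconds` apart. Each sample is a dict of counter -> value.

def thread_creation_rate(before, after, seconds):
    """New threads created per second between two samples."""
    return (after["Threads_created"] - before["Threads_created"]) / seconds

s1 = {"Threads_created": 270194}
s2 = {"Threads_created": 270794}  # hypothetical reading 60 seconds later
print(thread_creation_rate(s1, s2, 60))  # 10.0 new threads per second
```

Anything much above zero on a steady workload means connections are not finding a cached thread.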

Luckily I remembered the thread_cache setting. So I decided to investigate (using the new syntax for examining server variables):

mysql> SELECT @@global.thread_cache_size;
| @@global.thread_cache_size |
|                          0 |
1 row in set (0.00 sec)

Uh oh. I never set the thread cache in my.cnf, so it had assumed the default. That's bad. It's like removing the pre-forking capabilities of Apache 1.3 and letting it get pounded on a busy web site. The "fork a new process for each new request" model gets pretty expensive pretty quickly.
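The damage shows up in the counters: with no cache, Threads_created tracks the total number of connections almost one-for-one. A small sketch of that ratio (the counter names are real MySQL status variables; the numbers are illustrative, not measured):

```python
# Thread cache "miss rate": the fraction of connections that forced
# the server to create a brand-new thread instead of reusing one.

def thread_cache_miss_rate(status):
    return status["Threads_created"] / status["Connections"]

# With thread_cache_size=0, nearly every connection misses:
no_cache = {"Threads_created": 270194, "Connections": 270500}
print(round(thread_cache_miss_rate(no_cache), 3))  # 0.999

# With a warm cache, the same ratio should fall toward zero:
warm = {"Threads_created": 40, "Connections": 270500}
print(round(thread_cache_miss_rate(warm), 6))
```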


Luckily the thread cache is also tunable on the fly now. So all I had to do was this:

SET GLOBAL thread_cache_size=40;

I took a guess and figured that by caching 40 threads, we'd be saving a lot of work. And boy was I right!

In the other window, where I was running vmstat 1, I noticed a dramatic change. The idle CPU on the machine immediately went from 5-10% to 35-40%.

If only I had thought of that sooner!

So the moral of the story is this: If you have a busy server that's getting a lot of quick connections, set your thread cache high enough that the Threads_created value in SHOW STATUS stops increasing. Your CPU will thank you.
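One more thing worth noting: SET GLOBAL changes are lost when the server restarts, so the same values belong in my.cnf too. A sketch in the same set-variable style used above (assuming the 15-second timeout and cache of 40 settled on here; older example configs sometimes spell the variable thread_cache):

```ini
set-variable = wait_timeout=15
set-variable = thread_cache_size=40
```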

I don't feel bad though. We were all going nuts to try and tune/optimize the code and servers while it was running and had very little sleep. Thread caching really wasn't the worst of our problems. But it became the worst after we had fixed all the bigger ones.

It was quite a learning experience.

Posted by jzawodn at September 14, 2002 04:42 PM

Reader Comments
# john said:

glad to see you back among the living Jeremy

on September 14, 2002 07:24 PM
# Steven Roussey said:

Yes, yes, yes. And get rid of pconnect while you're at it. It is somewhat slower when the rate of connections is lower, but it helps keep up availability when the server is under stress. It's a tradeoff.

At any rate, creating connections to MySQL is very fast. Faster still if there are cached threads. When you get rid of pconnect you can see how many connections are attached simultaneously at any one time. If you have more than one Apache/PHP server connected to the MySQL server, the effects can be geometric. Once you get a general feeling for how many are usually connected at one time, set the thread cache to that number.

Remember, long long ago (in other databases), connection times were REALLY slow. Connection pools could make a real difference. MySQL has very lightweight connections, so connection pooling (or simple persisting) does not matter that much.

In fact, they still can. But PHP on Apache is forked and there is no pooling. Oops. Some day this will get fixed. And the PHP documentation is wrong in some places on this subject. Has been for years.

We have tweaked our system over the years such that we only need two servers (one Apache/PHP and one MySQL). Forums are notoriously recursive, but we have been able to bring our queries per second down to 3,000 in the peak times.

on September 17, 2002 07:56 PM
# Ben Margolin said:

I second everything Steven Roussey says above. We learned a lot of these things at my previous job, the hard way, as well.

on March 13, 2003 01:49 AM
# speedmax said:

i run a lyrics site,

php/apache2/mysql on a freebsd dual PIII server

mysqld has stayed at the top of the process list for a month already, at 60k hits [i think it's nothing compared to you big boys], but it takes 100% of one cpu most of the time.
i use pconnect() for all my php connections.

i set

now mysql runs at about 5% of cpu.

i am happy with that. :)

on May 6, 2003 10:14 AM
# robin said:

i have a mysql+apache+php server running on redhat linux 8.0 with 1G ram. it worked normally at the beginning, for about 15 days :)), but then it suddenly turned slow! i ran SHOW STATUS, and it showed the following:
| Threads_cached | 430 |
| Threads_created | 471 |
| Threads_connected | 41 |
| Threads_running | 40 |
in a normal situation the threads_running value is 1 or 2, but it bursts and my server nearly crashes. i cannot find out what the real problem is! when i restart the mysql server, all is normal again, but after several days it crashes again!

my my.cnf set:

port = 3306
socket = /tmp/mysql.sock
user = mysql
set-variable = key_buffer=256M
set-variable = max_allowed_packet=1M
set-variable = max_connections=1000
set-variable = back_log=500
set-variable = max_connect_errors=100
set-variable = table_cache=512
set-variable = sort_buffer=1M
set-variable = record_buffer=1M
set-variable = thread_cache=500
set-variable = thread_concurrency=8
set-variable = myisam_sort_buffer_size=64M

and my wait_timeout value is the default setting, 28800.

how should i adjust my variables? somebody help me!!! thanks first!

on May 21, 2003 10:21 AM
# brad said:

My Threads_created is at 71,147,112; i think i should really look into this :)

Poor old duron chokes a bit at busy times :)

on May 24, 2003 09:54 PM
# Dziugas said:

I would like to mention, that wait_timeout still does not work with FreeBSD 5.0-RELEASE-p2 (I will try 5.1 soon) and MySQL 4.0.13 compiled together with LinuxThreads.

As earlier, wait_timeout value is simply ignored (tried both my.cnf and SET GLOBAL wait_timeout).

on June 30, 2003 02:32 AM
# Troy said:

Jeremy, this blog post was a life saver for me; thread caching really does help out. I would like to post regarding the wait timeout not being obeyed: I'm also experiencing this on a FreeBSD 4.9 system with 3.23 and now 4.1.7. The interactive and wait timeouts are set for 30 seconds; however, every once in a while the timeouts are ignored and the connections fill up all available connections. Not sure why this is happening. I moved back to using mysql_connect instead of mysql_pconnect in hopes that I won't run into this problem again. Also, turning off reverse DNS has helped a lot, as there aren't a ton of connections waiting to connect in the process list anymore. my.cnf now looks like:

set-variable = max_connections=3072
set-variable = max_connect_errors=10000
set-variable = wait_timeout=30
set-variable = interactive_timeout=30
set-variable = thread_cache=40
set-variable = back_log=500

on November 30, 2004 03:08 PM
# Herrin said:

This was a very useful post for me as well; I went from about 99% average proc utilization to about 65% average after making this change. Oddly enough, it's MySQL running on Windows; it seems to have made a huge difference. Thanks!

on January 19, 2005 09:02 AM
# Felipe said:


I have the opposite problem. Maybe you would know why my mysql is running only one thread while i am sending a lot of queries from different users.

only one P4 processor, Fedora Core 1, Apache2, PHP 4.3.9, MySQL 4.0.21, 512M ram. some tables with more than 100,000 records joined by those queries...

my.cnf (only [mysqld]):
port = 3306
socket = /tmp/mysql.sock
key_buffer = 384M
max_allowed_packet = 1M
table_cache = 64
sort_buffer_size = 2M
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 2M
myisam_sort_buffer_size = 8M
thread_cache = 8
query_cache_size = 32M
# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 4
max_connections = 1024
max_user_connections = 1024

Queries execute quickly enough, but I would like to use mysql concurrency. And I have tested all the advice I found.

ok.. it's a personal matter.

Thank you very much!

on February 24, 2005 04:22 AM
# Adis said:

Hi, i got this problem.

My server keeps overloading because of the MySQL problems it's having.

I cannot figure out what the problem is.. ! :/

So guys, anyone want to help me out? I'm willing to pay!?


on April 21, 2005 01:30 PM
# erin said:

This blog was a big help with optimizing a webserver with a LOT of connections. The mysql host people won't allow pconnect, so we aren't using that. Changed the variable thread_cache => 64 (from 0) and that has helped with a LOT of aborted connections.

I'm on a linux box -- can I set back_log to 96? (linux's listen says the max is 128; can I set it to 128?). Will this help reduce aborted connections? (approx. 12/hr with thread_cache=64, and about 30/hr with thread_cache=0).



on June 27, 2005 12:01 PM
# Yerodin said:

Jeremy, these may be "Some random bits scribbled by Jeremy Zawodny", but this particular random bit may have saved my skin big time.

I'm a part-time developer, and I'm almost finished with the largest client-controlled site I've ever done. The site has over 200 individual users, with their own information stored, edited, and displayed by each user.

I'm using a single MySQL query page to do multiple inserts and another for multiple updates.

Some users would get through while others did not.

I took your advice and had my hosting company adjust my @@global.thread_cache_size.

The number of "Threads_Created" went from 1,400,000 to 8.

WOW. Of course the entire group of 200 plus hasn't logged back in yet, but I remain Optimistic.

Your plethora of knowledge is a blessing to me.

Thanks for this information.

on April 23, 2006 06:29 AM
# Joy said:

I am running mysql 4.1.8-max-log on a linux box. I have 8GB of RAM, and the database size is 12GB. The maximum number of users through the web browser is 200 at peak hours, and the minimum is 6 at non-peak hours.
A query takes 300 to 700 seconds to run. My biggest table (t_ins) is 4.5GB (8.9 million records, with an index on column pol_sys_id), and most of the queries are on this table (mst_pol) with a join on another table which has 3.9 million records. In table mst_pol, pol_id is the PRIMARY KEY. I have other indexes on this table as well, but it does not use the Primary Index while running the query.
select finyr(pol_rep_dt),finmth(pol_rep_dt), (count(distinct t_ins.pol_id)) from t_ins, mst_pol where mst_pol.pol_type_flag="P" and rec_type="P" and mst_pol.pol_status in("A","E","C","R") and t_ins.pol_apprv_st="A" and t_ins.pol_endno=0 and pol_rep_dt<>"" and t_ins.pol_id=mst_pol.pol_id group by finyr(pol_rep_dt),finmth(pol_rep_dt);

Any Help Please.

on May 11, 2006 06:47 AM
# makro said:


Look at your query!
You reference pretty much everything in your where / group by clause, so the index becomes pretty much useless for the RDBMS.

Read some good books on efficient DB design. UNDERSTAND when your RDBMS will use indices and when it will not.
With queries of this type your DB will never use an index.

practice, practice, practice!
design, develop, test, benchmark!



on July 6, 2006 11:31 AM
# Peufeu said:

If you wanted performance, why did you not use lighttpd talking to php via the fastcgi interface?

I had a site which got about 100 hits/second, 90% being static files, the rest (about 10 hits/s) on PHP pages.

The server is a crap Celeron with 512 MB RAM and a single IDE hard drive. The database is huge (about 150K forum posts); there are about 500 people connected at once hammering the forums, and the server runs a bittorrent tracker getting about 100-150 hits/s.

Using idiotic solutions like apache/mod_php made this server grind; it was slow as hell, with something like 50-100 apache processes filling up the RAM...

So I installed lighttpd, eaccelerator, php5_fastcgi and mysql-5.

The nice thing with lighttpd is that hits on static files are so fast they don't even count, and the number of concurrent connections is nearly unlimited. fastcgi limits the number of concurrent php processes and mysql connections to a reasonable value. It's very scalable.

The server went from very slow and unusable to super snappy. lighttpd uses 20 MBytes RAM and about 3% CPU. The machine is about 50% idle.

I think by using this solution you could get rid of at least half of your servers.

on August 31, 2006 06:50 AM
# till said:

This is a pretty good entry. I just wanted to add that one should "commit" those adjustments and put them into your my.cnf file so they are back in place if you restart the server.

on December 7, 2006 06:43 AM
# pierre said:

hi all. i have an online server that always crashes. it has
one database with 128 tables; it's for university students.
the architecture is composed of a load balancer, 2 API servers, and one server for the database.
the mysql server uses the default configuration.
i also have a problem: the connections on the server are not closed. will that become a big problem or not?
i have 8 GB ram, an smp cpu, linux 2.6 el3, and mysql 4.1.2.
how can i benefit from my hardware?
and could anyone help tell me how to close the connections, where to do it, and when?
with all respect

on December 21, 2006 06:50 AM
# said:

thank you very much.
I'm using this post now. :-)

on January 13, 2010 01:38 PM
# Vikram said:

Somebody please help me,

MySQL server is consuming 100% CPU.
From "show processlist" we observed that 42 queries' status shows as "sorting".
We restarted the mysql server, but this situation happened again after 3 days, so we are looking for a permanent solution.
this is the query:
select * from AlarmCounter where Status = '0' AND Severity=4 ORDER BY Time DESC LIMIT 0, 1000

this is the query we run from our EMS every 10 seconds.

AlarmCounter has 7000 records.
We got the same dump and loaded it on our local machines; running the same query there has even less delay, but the problem never reproduced. Is this query the root cause of this situation? or something else?

/etc/my.cnf is this:

other variables are mysql defaults
we use myisam storage engine

can you suggest any tuning params? if the query running on 7000 records every 10 sec is not an issue, the record count may go up to 100,000.

on February 4, 2010 12:46 AM
# said:

How can i print active threads using global variables in mysql??? if anyone knows, please share with me....
Thank you very much...

on August 4, 2010 03:57 AM