MySQL

The world's most popular open source database

Contact a MySQL Representative


  • MySQL.com
  • Developer Zone
  • Partners & Solutions
  • Customer Login
  • DevZone
  • Downloads
  • Documentation
  • Articles
  • Forums
  • Bugs
  • Forge
  • Blogs
 
  • You are currently browsing the archives for the Uncategorized category.

  • Pages

    • About
    • Find and store the error return value in procedures or functions
  • Archives

    • August 2009
    • July 2009
    • June 2009
    • May 2009
    • April 2009
    • March 2009
    • February 2009
    • January 2009
    • December 2008
    • November 2008
    • October 2008
    • August 2008
    • July 2008
    • June 2008
    • May 2008
    • April 2008
  • Categories

    • MySQL 5.1 Features (3)
    • MySQL 5.4 New Features (2)
    • MySQL 6.0 New Features (5)
    • MySQL 6.x New Features (5)
    • News (8)
    • Personal Opinion (4)
    • Tiny Tweaks (10)
    • Uncategorized (19)



New Features In MySQL 6.x

Archive for the ‘Uncategorized’ Category

Next Entries »

Globalization tasks (Part 1 of 3)

Wednesday, April 1st, 2009

For today and tomorrow and the day after tomorrow, I (Peter Gulutzan) will say what the status is for “Globalization” tasks. These tasks meet Sun requirements for software; you could say that they’re moving forward due to Sun’s acquisition. I’ll give examples where the code is already working; for other cases I’ll just say what’s in the description in the worklog task.

WL#751 Error message construction
Current status: architecture review done
Version = 6.1
We’ll produce error messages taking into account the current character set and the user’s choice of language. Expect changes in errmsg.sys, errmsg.txt, internals documentation, string formatting, and @@character_set_results.

WL#897 Accept SQL statements written with UCS2, UTF16, UTF32
Status: not passed architecture review
Version = 6.x
Currently clients pass messages in UTF8, which can be converted to UCS2 or UTF16 or UTF32 by the server (it’s easy because Unicode character sets have the same repertoire). But let’s let clients pass messages in UCS2 or UTF16 or UTF32 from the start, possibly saving a conversion step.

WL#1287 REAL DIN-1 German collation
Status: not passed architecture review
Version = 6.x
Change handling of Sharp S when following DIN-1 (German dictionary) rules with certain character sets.

WL#1349 Use operating system localization to send it as a default client character set
Status: In code review
Version = 6.1
Thus utilities like ‘mysql’ will, instead of the compiled-in character set, try to figure it out via nl_langinfo, GetLocaleInfo, etc. I opposed this, but our support department thinks it will save them time.

Worklog WL#1820 Variant SJIS and UJIS Japanese Character Sets
Current status: not passed architecture review
Version = 6.1
Don’t be fooled into thinking that SJIS and UJIS always mean the same thing to everyone, there are several different encodings for yen sign, reverse solidus, overline and tilde. The weird but only practical solution is to add even more Japanese character sets, with differences only for those characters.

I’ll continue tomorrow.

Posted in Uncategorized | No Comments »

MySQL User Conference Welcomes Peter Gulutzan

Monday, March 23rd, 2009

It’s now clear that I (Peter Gulutzan) have a few things to do at the MySQL User Conference in Santa Clara California (Silicon Valley) between April 20 and April 23 2009.

Monday 1:30 pm: This isn’t certain, but I’ll try to attend the “Ask a Guru” session at MySQL Camp.

Wednesday April 22, 9:00 am: I will be one of five panelists in “The MySQL Roadmap Shootout”.

Wednesday April 22, 3:50 pm: My Muscovite colleague Konstantin Osipov and I will show off “New foreign keys in 6.1″.

Thursday April 23, 11:55 am: Konstantin and I will be at the MySQL Camp with “Test Drive MySQL 6.1″.

I’d be happy to participate in a Birds of a Feather (Bof) session, if I get the impression there’s lots of interest, in SQL Standards or Character Sets (especially Japanese) or, of course, Performance Schema. Speaking of Performance Schema, I’m hoping that you’ll see a few slides about it in other people’s talks (Kay Roepke’s, Mark Leith’s), and I’ll be able to demo it for a minute during the “Test Drive MySQL 6.1″ talk.

Posted in Uncategorized | 1 Comment »

MySQL Performance Schema (7)

Friday, February 13th, 2009

This is #7 in a series of blog postings about MySQL Performance Schema.

Seven is enough for now, I’ll try to wrap up this series of blog postings by replying to some of the questions and comments that I’ve seen recently.

Q: Will the same instrumentation be in all storage engines?

A: It’s too early to say. We have lots of instruments in Maria and MyISAM, a few in Falcon. A developer for another storage engine has shown interest. But technical issues exist and we’ll have to negotiate.

Q: Is instrumentation available for more than mutex and disk access events?

A: No. WL#2360 mentions other event categories but makes it clear that they are properly part of a subsequent worklog task. We are only guaranteeing that what is in WL#2360 will be available with first public release.

Q: When is first public release?

A: April 20 2009. This is the first day of the MySQL User Conference in Santa Clara (http://mysqluc.com). I regret I wasn’t able to secure a speaking slot, but I’ll be there, ready to attend a Birds of a Feather or talk in the corridors. Other MySQL folks have told me they’ll have things to say in their own talks, and I’ll blog about that when I’m certain.

Q: Shouldn’t we have heard earlier, after four years of work?

A: That’s a misunderstanding that’s probably due to a sentence in my earlier blog “I’m an architect so I did the original specification and most of the wrangling/wheedling/editing over the last four years to keep this target in our sights”. There were many objections to overcome, there was some customer discussion, and in between there were doldrums that could last for months. A speedup happened recently (I credit Sun Microsystems for this but the immediate cause
was that Marc Alff took an initiative). Sometimes this week I’ve felt that full disclosures like this blog series came too early, since of course they’ve stimulated people to express their impatience about waiting 10 more weeks.

Q: Is there something for memory, for locks, etc.?

A: Not now. What we do for WL#2360, and what we can most easily extend, is instrumentation of “events”. A look at memory structures or locks (including deadlocks) is instrumentation of “states”. We are able to show some state information, for example of open files, but sometimes state information is easier to acquire from a storage-engine memory, and for other things it will be easier to do more when we have upgraded our data dictionary, and that’s a separate project. Bottom line: not now, but we recognize that a Performance Schema is more than instrumentation.

Q: Is this open source?

A: Yes, it’s an open source cathedral. People who’ve read “The Tragedy Of The Bazaar” will doubtless remember that cathedrals were the largest community projects of a certain age, and will doubtless reflect that, when you’re trying to build something tall, you don’t succeed by stacking bazaar booths on top of each other.

Q: When will we hear more about this?

A: I’ll post in this blog again when events occur.

Posted in Uncategorized | 5 Comments »

MySQL Performance Schema (6)

Thursday, February 12th, 2009

This is #6 in a series of blog postings about MySQL Performance Schema.

A timer is a mechanism that gives some idea of how long an event takes. A platform is a CPU chip (or multiple chips) and an operating system. Different platforms have different timers available. MySQL searches for the best timers, picks one as the default, and allows users to change the default.

The platform I’m using while writing this is a single-CPU x86 1.6GHz laptop with Linux 2.6.18. I can find out what timers there are by looking at PERFORMANCE_TIMERS.

mysql> SELECT * FROM performance_schema.performance_timers;
+-------------+-----------------+------------------+----------------+
| TIMER_NAME  | TIMER_FREQUENCY | TIMER_RESOLUTION | TIMER_OVERHEAD |
+-------------+-----------------+------------------+----------------+
| CYCLE       |      1596965174 |                1 |              8 |
| NANOSECOND  |      1000000000 |                1 |           1345 |
| MICROSECOND |         1000000 |                1 |           1201 |
| MILLISECOND |             991 |                1 |           1284 |
| TICK        |             108 |                1 |           1091 |
+-------------+-----------------+------------------+----------------+
5 rows in set (0.12 sec)

PERFORMANCE_TIMERS is telling me that a timer named CYCLE has a frequency of 1,596,965,174 — in other words 1.6GHz but the calculation will come up with slightly different numbers each time I do a SELECT, and wildly different numbers if I step down the speed setting to save laptop power. The “resolution” is 1, which means essentially that the frequency is real (that’s not always the case, for example it’s common to see a nanosecond timer that in fact is going up by 1000 nanoseconds at a time). The overhead is 8 cycles, in other words (since there are 1.6 billion cycles per second) not much. Meanwhile the NANOSECOND timer has a frequency of 1 billion, a resolution of 1, and an overhead that’s (1345/8) 168 times greater than the CYCLE counter.

I can find out what timer I’m using by looking at SETUP_TIMERS.

mysql> SELECT * FROM performance_schema.setup_timers;
+------+------------+
| NAME | TIMER_NAME |
+------+------------+
| Wait | CYCLE      |
+------+------------+
1 row in set (0.00 sec)

What this is telling me is that MySQL chose a CYCLE timer. Given the precision and the overhead that’s obvious, eh? Well, I think so for this case, and I spent many late hours writing the tiny assembler snippets that have to work on all the major platforms that MySQL supports. (That was my modest contribution to the code and I depended a lot on examples in vendor manuals, everything else in Performance Schema is the work of Marc Alff with advice from many others.) But every user of Performance Schema must be aware of CYCLE’s quirks. In the next section I’m copying from another worklog task that I wrote some time ago, WL#2373 “Use cycle counter for timing”. The section specifically mentions the assembler instruction RDTSC which is specific for x86s, but one encounters similar considerations on other platforms too.

Bad things about RDTSC

- RDTSC doesn’t “serialize”. That is, if there is out-of-order execution, rdtsc might be processed after an instruction that logically follows it. (We can force serialization, but we won’t bother.) See:
“Q&A: RDTSC to measure performance of small # of FP calculations”
http://softwarecommunity.intel.com/isn/Community/en-US/forums/thread/30226599.aspx
This flaw is unimportant since we are trying to measure events that take much longer times.

- It is possible to set a flag which renders RDTSC inoperative. Somebody responsible for the kernel of the operating system would have to make this decision. For Windows and Linux, there’s no such problem (although CONFIG_X86_TSC_DISABLE exists).

- With a multi-processor arrangement, it’s possible to get the cycle count from one processor in thread X, and the cycle count from another processor in thread Y. They may not always be in synch. Each processor might have a different TSC value. This is especially noted for AMD multi-socket (as opposed to multi-core) systems. See:
“AMD TSC Drift Solutions in Red Hat Enterprise Linux”
http://developer.amd.com/article_print.jsp?id=92
“Future TSC Directions and Solutions”
http://ltt.polymtl.ca/svn/ltt/branches/poly/doc/developer/tsc.txt
“RDTSCP”
http://developer.amd.com/articles.jsp?id=92&num=5
“tsc timer related problems/questions”
http://kerneltrap.org/mailarchive/linux-kernel/2007/9/9/191506
But “Intel systems are normally all synchronized”. See:
“linux/arch/i386/kernel/tsc.c”
http://lxr.linux.no/linux/arch/i386/kernel/tsc.c#L329
Or the operating system may synchronize TSCs. For example “normally, Windows synchronizes the time stamp counters on all processors” (in special circumstances) (not Windows Server). On Linux, though, the apparent tendency is to check for synchronization but not force it. See:
“Measure Code Sections Using The Enhanced Timer”
http://softwarecommunity.intel.com/articles/eng/2589.htm
“x86: unify/rewrite SMP TSC sync code”
http://lwn.net/Articles/211051/
“Hardware Support and Directions for Windows Server”

http://download.microsoft.com/download/0/0/b/00bba048-35e6-4e5b-a3dc-36da83cbb0d1/ServerDirections.docx

Synchronizing may cause a counter to go backwards.

- Converting cycles to elapsed time is only reliable if a CPU always has the same cycle rate. That’s not true on a laptop, which might change speed to save power. And it’s not true with high-performance chips which might gear down if heating becomes dangerous. (Notice that elsewhere I count this as an argument in favour of RDTSC because such computers generally have slow gettimeofday().)
Variability does not exist on some recent processors. See:
“Intel secretly changes the rules”
http://www.x86-secret.com/?option=newsd&nid=846
“TSC and Power Management Events on AMD Processors”
http://lkml.org/lkml/2005/11/4/173
Microsoft describes the flaw as “not common”. See:
“SQL Server timing values may be incorrect when you use utilities or technologies that change CPU frequencies”
http://support.microsoft.com/kb/931279

- The implementor will have to write code for all the processors that MySQL fully supports. I have already done this, read the comments in the attached file rdtsc3.c. But in m attempt to be cautious I left a few gaps in the coverage.

So a cycle counter won’t be a wonderful solution for all timing situations. However, the defects are acceptable for WL#2360.

Change the timer

As I said in an earlier blog posting, the defects of CYCLE are just a cost of doing business for monitoring. Monitors might be on all the time. You just want them to be unobtrusive and in the background. In return, you accept that results can be a bit off always, and way off sometimes. Or, if you don’t accept it, say

UPDATE setup_timers SET timer_name = ‘MICROSECOND’; /* or NANOSECOND or MILLISECOND or TICK */

As long as you have the privileges, and as long as the timer you choose is really operative on the platform you’re on, and as long as you realize that PERFORMANCE_TIMERS tells you some timers are high-overhead timers, you can switch to using a timer that really is based on wall-clock time. Some people might like to do this when switch from “monitoring” to “diagnosis”.

“Picoseconds”

We wanted to show results in the same time unit, regardless of the timer. In an ideal world this time unit would look like a wall-clock unit and be reasonably precise, in other words microseconds. But to convert cycles or nanoseconds to microseconds we would have to do a DIVIDE twice for ever instrumentation. DIVIDE is expensive on many platforms. MULTIPLY is not expensive. Therefore we se MULTIPLY. Therefore the time unit is an integer multiple of the highest possible timer_frequency, using a multiplier that’s big enough to ensure there’s no major precision loss. Therefore the time unit is “picoseconds”. Trillionths of a second. I expect we’ll have a hard time explaining over and over that we’re aware the precision is spurious but, once again, we did it due to overhead. If we find that our decision was impractical in some way, we’ll change it.

Posted in Uncategorized | 1 Comment »

MySQL Performance Schema (5)

Wednesday, February 11th, 2009

This is #5 in a series of blog postings about MySQL Performance Schema.

I mentioned the PERFORMANCE_SCHEMA.SETUP_INSTRUMENTS table in an earlier post. Recently I saw a bug report where I could actually use it in a meaningful way. So today I can talk about SETUP_INSTRUMENTS and with a “realistic” example.

The table has three columns: NAME, ENABLED, TIMED.
mysql> SELECT column_name,column_type FROM information_schema.columns WHERE table_name=’setup_instruments’;

+-------------+------------------+
| column_name | column_type      |
+-------------+------------------+
| NAME        | varchar(128)     |
| ENABLED     | enum('YES','NO') |
| TIMED       | enum('YES','NO') |
+-------------+------------------+
3 rows in set (0.00 sec)

The NAME column corresponds to the NAME column in EVENTS_WAITS_CURRENT.
The ENABLED column is ‘YES’ to enable instrumenting, ‘NO’ to disable.
The TIMED column is ‘YES’ to enable timing, ‘NO’ to disable.
The ENABLED and TIMED columns are updatable by people with UPDATE privileges.

The table has two uses:
* It’s a list of all the instruments, with their current states.
* It’s a way to set “flag” values.

An early idea was that ENABLED and TIMED values would be variables and one would have to say
SET @@instrumentname1_enabled=’YES’,@@instrumentname2_enabled=’NO’, …;
But this just becomes too unwieldy when there are hundreds of instruments. Especially when we want to enable/disable groups of instruments. Also, we want changes to instruments to have immediate effect on all connections, and that doesn’t happen with flag variables that one changes with SET statements.

There are two ways to restrict the Niagara of events that come to EVENTS_WAITS_… tables. “Post-filtering”: This is the use of WHERE clauses when SELECTing, and we recommend it. “Pre-filtering”: This is the use of SET ENABLED=… clauses in SETUP_INSTRUMENTS, and we recommend it too, but only for a limited set of circumstances. (There are other ways to pre-filter, for example to focus only on particular users, but that’s not in WL#2360.)

The times when one might want to consider pre-filtering are:
* The instrumenting overhead is too high or you conclude all instrumenting is useless. As I indicated in a previous post, this doesn’t appear terribly likely to me. But Performance Schema is very dependent on the characteristic of the chip (or chips) and the operating system. Until it’s out in the field we don’t know where this might be necessary. In this case you just don’t turn anything on in the first place, or you turn everything off explicitly thus
UPDATE performance_schema.setup_instruments SET enabled = ‘NO’;
Alternatively, if it turns out that counting events is good but the timer is in the way, turn on all the instruments but without the timers thus
UPDATE performance_schema.setup_instruments SET enabled = ‘YES’, timed = ‘NO’;
* There’s a specific problem to diagnose, probably in single-user mode, and we want to focus on very few instruments. Remember that the size of EVENTS_WAITS_HISTORY_… tables is limited, so some events might disappear because all the other events crowd them out. In such cases post-filtering is not going to be good enough. To enable only one instrument, but disable all others, I prefer to use a single UPDATE statement with a CASE clause:
UPDATE performance_schema.setup_instruments SET enabled = CASE WHEN name LIKE ‘%things we want%’ THEN ‘YES’ ELSE ‘NO’ END;
But others might find it looks clearer to turn everything off, then turn one instrument on.
UPDATE performance_schema.setup_instruments SET enabled = ‘NO’, timed = ‘NO’;
UPDATE performance_schema.setup_instruments SET enabled = ‘YES’, timed = ‘YES’ WHERE name LIKE ‘%things we want%’;

Now let’s look at that “realistic” example. I’m not claiming it’s “useful” or even “welcome”, but it’s real in the sense that we’re looking at a situation that other people found in the field, which we thought we could shed some light on. It’s not a situation that we imagined for some test/demonstration purpose. It’s Bug #42649 “THR_LOCK_charset global mutex abused by InnoDB”. The scenario is: one of our Lithuanian support people noticed (perhaps with DTrace or strace or one of many operating-system-specific tools) that InnoDB is causing several applications of a mutex named THR_LOCK_charset. One of our Brazilian verifiers was able to confirm this with a debugger. We’ve commented: there’s another way to find this out now, in SQL, with SQL result tables, tied to particular statements. We can show with Performance Schema that the number of invocations for the THR_Lock_charset mutex is between 2 and 10 per INSERT, something which we do not see with other storage engines. In our opinion using Performance Schema is less unwieldy and more informative than other methods. Others will say otherwise. Decide for yourself.

This is all we did:

USE test
DELIMITER //
UPDATE performance_schema.setup_instruments
SET enabled = CASE WHEN name LIKE ‘%THR_LOCK_charset’ THEN ‘YES’ ELSE ‘NO’ END//
DROP TABLE IF EXISTS tb//
DROP PROCEDURE IF EXISTS p//
CREATE TABLE `tb` (
`col1` varchar(200) DEFAULT NULL,
KEY `col1` (`col1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8//
CREATE PROCEDURE p ()
BEGIN
DECLARE iteration_number INT DEFAULT 1;
DECLARE last_timer_start BIGINT;
WHILE iteration_number <= 30 DO
SET last_timer_start = (SELECT MAX(timer_end) FROM performance_schema.events_waits_current);
INSERT INTO tb VALUES (REPEAT(’a',200));
SELECT iteration_number, event_id, source, timer_wait
FROM performance_schema.events_waits_history
WHERE timer_start > last_timer_start;
SET iteration_number = iteration_number + 1;
END WHILE;
END//
CALL p()//

Explanation: the above code disables every instrument except the one associated with THR_LOCK_charset, then creates a procedure which says INSERT thirty times. Each time, it displays which iteration it’s on, and the events which have happened since the last INSERT, including the source line number and the elapsed time.

The following is a sample run taken today with SUSE 10 x86-32. The results should be repeatable on other machines but of course the values in the TIMER_WAIT column (which is in “picoseconds”) will vary.

The rest of this blog article is a sample run.

linux:/usr/local/mysql/var # /usr/local/mysql/bin/mysql --user=root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 6.0.10-alpha-perfschema Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> USE test
Database changed
mysql> DELIMITER //
mysql> UPDATE performance_schema.setup_instruments
    -> SET enabled = CASE WHEN name LIKE '%THR_LOCK_charset' THEN 'YES' ELSE 'NO' END//
Query OK, 238 rows affected (0.00 sec)
Rows matched: 239  Changed: 238  Warnings: 0

mysql> DROP TABLE IF EXISTS tb//
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> DROP PROCEDURE IF EXISTS p//
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> CREATE TABLE `tb` (
    ->   `col1` varchar(200) DEFAULT NULL,
    ->   KEY `col1` (`col1`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8//
Query OK, 0 rows affected, 1 warning (0.09 sec)

mysql> CREATE PROCEDURE p ()
    -> BEGIN
    ->   DECLARE iteration_number INT DEFAULT 1;
    ->   DECLARE last_timer_start BIGINT;
    ->   WHILE iteration_number <= 30 DO
    ->     SET last_timer_start = (SELECT MAX(timer_end) FROM performance_schema.events_waits_current);
    ->     INSERT INTO tb VALUES (REPEAT('a',200));
    ->     SELECT iteration_number, event_id, source, timer_wait
    ->     FROM performance_schema.events_waits_history
    ->     WHERE timer_start > last_timer_start;
    ->     SET iteration_number = iteration_number + 1;
    ->     END WHILE;
    ->   END//
Query OK, 0 rows affected (0.00 sec)

mysql> CALL p()//
+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|                1 |      131 | charset.c:529 |      44446 |
|                1 |      132 | charset.c:529 |      44446 |
|                1 |      133 | charset.c:529 |      44446 |
|                1 |      134 | charset.c:529 |      44446 |
|                1 |      135 | charset.c:529 |      44446 |
|                1 |      136 | charset.c:529 |      44446 |
|                1 |      137 | charset.c:529 |      44446 |
|                1 |      138 | charset.c:529 |      44446 |
|                1 |      139 | charset.c:529 |      44446 |
|                1 |      140 | charset.c:529 |      44446 |
+------------------+----------+---------------+------------+
10 rows in set (0.00 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|                2 |      141 | charset.c:529 |     209084 |
|                2 |      142 | charset.c:529 |      74494 |
+------------------+----------+---------------+------------+
2 rows in set (0.02 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|                3 |      143 | charset.c:529 |     826946 |
|                3 |      144 | charset.c:529 |      68234 |
|                3 |      145 | charset.c:529 |      55088 |
+------------------+----------+---------------+------------+
3 rows in set (0.03 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|                4 |      146 | charset.c:529 |     293594 |
|                4 |      147 | charset.c:529 |      68234 |
|                4 |      148 | charset.c:529 |      55088 |
|                4 |      149 | charset.c:529 |      58218 |
+------------------+----------+---------------+------------+
4 rows in set (0.05 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|                5 |      151 | charset.c:529 |      75746 |
|                5 |      152 | charset.c:529 |      55088 |
|                5 |      153 | charset.c:529 |      55088 |
|                5 |      154 | charset.c:529 |      55088 |
|                5 |      150 | charset.c:529 |     632886 |
+------------------+----------+---------------+------------+
5 rows in set (0.06 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|                6 |      155 | charset.c:529 |     267302 |
|                6 |      156 | charset.c:529 |      85762 |
+------------------+----------+---------------+------------+
2 rows in set (0.08 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|                7 |      157 | charset.c:529 |     639146 |
|                7 |      158 | charset.c:529 |      68234 |
+------------------+----------+---------------+------------+
2 rows in set (0.09 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|                8 |      159 | charset.c:529 |     223482 |
|                8 |      160 | charset.c:529 |      83258 |
+------------------+----------+---------------+------------+
2 rows in set (0.11 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|                9 |      161 | charset.c:529 |     701746 |
|                9 |      162 | charset.c:529 |      74494 |
+------------------+----------+---------------+------------+
2 rows in set (0.12 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               10 |      163 | charset.c:529 |     283578 |
|               10 |      164 | charset.c:529 |      85762 |
+------------------+----------+---------------+------------+
2 rows in set (0.14 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               11 |      165 | charset.c:529 |     595326 |
|               11 |      166 | charset.c:529 |      70738 |
+------------------+----------+---------------+------------+
2 rows in set (0.15 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               12 |      167 | charset.c:529 |     607846 |
|               12 |      168 | charset.c:529 |      68234 |
+------------------+----------+---------------+------------+
2 rows in set (0.16 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               13 |      169 | charset.c:529 |     262294 |
|               13 |      170 | charset.c:529 |      85136 |
+------------------+----------+---------------+------------+
2 rows in set (0.18 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               14 |      171 | charset.c:529 |     551506 |
|               14 |      172 | charset.c:529 |      74494 |
+------------------+----------+---------------+------------+
2 rows in set (0.19 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               15 |      173 | charset.c:529 |     443208 |
|               15 |      174 | charset.c:529 |      86388 |
+------------------+----------+---------------+------------+
2 rows in set (0.20 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               16 |      175 | charset.c:529 |     639146 |
|               16 |      176 | charset.c:529 |      68234 |
+------------------+----------+---------------+------------+
2 rows in set (0.22 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               17 |      177 | charset.c:529 |     405022 |
|               17 |      178 | charset.c:529 |      68234 |
+------------------+----------+---------------+------------+
2 rows in set (0.23 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               18 |      179 | charset.c:529 |     526466 |
|               18 |      180 | charset.c:529 |      68234 |
+------------------+----------+---------------+------------+
2 rows in set (0.25 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               19 |      321 | charset.c:529 |      55088 |
|               19 |      322 | charset.c:529 |      55088 |
|               19 |      323 | charset.c:529 |      55088 |
|               19 |      324 | charset.c:529 |      55088 |
|               19 |      325 | charset.c:529 |      55088 |
|               19 |      326 | charset.c:529 |      55088 |
|               19 |      317 | charset.c:529 |      55088 |
|               19 |      318 | charset.c:529 |      55088 |
|               19 |      319 | charset.c:529 |      55088 |
|               19 |      320 | charset.c:529 |      55088 |
+------------------+----------+---------------+------------+
10 rows in set (0.26 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               20 |      327 | charset.c:529 |     411282 |
|               20 |      328 | charset.c:529 |      83258 |
+------------------+----------+---------------+------------+
2 rows in set (0.28 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               21 |      329 | charset.c:529 |     614106 |
|               21 |      330 | charset.c:529 |      74494 |
+------------------+----------+---------------+------------+
2 rows in set (0.29 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               22 |      331 | charset.c:529 |     339292 |
|               22 |      332 | charset.c:529 |      84510 |
+------------------+----------+---------------+------------+
2 rows in set (0.31 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               23 |      333 | charset.c:529 |     589066 |
|               23 |      334 | charset.c:529 |      68234 |
+------------------+----------+---------------+------------+
2 rows in set (0.32 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               24 |      335 | charset.c:529 |     197190 |
|               24 |      336 | charset.c:529 |      83258 |
+------------------+----------+---------------+------------+
2 rows in set (0.34 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               25 |      337 | charset.c:529 |     833206 |
|               25 |      338 | charset.c:529 |      72616 |
+------------------+----------+---------------+------------+
2 rows in set (0.35 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               26 |      339 | charset.c:529 |     632886 |
|               26 |      340 | charset.c:529 |      68234 |
+------------------+----------+---------------+------------+
2 rows in set (0.36 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               27 |      341 | charset.c:529 |     313626 |
|               27 |      342 | charset.c:529 |      68234 |
+------------------+----------+---------------+------------+
2 rows in set (0.38 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               28 |      343 | charset.c:529 |     589066 |
|               28 |      344 | charset.c:529 |      68234 |
+------------------+----------+---------------+------------+
2 rows in set (0.39 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               29 |      345 | charset.c:529 |     310496 |
|               29 |      346 | charset.c:529 |      85762 |
+------------------+----------+---------------+------------+
2 rows in set (0.40 sec)

+------------------+----------+---------------+------------+
| iteration_number | event_id | source        | timer_wait |
+------------------+----------+---------------+------------+
|               30 |      347 | charset.c:529 |     789386 |
|               30 |      348 | charset.c:529 |      68234 |
+------------------+----------+---------------+------------+
2 rows in set (0.42 sec)

Query OK, 0 rows affected (0.42 sec)

Posted in Uncategorized | No Comments »

MySQL Performance Schema (4)

Tuesday, February 10th, 2009

This is #4 in a series of blog postings about MySQL Performance Schema.

What does the new PERFORMANCE SCHEMA cost?

In my last three postings, I’ve shown shown lots of monitoring / diagnostic new features, and made it obvious there’s a lot more. At this point any wise viewer will wonder whether such a marvel can really be free. Well, it is, in the two senses of the word: “free like GPL” and “free like borshch”.

The “free like GPL” sense is easy to explain. Anybody can download the source as soon as we put it in the public trees, i.e. “soon”. Naturally Enterprise people can expect far more besides the server code (more about that later), but everybody gets the server code.

The “free like borshch” phrase evokes an East European beet soup (see http://en.wikipedia.org/wiki/Borscht) which is cheap enough that a cook can just throw it in with the rest of the dinner for no visible extra charge. The overhead of Performance Schema is low according to some early tests.
Disclaimer #1: these are not in-the-field tests, nothing is in the field.
Disclaimer #2: the feature is still being expanded and fixed constantly.
Disclaimer #3: I’m biased and my current role is as a tout.
Nevertheless I think people will wonder so I spew what I’ve swallowed so far.

It seems it should be way way less than 1 microsecond per instrumentation. The instrumenting code itself is currently taking way less than 1000 cycles, which was our initial goal — if it had taken more, we would have abandoned most of the summaries. This measurement is outside the server in loops, which means it evades caching or contention effects. So I call it “theoretical”. I can’t provide something more accurate, because the measurer doesn’t measure itself. As the old Latin phrase goes: “Quis custodiet ipsos custodes?” Who will watch the watchers? Nobody.

There are no mutexes.
Well, I should say for full disclosure that at time of writing there’s a single mutex having to do with one of the file-io summary counters, but we regard that as a bug, it will disappear. This means that wrong results will appear often enough that everyone will notice, and we’ll say “we regret that but that’s a cost of doing business in a low-overhead world”.

The default timer is a cycle counter.
It’s possible for a user to override the default and ask for timings in (say) microseconds instead of cycles, thus avoiding all the obscure gotchas of using a cycle counter (which I’ll talk about in excruciating detail in a later blog post). But users who do that will notice severe performance degradation over the long term.
When they complain, we’ll say “”we regret that but that’s a cost of doing business in a low-overhead world”.

There is no disk io.
All output is to memory. We have considered using a trace or log file (see the “Trace” section in the WL#2360 description), and I recall well that one of the early objections to WL#2360 was that it gave low priority to tracing. We’ll see whether that becomes an issue in the future.

My own comparison of a non-perf with a perf.
On a 1,000,000-row MyISAM table, to create an index, I need 5.62 seconds using the main-tree code which doesn’t support performance schema yet, I need 5.19 seconds using the performance-schema-tree code with every instrument turned on. Whoops. Something’s wrong with the test, eh? Oh well, it doesn’t show things getting worse.

Kelly Long and DBT2.
Kelly’s another person who deserves an honourable mention for services to the Performance Schema cause. He’s done a lot of testing and performance-improvement work for Sun/MySQL out of his Colorado labs. One item that he did back in December 2008 looked like this:

DBT2 test by Kelly Long December 1 2008
                8      16      32      64     128     192
            -----   -----   -----   -----   -----   -----
IN          15096   18049   21259   20494   19886   19293
OUT         15680   21277   22183   21025   20474   19174
Overhead     3.72%  15.17%   4.17%   2.53%   2.87%  -0.62%

From sourceforge.net: “This Database Test Suite [DBT2] aims to create database workload test kits used to simulate heavy user loads for OLTP, Decision Support, and e-commerce database transactions.” Kelly did the test with AMD 8-way Dual Core. Looking at the column labelled ‘8′: With 8 simultaneous connections, with performance schema compiled in and all instruments on, MySQL was doing 15096 transactions per second. With performance schema not compiled in, MySQL was doing 15680 transactions per second. Calculating 100*(1-(15096/15680)) gives 3.72 which is the percentage overhead. It goes up and down as one increases the number of connections. Bizarrely, one of the measurements shows MySQL getting faster with Performance Schema in. Although effects like that will disappear, this looks okay so far. Some community contacts said they’d be happy with 20% if they could get complete statistics.

So we think we’ll be able to recommend to many/most users: turn it on most of the time. By default all performance schema instrumentation will be DISABLED, because it’s dangerous to make it a default when it’s new, we expect to encounter embarrassments. However, we don’t think performance considerations alone would justify turning Performance Schema off.

Posted in Uncategorized | 2 Comments »

MySQL Performance Schema (3)

Monday, February 9th, 2009

This is #3 in a series of blog postings about MySQL Performance Schema.

Today when I list the tables in PERFORMANCE_SCHEMA I see:

mysql> select table_name from information_schema.tables where table_schema = 'performance_schema';
+----------------------------------------------+
| table_name                                   |
+----------------------------------------------+
| cond_instances                               |
| events_waits_current                         |
| events_waits_history                         |
| events_waits_history_long                    |
| events_waits_summary_by_event_name           |
| events_waits_summary_by_instance             |
| events_waits_summary_by_thread_by_event_name |
| file_instances                               |
| file_summary_by_event_name                   |
| file_summary_by_instance                     |
| mutex_instances                              |
| performance_timers                           |
| processlist                                  |
| rwlock_instances                             |
| setup_actors                                 |
| setup_consumers                              |
| setup_instruments                            |
| setup_timers                                 |
+----------------------------------------------+
18 rows in set (0.01 sec)

The table count fluctuates. As WL#2360 says, we’ll have a hundred ‘fore we’re through. The ones that I see today are a good representation since the rest will all be variations on the same theme. Besides the foundational table EVENTS_WAITS_CURRENT which I illustrated in my last post, there are four main types: *_history_*, *_instances, *_summary_*, *setup_*.

The *_history_* tables EVENTS_WAITS_HISTORY and EVENTS_WAITS_HISTORY_LONG.
These have the same structure as EVENTS_WAITS_CURRENT but more rows.
EVENTS_WAITS_CURRENT has “the last event for each thread”.
EVENTS_WAITS_HISTORY has “the last 30 events”.
EVENTS_WAITS_HISTORY_LONG has “the last 10000 events”.
Numbers such as ‘30′ and ‘10000′ are arbitrary and fixed at compile time; experience will tell us whether we need to adjust them.
Example selection:

mysql> SELECT DISTINCT thread_id, event_name FROM events_waits_history;
+-----------+----------------------------------------------+
| thread_id | event_name                                   |
+-----------+----------------------------------------------+
|         2 | wait/synch/mutex/mysys/THR_LOCK_malloc       |
|         2 | wait/synch/mutex/maria/PAGECACHE::cache_lock |
|         0 | wait/synch/mutex/mysys/THR_LOCK_malloc       |
|         0 | wait/synch/mutex/sql/LOCK_thread_count       |
|         1 | wait/synch/mutex/mysys/THR_LOCK_malloc       |
|         1 | wait/synch/mutex/sql/hash_filo::lock         |
|         1 | wait/io/file/sql/pid                         |
|         1 | wait/synch/mutex/mysys/THR_LOCK_open         |
|         1 | wait/synch/mutex/sql/LOCK_thread_count       |
|         1 | wait/synch/mutex/mysys/LOCK_alarm            |
+-----------+----------------------------------------------+
10 rows in set (0.28 sec)

The *_instances_* tables COND_INSTANCES FILE_INSTANCES MUTEX_INSTANCES RWLOCK_INSTANCES.
These are (relatively) static listings of event names and explanatory notes or statuses. They’re all tentative. I think we’ll present the information a different way eventually, but we’ll certainly have to present the information.
Example selection:

mysql> SELECT * FROM file_instances WHERE event_name LIKE '%maria%' LIMIT 5\G
*************************** 1. row ***************************
 FILE_NAME: ./performance_schema/#sql2-18b4-1.MAD
EVENT_NAME: wait/io/file/maria/data_file
OPEN_COUNT: 0
*************************** 2. row ***************************
 FILE_NAME: /tmp/#sql_18b4_0.MAI
EVENT_NAME: wait/io/file/maria/key_file
OPEN_COUNT: 0
*************************** 3. row ***************************
 FILE_NAME: ./performance_schema/#sql-18b4_1.MAD
EVENT_NAME: wait/io/file/maria/data_file
OPEN_COUNT: 0
*************************** 4. row ***************************
 FILE_NAME: /usr/local/mysql/var/maria_log_control
EVENT_NAME: wait/io/file/maria/Control
OPEN_COUNT: 1
*************************** 5. row ***************************
 FILE_NAME: ./mysql/time_zone_name.MYD
EVENT_NAME: wait/io/file/maria/data_file
OPEN_COUNT: 1
5 rows in set (0.00 sec)

The *_summary_* tables EVENTS_WAITS_SUMMARY_BY_EVENT_NAME, EVENTS_WAITS_SUMMARY_BY_EVENT_NAME, EVENTS_WAITS_SUMMARY_BY_INSTANCE, EVENTS_WAITS_SUMMARY_BY_THREAD_BY_EVENT_NAME, FILE_SUMMARY_BY_EVENT_NAME, FILE_SUMMARY_BY_INSTANCE.
Imagine there was an INFINITE history of all events since the server start, and imagine it was easy to do groupings or aggregations on it:
SELECT SUM|AVG|COUNT|MIN|MAX(column_name) FROM EVENT_WAITS_CURRENT_HISTORY_INFINITE GROUP BY column_name(s);
Now come back to earth. An infinite history table would be bigger than the real database. We’ve got to limit the *_history_* tables to 30 or 10000 rows.
However, we can make summarized data before we throw details away, without bloating. The *_summary_* tables are all, in effect, predefined permanent SELECT agg(),… GROUP BY … tables derived from EVENTS_WAITS_CURRENT_HISTORY_INFINITE.
Example selection:

mysql> SELECT * FROM events_waits_summary_by_thread_by_event_name ORDER BY sum_timer_wait DESC LIMIT 3\G
*************************** 1. row ***************************
     THREAD_ID: 2
    EVENT_NAME: wait/synch/mutex/mysys/THR_LOCK_malloc
    COUNT_WAIT: 110072
SUM_TIMER_WAIT: 71041916484
MIN_TIMER_WAIT: 425106
AVG_TIMER_WAIT: 645413
MAX_TIMER_WAIT: 2496981102
*************************** 2. row ***************************
     THREAD_ID: 2
    EVENT_NAME: wait/io/file/sql/FRM
    COUNT_WAIT: 322
SUM_TIMER_WAIT: 33564274067
MIN_TIMER_WAIT: 598785
AVG_TIMER_WAIT: 104236875
MAX_TIMER_WAIT: 4181863653
*************************** 3. row ***************************
     THREAD_ID: 2
    EVENT_NAME: wait/synch/mutex/maria/PAGECACHE::cache_lock
    COUNT_WAIT: 20197
SUM_TIMER_WAIT: 15792216396
MIN_TIMER_WAIT: 554895
AVG_TIMER_WAIT: 781909
MAX_TIMER_WAIT: 727091772
3 rows in set (0.01 sec)

The *setup_* tables SETUP_ACTORS, SETUP_CONSUMERS, SETUP_INSTRUMENTS, SETUP_TIMERS.
These are the tables that people will use to enable, disable, or adjust instrumentation. Unlike all the other PERFORMANCE_SCHEMA tables (which are strictly read-only), these tables have updatable columns which privileged users may change. Primarily they’re for “filtering in advance” so that specific actors or consumers or instruments are disabled, which eliminates clutter. The SETUP_TIMERS table is slightly different, it’s for choosing between the CYCLE, NANOSECOND, MICROSECOND, MILLISECOND, or TICK timers.
Example selection:

mysql> SELECT * FROM setup_instruments;
+----------------------------------------------------------+---------+-------+
| NAME                                                     | ENABLED | TIMED |
+----------------------------------------------------------+---------+-------+
| wait/synch/mutex/sql/LOCK_des_key_file                   | YES     | YES   |
| wait/synch/mutex/sql/PAGE::lock                          | YES     | YES   |
| wait/synch/mutex/sql/LOCK_sync                           | YES     | YES   |
| wait/synch/mutex/sql/LOCK_active                         | YES     | YES   |
| wait/synch/mutex/sql/LOCK_pool                           | YES     | YES   |
...
| wait/io/file/maria/index_tmp                             | YES     | YES   |
| wait/io/file/maria/iext                                  | YES     | YES   |
| wait/io/file/maria/LogDescriptorDirectory                | YES     | YES   |
| wait/io/file/maria/recovery_trace                        | YES     | YES   |
+----------------------------------------------------------+---------+-------+
239 rows in set (0.00 sec)

So by now it’s obvious what’s the benefit.
What’s the cost?
More in my next blog posting.

Posted in Uncategorized | No Comments »

MySQL Performance Schema (2)

Friday, February 6th, 2009

This is #2 in a series of blog postings about MySQL Performance Schema.

My colleague Mark Leith has shown some examples of output from two PERFORMANCE_SCHEMA summary tables on his blog http://www.markleith.co.uk/?p=112. I know we’ll be hearing more from him and I notice he’ll be giving a talk at our User Conference in April, so watch that space.

The specification of WL#2360 Performance Schema is now visible on our forge site http://forge.mysql.com/worklog/task.php?id=2360. I recommend you go there, click on “High Level Architecture”, read it, then come back to here. It should take you a while because this is the second-largest MySQL worklog-task specification since time began (the largest was WL#148 for foreign keys, http://forge.mysql.com/worklog/task.php?id=148). I’m an architect so I did the original specification and most of the wrangling/wheedling/editing over the last four years to keep this target in our sights. My colleague Marc Alff did almost all the serious coding, starting about nine months ago, after he did all the serious coding for the new SIGNAL feature. People deserving honourable mention include the architecture reviewer Mikael Ronstrom, principal engineer Sergei Golubchik, and early adopters like Mark Leith. I’ll talk in a later post about another “early adopter” group which will soon announce its plans based on Performance Schema.

Absorbed the specification? Then let’s move to looking at an unretouched photo of the result on my x86 Linux laptop. My illustration is the first SELECT that I did on the fundamental PERFORMANCE_SCHEMA table after starting a new server and client.
<pre>
mysql> SELECT * FROM performance_schema.events_waits_current LIMIT 1\G
*************************** 1. row ***************************
THREAD_ID: 2
EVENT_ID: 144
EVENT_NAME: wait/synch/mutex/mysys/THR_LOCK_malloc
SOURCE: safemalloc.c:183
TIMER_START: 40046326080026
TIMER_END: 40046326588338
TIMER_WAIT: 508312
SPINS: NULL
OBJECT_SCHEMA: NULL
OBJECT_NAME: NULL
OBJECT_TYPE: NULL
OBJECT_INSTANCE_BEGIN: 0×96aa620
NESTING_EVENT_ID: NULL
OPERATION: LOCK
NUMBER_OF_BYTES: NULL
1 row in set (0.00 sec)
</pre>
We make a new row every time we see an event. This is a monitoring “detail”. That’s the vital building block that distinguishes MySQL’s method. As any good
book about performance monitoring will tell you: you can get aggregates out of details, but you can’t get details out of aggregates. So voila, details.

The THREAD_ID value in this row doesn’t (yet) correspond to a value from SHOW PROCESSLIST, although that’s also available (it just happens to be  after the first row in
EVENTS_WAITS_CURRENT). We look at all threads, showing background jobs or delayed statements, so expect more than one-to-one correspondence here.

The EVENT_ID value is guaranteed to be unique, at least when taken in combination with THREAD_ID. Hidden meanings can be gleaned, but the essential purpose is to identify the row in sequence.

The EVENT_NAME value tells us that the most recent event for thread 2 was a “wait” i.e. something that might cause the application to wait, for a “synchronization object”, in fact a “mutex”, which is related to MySQL’s “mysys” code, and to be exact is “THR_LOCK_malloc”. (This happens to be an event that only occurs in debug mode.) Different mutexes have different names. Names show a hierarchy, which makes it easy to search for the most specific or the least specific, using WHERE with LIKE. We’re instrumenting both Mutexes and File IO. Mutex acquisitions tend to be more frequent than File IO.

The SOURCE value is where the code for the instrument is in MySQL’s source code. If we looked at line 183 in mysys/safemalloc.c (which we could, because this source is freely available), we’d see:
”
MYSQL_MUTEX_LOCK(&THR_LOCK_malloc);
”
Yes, that’s all there is to see. The instrumentation is technically not part of the module’s algorithms, it has no effect on what the routine is doing, and in fact there’s no change to the code here in safemalloc.c. We’ve just replaced the original code “pthread_mutex_lock(…)” with “MYSQL_MUTEX_LOCK”, which is a macro,
and the macro definition in include/mysql/mysql_mutex.h says MYSQL_MUTEX_LOCK is pthread_mutex_lock if you choose to build without performance schema. In that case, we’ve merely renamed. But if you choose to build with performance schema, the macro code in MySQL ’s include file effectively becomes
[ macro which causes "timer start" ]
actual mutex-locking instrumented code
[ macro which causes "timer end" calculation" ]
Those lines are the “instrumentation”.

TIMER_START, TIMER_END, and TIMER_WAIT show when the timer started, when it ended, and the difference (end minus start) which is how long a time elapsed between the instrumentation start and end. The TIMER_WAIT column tells us that the event took 508312 “picoseconds”. I put “picoseconds” inside quote marks because I’m avoiding any pretense that measurement can be so precise, or even that there’s always a relation to seconds on a real-world clock. The timer’s virtues are (a) very low overhead (b) good timings most of the time. But if anyone wants to make a view that divides TIMER_WAIT by a trillion to get what could be approximately the time in seconds (0.00005), that’s well and good.

The OPERATION is “LOCK”. Not a database-row lock in this case, but a “LOCK” because we’re acquiring a synchronization object.

Most of the other columns will be NULL for mutexes most of the time.

This is one of millions of rows. In fact it could be one of billions — one of the bugs that Marc fixed recently, which was due to my error, was too-small counter which overflowed after a session was up for a few days. These rows have a short life, we have to archive and destroy to save space. So most events pop into and out of existence without anybody noticing or caring.

But what if …

? The DBMS is “stuck”? Well, in that case the last event that has a timer start but not a timer end, and an ever-growing timer_wait value, is what’s it’s stuck on.

? You want to see whenever some job causes a particular event? Well, in that case, you can filter so that only that event appears, and there’s an in-memory history.

? You want to see aggregates (sums or counts or minima or maxima) of, for example, the number of times that thread 1 has accessed file tablespace_1, or the number of “picoseconds” that have been spent on reads of tablespace_1 by all threads?

Done. And I don’t mean “it can be done” or “all you need to do is write something”. I mean the information is (to coin a phrase) at your fingertips, always there. More about that on my next workday.

Posted in Uncategorized | 2 Comments »

MySQL Performance Schema

Thursday, February 5th, 2009

Today we’re taking the wraps off the best MySQL feature of 2009:
Performance Schema. It monitors multitudinous low-level server
events and provides them in tables inside a new “database”
called, surprisingly, PERFORMANCE_SCHEMA. So far …

* It instruments mutex and disk io calls in the server and
in one of the storage engines.
* It times precisely (to the nearest cycle or nanosecond or
microsecond or tick depending on user choice).
* It has tables for “current event” by thread
(what the job is waiting for if it’s blocked).
* It has tables for “history of events” by thread
(what were the last N mutex or file io activities, and where).
* It has summaries by thread
(how many fractoseconds did this job spend on io since it started).
* It has summaries by object
(how many times has somebody read from file X).

On any operating system, with really small overhead, continuously.

There’s a German saying “Erst Eier legen, dann gackern,
nicht umgekehrt” (first lay eggs, then cackle, not the other way around).
So we’ve been hesitant about publicizing something that’s not
publicly available yet. But it’s working, internal tests show
few significant difficulties, so today I got the order to remove
the “Private=Yes” field value for the worklog task. That means
you’ll soon see the specification on forge.mysql.com (it’s WL#2360).

I’ll be blogging more about this each day for the next several days.

Posted in Uncategorized | 3 Comments »

Next Entries »

New Features In MySQL 6.x is proudly powered by WordPress MU running on Blogs.mysql.com.
Entries (RSS) and Comments (RSS).