MySQL

The world's most popular open source database

Contact a MySQL Representative


  • MySQL.com
  • Developer Zone
  • Partners & Solutions
  • Customer Login
  • DevZone
  • Downloads
  • Documentation
  • Articles
  • Forums
  • Bugs
  • Forge
  • Blogs
 
  • Pages

    • About
  • Archives

    • December 2008
    • November 2008
    • October 2008
    • August 2008
    • July 2008
    • June 2008
    • May 2008
    • April 2008
  • Categories

    • MySQL 5.1 Features (2)
    • MySQL 6.0 New Features (4)
    • MySQL 6.x New Features (1)
    • News (6)
    • Personal Opinion (4)
    • Tiny Tweaks (10)



New Features In MySQL 6.x

MySQL and the leap second on December 31 2008

December 1st, 2008

“Leap seconds” are starting to cause questions on our forums and in some bug reports because ‘2008-12-31 23:59:60′ will be the first leap second in the last 3 years. We have a short answer and a long answer.

The short answer is: it will only cause a problem if you
(
(
(a) arrange so your operating system returns the leap-second value for MySQL to see,
or
(b) load the appropriate ‘leap second’ tables as described in our manual,
)
and
(c) do something which uses the NOW() or CURRENT_TIMESTAMP at precisely midnight UTC on December 31 (remember your local time is not UTC time!).
)
or
use CONVERT_TZ() or FROM_UNIXTIME().

Even if those unlikely things occur, you can kludge the results next day. So don’t worry. My colleague Dmitri Lenev points out: there are leap -second geeks who are configuring their systems to have non-posix unix time/zoneinfo with leap seconds included and only they are affected.

The long answer, for the people who insist on worrying anyway or who are just curious about leap seconds in SQL and/or MySQL, is the rest of this article.

GENERAL INFORMATION

If you haven’t heard about leap seconds before, try these articles first:

http://www.ucolick.org/~sla/leapsecs/HTMLutcdoomed.html
(My favourite line from this is: the rotating earth is decelerating at a rate of 1.7 milliseconds per day per century, but not in a regular way.

http://www.airspacemag.com/flight-today/One-More-Second.html
(My favourite line from this is: leap seconds are affected by “thinning of glaciers caused by global warming”, among other things.)

http://kbase.redhat.com/faq/docs/DOC-15153
(My favourite line from this is: Red Hat handles leap seconds by making ‘23:59:59′ twice as long, which is inspiring. But I don’t have information about every Linux installation. Ask your OS vendor.)

My colleague Magne Mæhre points out: ” mostly it’s due to a sloppy estimation of the natural frequency of the caesium atom. When the SI second was introduced, it was defined to be the duration of 9 192 631 770 cycles of the radiation (…) of the Cs-133 atom. The problem is that this has nothing to do with the rotation of the earth :) If the second had been defined as 9 192 631 997 periods instead, we would only have had three leap seconds (2 positive and 1 negative) since 1972, instead of the 22 (all positive) we’ve had.”

LOOK AT YOUR MYSQL INSTALLATION

For some platforms, MySQL supplies “Time zone description tables” http://dev.mysql.com/downloads/timezones.html

If you didn’t upload a time-zone and leap-second table, then you’re looking at a case for which there’s no reported problem.

To find out what’s in your leap-second table, say:
select transition_time,from_unixtime(transition_time),correction
from mysql.time_zone_leap_second;
If the result looks like this:

+-----------------+--------------------------------+------------+
| transition_time | from_unixtime(transition_time) | correction |
+-----------------+--------------------------------+------------+
|        78796800 | 1972-06-30 18:00:00            |          1 |
|        94694401 | 1972-12-31 17:00:01            |          2 |
...
|       867715220 | 1997-06-30 18:00:20            |         21 |
|       915148821 | 1998-12-31 17:00:21            |         22 |
|      1136073622 | 2005-12-31 17:00:22            |         23 |
+-----------------+--------------------------------+------------+
23 rows in set (0.00 sec)

then your table has not been updated for the leap second.

SUPPORT OF LEAP SECONDS BY OTHER DBMS VENDORS

Oracle won’t accept a timestamp literal with a leap second. I tried
insert into t values (timestamp ‘2005-12-31 23:59:60′);
The result was “ORA-1852: seconds must be between 0 and 59″. That’s not an abnormal limitation. The SQL Server 2008 manual says:
“Using hour 24 to represent midnight and leap second over 59 as defined by ISO 8601 (5.3.2 and 5.3) are not supported to be backward compatible and consistent with the existing date and time types. They are not defined by SQL standard 2003.”
http://msdn.microsoft.com/en-us/library/bb677243.aspx
I only found one vendor document which clearly claims support for leap seconds: for Informix:
http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.docnotes.doc/xc8w4/ids_sqls_docnotes_10.0.html

STANDARD SQL

Microsoft isn’t telling the whole story when it says “not defined by SQL standard 2003″. The SQL standard doesn’t define, but it allows:
“… sometimes, but very rarely, a particular minute will contain exactly 59, 61, or 62 seconds. Whether an SQL-implementation supports leap seconds, and the consequences of such support for date and interval arithmetic, is implementation-defined.”
In another place, it says that valid values for the SECOND field in a datetime are between 00 and 61.9. It’s unfortunately true that when talking about casting to INTERVAL targets the standard says
“according to the natural rules for intervals as defined in the Gregorian calendar (that is, there are 60 seconds in a minute”.
Well, Pope Gregory XIII would be surprised to hear that he said that, but that’s only about normalizing interval literals, we can ignore it.

TIME ZONE AGNOSTICISM

According to Wikipedia. Leap seconds occur “simultaneously worldwide; for example, the leap second on December 31, 2005 occurred at 23:59:60 UTC. This was 6:59:60 p.m. U.S. Eastern Standard Time and 0:59:60 a.m. on January 1, 2006 Central European Time.” Therefore MySQL might not always say that ‘22:59:60′ is invalid.
But these things are still invalid:
* (datetime/timestamp < ‘2009-05-31′
and (month,day) not between ‘12-31 11:00:00′ and 01-01 13:00′)
and (month,day) not between ‘05-31 11:00:00′ and 06-01 13:00′))
or datetime/timestamp < ‘1971-12-31′
* datetime/timestamp < ‘2009-05-31′
and second > 60 /* because there has never been a 2-second jump */
* MINUTES not in (44,59,14,29)
and second > 59 /* because zone offsets like ‘12 minutes’ don’t occur */
* datetime/timestamp in UTC doesn’t end in 23:59:60′
and second > 59 /* if user actually specifies a time zone */
None of the above checks depend on actual access to time zone tables. Therefore, getting invalid data in is hard, except via NOW() or CURRENT_TIMESTAMP at the aforementioned moment.

INTERVAL ARITHMETIC

Currently I get this:

mysql> select timestampdiff(second,'2009-01-01 00:00:00','2008-12-31 23:59:59');
+-------------------------------------------------------------------+
| timestampdiff(second,'2009-01-01 00:00:00','2008-12-31 23:59:59') |
+-------------------------------------------------------------------+
|                                                                -1 |
+-------------------------------------------------------------------+
1 row in set (0.00 sec)

Some might say that it should be -2, or at least Bug#40610 “Date and time functions like DATE_SUB(), DATE_ADD() are not leap second enabled.” said o, because -1 is “not what people expect”.

I disagree for these reasons:
* I’m not so sure “what people expect”.
* The effect will be: if you have an up-to-date leap seconds table,
you’ll get -2, but otherwise you’ll get -1. This could hurt replication
more than it helps.

Therefore I claimed that we should dismiss bug#40610 as an unreasonable feature request, ignore leap seconds in all interval calculations, and say that ‘2008-12-31 23:59:60′ = ‘2008-12-31 23:59:59′. It turned out there’s an even easier solution, though.

MAKETIME

Currently I get this:

mysql> select maketime(23,59,60);
+--------------------+
| maketime(23,59,60) |
+--------------------+
| NULL               |
+--------------------+
1 row in set (0.00 sec)

In other words, MAKETIME treats ‘23:59:60′ as invalid.

STR_TO_DATE

Currently I get this:

mysql> SELECT STR_TO_DATE('2008-12-31 23:59:60', '%Y-%m-%d %H:%i:%S');
+---------------------------------------------------------+
| STR_TO_DATE('2008-12-31 23:59:60', '%Y-%m-%d %H:%i:%S') |
+---------------------------------------------------------+
| NULL                                                    |
+---------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)

In other words, STR_TO_DATE treats ‘23:59:60′ as invalid.

UNIX_TIMESTAMP

Currently I get this:

mysql> select unix_timestamp('2008-12-31 23:59:60');
+---------------------------------------+
| unix_timestamp('2008-12-31 23:59:60') |
+---------------------------------------+
|                                     0 |
+---------------------------------------+
1 row in set, 1 warning (0.00 sec)

I think we could have accepted that unix_timestamp(’2008-12-31 23:59:60′) would return different values on different systems. If there is a leap second table available, then the response could be the ‘:60′ value. If there is no leap second table available, then the response should not be 0, it should be the same as for 2008-12-31 23:59:59. in my opinion.
But it’s not. Oh, well.

UPDATING MYSQL.TIME_ZONE_LEAP_SECOND TABLE

For some platforms, MySQL supplies “Time zone description tables”
http://dev.mysql.com/downloads/timezones.html
Probably, at time of reading, your copy is not up to date! 2008-12-31 is not there! Also I expect that operating-system vendors’ tables are not up to date, or else that users haven’t updated lately. A similar thing happened with the time-zone tables last year, when some countries decided to change the dates that daylight saving time begins and ends.
I’d say it’s up to each user to check for update bulletins every six months from http://hpiers.obspm.fr/iers/bul/bulc/.

CONNECTORS

I am reasonably confident that there would be no problems moving ‘2008-12-31 23:59:60′ back and forth from C and Java programs, although I suppose somebody should ask if there is any JDBC problem retrieving completely invalid values like ‘2008-12-01 23:45:61′. And I have noticed that some common C structures e.g. in time.h
http://www.uwm.edu/cgi-bin/IMT/wwwman?topic=ctime(3)&msection= say that the maximum number of seconds is 60, not 61. MySQL should be rejecting such things, but I can’t be sure of every possible case.

STORAGE ENGINES

We know of no problems with our own storage engines. In theory there could be a storage engine which stores dates and datetimes in a compressed form that doesn’t allow for 62 seconds per minute.

EVENTS

An event created with
create event e on schedule every 1 second starts ‘2008-12-31 23:59:59′ do set @a=5;
should happen at 23:59:59 but not at 23:59:60.

A GENERAL OBSERVATION

My colleague Georgi Kodinov points out: in MySQL date arithmetic in general (when using DATE/TIME/DATETIME columns) is done according to our internal understanding of the UTC time zone (see Bug#40329 “TIMESTAMPADD does not take in to account Daylight Savings Time”). If you need time-zone dependent calculations you’ll need to convert to local time, do the math and then convert back it seems.

THE FIX

As a fix, we’re supplying a patched version of the MySQL server. Its NOW() function will return 23:59:59 twice on December 31. That’s not exactly the right answer, but it sure saves a lot of trouble.

You can get this fix as a source code patch now, just look at the comments for Bug#39920 “MySQL cannot deal with Leap Second expression in string literal.” I’m hoping that MySQL will put this in binaries soon, but I don’t guarantee such things.

The alternative is to do an “after the fact” patch when you head into the office on January 1 2009. You can find and get rid of the value in affected columns thus:
UPDATE affected_table
SET affected_column = CONCAT(LEFT(affected_column,LENGTH(affected_column)-2),’59′)
WHERE RIGHT(affected_column,2) = ‘60′;

REFERENCES

Bug#6387 Queried timestamp values do not match the inserted value (closed in 2004)
Bug#18883 Error fetching TIME columns > 23:59:59
Bug#39920 MySQL cannot deal with Leap Second expression in string literal.
Bug#40610 Date and time functions like DATE_SUB(), DATE_ADD() are not leap second enabled.
“RE: leap second” http://lists.mysql.com/internals/36100

Posted in News | No Comments »

“Show profile” + “Information_schema.profiling”

November 6th, 2008

Starting with version 5.1.28, there is a new table INFORMATION_SCHEMA.PROFILING and a new statement SHOW PROFILE(S). Basic information is in the reference manual and in an introductory article. But I don’t think that’s adequate for everyone. I’ll try here to give some non-basic information. I’ll assume that you read the manual, think of this as a supplement.

A bit of a caveat: I was totally against adding this feature, and some of my bias might show. But I’m not trying to argue about it. I do believe that once we add a feature we should explain it, and all I’m trying to do is explain here.

How it works

The data-gathering code, or “instrumentation”, goes approximately like this:
statement_begin:
start-and-setup-for-getting-info
Stage 1: collect-data()
bunch-of-code
Stage 2: collect-data()
bunch-of-code
Stage 3: collect-data()
…
end-and-put-info-in-buffer

And specifically the timing part of the collect-data() code looks like this:
In sql/sql_profile.cc:
”
void PROF_MEASUREMENT::collect()
{
time_usecs= (double) my_getsystime() / 10.0; /* 1 sec was 1e7, now is 1e6 */
#ifdef HAVE_GETRUSAGE
getrusage(RUSAGE_SELF, &rusage);
#endif
}
”

The my_getsystime() function gets time of day; the getrusage() function gets a little more; I’ll get into the details about ‘timing’ later.

So what’s happening is: if profiling is on (@@profiling==1), then, at each “stage” of processing for an SQL statement, MySQL gets the current time with a get-time-of-day function, and maybe gets a little more than that with getrusage().

This information is stored in memory. When you select from the PROFILING table, MySQL gets basic data about the query from elsewhere (things like “Query Id” or “Seq” are just counters and things like “function name” are straightforward), combines it with the timing information, and Bob’s your uncle, you have a display like this:

mysql> select query_id,seq,state,duration,source_function
       from information_schema.profiling where query_id = 6;
+----------+-----+--------------------+----------+-----------------------+
| query_id | seq | state              | duration | source_function       |
+----------+-----+--------------------+----------+-----------------------+
|        6 |   1 | starting           | 0.000565 | NULL                  |
|        6 |   2 | Opening tables     | 0.001148 | open_tables           |
|        6 |   3 | System lock        | 0.000137 | mysql_lock_tables     |
|        6 |   4 | Table lock         | 0.000222 | mysql_lock_tables     |
|        6 |   5 | init               | 0.000209 | mysql_select          |
|        6 |   6 | optimizing         | 0.027357 | optimize              |
|        6 |   7 | statistics         | 0.000398 | optimize              |
|        6 |   8 | preparing          | 0.000211 | optimize              |
|        6 |   9 | executing          | 0.001443 | exec                  |
|        6 |  10 | Sending data       | 0.000542 | exec                  |
|        6 |  11 | end                | 0.000213 | mysql_select          |
|        6 |  12 | query end          | 0.000143 | mysql_execute_command |
|        6 |  13 | freeing items      | 0.001860 | mysql_parse           |
|        6 |  14 | removing tmp table | 0.000900 | free_tmp_table        |
|        6 |  15 | closing tables     | 0.000142 | free_tmp_table        |
|        6 |  16 | logging slow query | 0.000139 | log_slow_statement    |
|        6 |  17 | cleaning up        | 0.000582 | dispatch_command      |
+----------+-----+--------------------+----------+-----------------------+
17 rows in set (0.00 sec)

There will be rows for the last 15 queries (15 is the default if you just “set @@profiling=1″) or perhaps as many as 100 (100 is the maximum value if you say “set @@profiling_history_size = 100″).

I’ll now explain what each of the above columns (query_id, seq, state, duration, source_function) is supposed to mean.

query_id

Whenever you enter a new statement, your “query identifier” goes up by 1. (MySQL people sometimes use the word “query” when the meaning is “statement”.)

This is a number within your session, not a global number, so each connection’s query_id value starts with 1.

SEQ

This is the “stage number” or “step number”. Each time the program reaches a new stage, this number goes up by 1.

state

By now the question might be: when does a new stage happen? MySQL has always (or at least, since before there was a “profiling” patch) had a function thd_set_proc_info() which it calls at different stages. The profiling feature is simply adding the timing functions to this thd_set_proc_info() and making it available via SHOW and SELECT. This piggybacking made the feature easier to add, but it means the stages aren’t always meaningful because they weren’t originally designed with query profiling in mind.

By using grep on the source code, I have been able to collect a list of the server functions which invoke get_thd_proc_info().

sql/item_func.cc:3393: thd_proc_info(thd, “User lock”);
sql/item_func.cc:3418: thd_proc_info(thd, 0);
sql/item_func.cc:3509: thd_proc_info(thd, “User lock”);
sql/item_func.cc:3552: thd_proc_info(thd, 0);
sql/item_func.cc:3737: thd_proc_info(thd, “User sleep”);
sql/item_func.cc:3749: thd_proc_info(thd, 0);
sql/log_event.cc:5926: thd_proc_info(thd, proc_info);
sql/log_event.cc:5979: thd_proc_info(thd, 0);
sql/log_event.cc:6101: thd_proc_info(thd, proc_info);
sql/log_event.cc:6135: thd_proc_info(thd, 0);
sql/sql_delete.cc:62: thd_proc_info(thd, “init”);
sql/sql_delete.cc:253: thd_proc_info(thd, “updating”);
sql/sql_delete.cc:336: thd_proc_info(thd, “end”);
sql/sql_delete.cc:582: thd_proc_info(thd, “deleting from main table”);
sql/sql_delete.cc:897: thd_proc_info(thd, “deleting from reference tables”);
sql/sql_delete.cc:906: thd_proc_info(thd, “end”);
sql/sql_class.cc:264:const char *set_thd_proc_info(THD *thd, const char *info,
sql/sp_head.cc:2734: thd_proc_info(thd, “closing tables”);
sql/sp_head.cc:2737: thd_proc_info(thd, 0);
sql/lock.cc:258: thd_proc_info(thd, “System lock”);
sql/lock.cc:269: thd_proc_info(thd, “Table lock”);
sql/lock.cc:317: thd_proc_info(thd, 0);
sql/lock.cc:332: thd_proc_info(thd, 0);
sql/sql_base.cc:940: thd_proc_info(thd, “Flushing tables”);
sql/sql_base.cc:1008: thd_proc_info(thd, 0);
sql/sql_base.cc:2140: thd_proc_info(thd, “Waiting for table”);
sql/sql_base.cc:2160: thd_proc_info(thd, proc_info);
sql/sql_base.cc:3532: thd_proc_info(thd, “Waiting for tables”);
sql/sql_base.cc:3548: thd_proc_info(thd, “Reopen tables”);
sql/sql_base.cc:3553: thd_proc_info(thd, 0);
sql/sql_base.cc:4450: thd_proc_info(thd, “Opening tables”);
sql/sql_base.cc:4729: thd_proc_info(thd, 0);
sql/sql_base.cc:4878: thd_proc_info(thd, “Opening table”);
sql/sql_base.cc:4918: thd_proc_info(thd, 0);
sql/sql_base.cc:8511: thd_proc_info(thd, “FULLTEXT initialization”);
sql/sql_view.cc:669: thd_proc_info(thd, “end”);
sql/sql_update.cc:235: thd_proc_info(thd, “init”);
sql/sql_update.cc:464: thd_proc_info(thd, “Searching rows for update”);
sql/sql_update.cc:533: thd_proc_info(thd, “Updating”);
sql/sql_update.cc:766: thd_proc_info(thd, “end”);
sql/sql_update.cc:1260: thd_proc_info(thd, “updating main table”);
sql/sql_update.cc:1974: thd_proc_info(thd, “updating reference tables”);
sql/sql_update.cc:1986: thd_proc_info(thd, “end”);
sql/sql_insert.cc:618: thd_proc_info(thd, “init”);
sql/sql_insert.cc:703: thd_proc_info(thd, “update”);
sql/sql_insert.cc:923: thd_proc_info(thd, “end”);
sql/sql_insert.cc:1796: thd_proc_info(thd, “waiting for delay_list”);
sql/sql_insert.cc:1878: thd_proc_info(thd, “Creating delayed handler”);
sql/sql_insert.cc:1924: thd_proc_info(thd, “waiting for handler open”);
sql/sql_insert.cc:1930: thd_proc_info(thd, “got old table”);
sql/sql_insert.cc:2007: thd_proc_info(client_thd, “waiting for handler lock”);
sql/sql_insert.cc:2013: thd_proc_info(client_thd, “got handler lock”);
sql/sql_insert.cc:2031: thd_proc_info(client_thd, “allocating local table”);
sql/sql_insert.cc:2116: thd_proc_info(thd, “waiting for handler insert”);
sql/sql_insert.cc:2120: thd_proc_info(thd, “storing row into queue”);
sql/sql_insert.cc:2366: thd_proc_info(&(di->thd), “Waiting for INSERT”);
sql/sql_insert.cc:2401: thd_proc_info(&(di->thd), 0);
sql/sql_insert.cc:2543: thd_proc_info(&thd, “upgrading lock”);
sql/sql_insert.cc:2556: thd_proc_info(&thd, “insert”);
sql/sql_insert.cc:2694: thd_proc_info(&thd, “reschedule”);
sql/sql_insert.cc:2715: thd_proc_info(&thd, “insert”);
sql/sql_insert.cc:2721: thd_proc_info(&thd, 0);
sql/slave.cc:1054: thd_proc_info(thd, “Creating table from master dump”);
sql/slave.cc:1069: thd_proc_info(thd, “Opening master dump table”);
sql/slave.cc:1085: thd_proc_info(thd, “Reading master dump table data”);
sql/slave.cc:1097: thd_proc_info(thd, “Rebuilding the index on master dump table”);
sql/slave.cc:1586: thd_proc_info(thd, “Waiting for the next event in relay log”);
sql/slave.cc:1588: thd_proc_info(thd, “Waiting for master update”);
sql/slave.cc:2311: thd_proc_info(thd, “Connecting to master”);
sql/slave.cc:2338: thd_proc_info(thd, “Checking master version”);
sql/slave.cc:2347: thd_proc_info(thd, “Registering slave on master”);
sql/slave.cc:2377: thd_proc_info(thd, “Requesting binlog dump”);
sql/slave.cc:2408: thd_proc_info(thd, “Waiting for master to send event”);
sql/slave.cc:2452: thd_proc_info(thd, “Queueing master event to the relay log”);
sql/slave.cc:2527: thd_proc_info(thd, “Waiting for slave mutex on exit”);
sql/slave.cc:2714: thd_proc_info(thd, “Reading event from the relay log”);
sql/slave.cc:2801: thd_proc_info(thd, “Waiting for slave mutex on exit”);
sql/sql_cache.cc:760: thd_proc_info(thd, “storing result in query cache”);
sql/sql_cache.cc:813: thd_proc_info(thd, “storing result in query cache”);
sql/sql_cache.cc:1262: thd_proc_info(thd, “checking query cache for query”);
sql/sql_cache.cc:1346: thd_proc_info(thd, “checking privileges on cached query”);
sql/sql_cache.cc:1441: thd_proc_info(thd, “sending cached result to client”);
sql/sql_cache.cc:1518: thd_proc_info(thd, “invalidating query cache entries (table list)”);
sql/sql_cache.cc:1544: thd_proc_info(thd, “invalidating query cache entries (table)”);
sql/sql_class.h:1298: Set it using the thd_proc_info(THD *thread, const char *message)
sql/sql_parse.cc:386: thd_proc_info(thd, “Execution of init_command”);
sql/sql_parse.cc:447: thd_proc_info(thd, 0);
sql/sql_parse.cc:1575: thd_proc_info(thd, “cleaning up”);
sql/sql_parse.cc:1577: thd_proc_info(thd, 0);
sql/sql_parse.cc:1608: thd_proc_info(thd, “logging slow query”);
sql/sql_parse.cc:1618: thd_proc_info(thd, “logging slow query”);
sql/sql_parse.cc:3220: thd_proc_info(thd, “init”);
sql/sql_parse.cc:4799: thd_proc_info(thd, “query end”);
sql/sql_parse.cc:5037: thd_proc_info(thd, “checking permissions”);
sql/sql_parse.cc:5805: thd_proc_info(thd, “freeing items”);
sql/repl_failsafe.cc:104: thd_proc_info(thd, “Thread initialized”);
sql/repl_failsafe.cc:619: thd_proc_info(thd, “Processing request”);
sql/repl_failsafe.cc:974: thd_proc_info(thd, “purging old relay logs”);
sql/repl_failsafe.cc:1001: thd_proc_info(thd, “starting slave”);
sql/repl_failsafe.cc:1013: thd_proc_info(thd, 0);
sql/sql_repl.cc:658: thd_proc_info(thd, “Sending binlog event to slave”);
sql/sql_repl.cc:696: thd_proc_info(thd, “Finished reading one binlog; switching to next binlog”);
sql/sql_repl.cc:742: thd_proc_info(thd, “Waiting to finalize termination”);
sql/sql_repl.cc:749: thd_proc_info(thd, “Waiting to finalize termination”);
sql/sql_repl.cc:903: thd_proc_info(thd, “Killing slave”);
sql/sql_repl.cc:930: thd_proc_info(thd, 0);
sql/sql_repl.cc:1089: thd_proc_info(thd, “Changing master”);
sql/sql_repl.cc:1220: thd_proc_info(thd, “Purging old relay logs”);
sql/sql_repl.cc:1281: thd_proc_info(thd, 0);
sql/sql_select.cc:762: thd_proc_info(thd, “optimizing”);
sql/sql_select.cc:944: thd_proc_info(thd, “statistics”);
sql/sql_select.cc:954: thd_proc_info(thd, “preparing”);
sql/sql_select.cc:1408: thd_proc_info(thd, “Creating tmp table”);
sql/sql_select.cc:1458: thd_proc_info(thd, “Sorting for group”);
sql/sql_select.cc:1479: thd_proc_info(thd, “Sorting for order”);
sql/sql_select.cc:1638: thd_proc_info(thd, “executing”);
sql/sql_select.cc:1777: thd_proc_info(thd, “Copying to tmp table”);
sql/sql_select.cc:1904: thd_proc_info(thd, “Creating sort index”);
sql/sql_select.cc:1918: thd_proc_info(thd, “Copying to group table”);
sql/sql_select.cc:1977: thd_proc_info(thd, “Removing duplicates”);
sql/sql_select.cc:2040: thd_proc_info(thd, “Sorting result”);
sql/sql_select.cc:2177: thd_proc_info(thd, “Sending data”);
sql/sql_select.cc:2337: thd_proc_info(thd, “init”);
sql/sql_select.cc:2382: thd_proc_info(thd, “end”);
sql/sql_select.cc:10527: thd_proc_info(thd, “removing tmp table”);
sql/sql_select.cc:10549: thd_proc_info(thd, save_proc_info);
sql/sql_select.cc:10583: thd_proc_info(thd, “converting HEAP to MyISAM”);
sql/sql_select.cc:10647: thd_proc_info(thd, (!strcmp(save_proc_info,”Copying to tmp table”) ?
sql/sql_select.cc:10660: thd_proc_info(thd, save_proc_info);
sql/sql_table.cc:3517: thd_proc_info(thd, “creating table”);
sql/sql_table.cc:3586: thd_proc_info(thd, “After create”);
sql/sql_table.cc:5096: thd_proc_info(thd, “discard_or_import_tablespace”);
sql/sql_table.cc:5113: thd_proc_info(thd, “end”);
sql/sql_table.cc:6068: thd_proc_info(thd, “init”);
sql/sql_table.cc:6271: thd_proc_info(thd, “setup”);
sql/sql_table.cc:6327: thd_proc_info(thd, “rename”);
sql/sql_table.cc:6724: thd_proc_info(thd, “copy to tmp table”);
sql/sql_table.cc:6898: thd_proc_info(thd, “rename result table”);
sql/sql_table.cc:7013: thd_proc_info(thd, “end”);
sql/mysql_priv.h:619:#define thd_proc_info(thd, msg) set_thd_proc_info(thd, msg, __func__, __FILE__, __LINE__)
sql/mysql_priv.h:681: The meat of thd_proc_info(THD*, char*), a macro that packs the last
sql/mysql_priv.h:685:const char *set_thd_proc_info(THD *thd, const char *info,
sql/mysql_priv.h.pp:1694:const char *thd_proc_info(void* thd, const char *info);
sql/mysql_priv.h.pp:2394:const char *set_thd_proc_info(THD *thd, const char *info,
storage/csv/ha_tina.cc:1552: old_proc_info= thd_proc_info(thd, “Checking table”);
storage/csv/ha_tina.cc:1582: thd_proc_info(thd, old_proc_info);
storage/myisam/ha_myisam.cc:752: thd_proc_info(thd, “Checking table”);
storage/myisam/ha_myisam.cc:826: thd_proc_info(thd, old_proc_info);
storage/myisam/ha_myisam.cc:1112: thd_proc_info(thd, buf);
storage/myisam/ha_myisam.cc:1115: thd_proc_info(thd, “Repair done”); // to reset proc_info, as
storage/myisam/ha_myisam.cc:1120: thd_proc_info(thd, “Repair by sorting”);
storage/myisam/ha_myisam.cc:1127: thd_proc_info(thd, “Repair with keycache”);
storage/myisam/ha_myisam.cc:1141: thd_proc_info(thd, “Sorting index”);
storage/myisam/ha_myisam.cc:1149: thd_proc_info(thd, “Analyzing”);
storage/myisam/ha_myisam.cc:1156: thd_proc_info(thd, “Saving state”);
storage/myisam/ha_myisam.cc:1194: thd_proc_info(thd, old_proc_info);
storage/myisam/ha_myisam.cc:1409: thd_proc_info(thd, “Creating index”);
storage/myisam/ha_myisam.cc:1434: thd_proc_info(thd, save_proc_info);
storage/archive/ha_archive.cc:1553: old_proc_info= thd_proc_info(thd, “Checking table”);
storage/archive/ha_archive.cc:1568: thd_proc_info(thd, old_proc_info);

So there you go, now you know what the “state” values are, except for a few like “Resuming”. And for more information, all you have to do is read the above source files. Of course, the line numbers will change frequently and the calls to set_thd_proc_info() will change occasionally.

duration

The my_getsystime() function in sql/my_getsystime.c gets time of day with one of:
clock_gettime() available on many Unix systems
QueryPerformanceCounter() Windows only
NXGetTime() NetWare only
gettimeofday() the default if there’s nothing else
All these time-of-day functions are slow on most platforms, but they only get called a few times per statement, because a typical statement has ten or fewer “Stages”. And they’re all good for microsecond measurements.

To get “Duration”, MySQL just subtracts time-of-day [for the previous stage] from time-of-day [for the current stage].

The first (”starting”) row has the duration since the statement execution started, so it doesn’t include some network-IO overhead.

source_function

I tend to think that “state” is more explanatory than “source_function”, but if you really need to get an exact idea about the stage that’s been reached, source_function and source_line are the things to look at. The reference is of course to the source code at build time, so if you pull and build at different times, or depend on a binary distribution, you might have a bit of trouble with these values.

The columns that come from getrusage()

Remember that the lines for ‘timing’ invoked a get-time-of-day function, and then this:
#ifdef HAVE_GETRUSAGE
getrusage(RUSAGE_SELF, &rusage);
#endif
which means “call the operating system’s getrusage() function f it’s there”.

The getrusage() function is available on most operating systems but the MySQL names are all different from the getrusage names. Here is a handy chart:

MYSQL COLUMN NAME               GETRUSAGE FIELD NAME
CPU_USER decimal(9,6)           ru_utime
CPU_SYSTEM decimal(9,6)         ru_stime
CONTEXT_VOLUNTARY               ru_nvcsw
CONTEXT_INVOLUNTARY int(20)     ru_nivcsw
BLOCK_OPS_IN int(20)            ru_inblock
BLOCK_OPS_OUT int(20)           ru_oublock
MESSAGES_SENT int(20)           ru_msgend
MESSAGES_RECEIVED int(20)       ru_msgrcv
PAGE_FAULTS_MAJOR int(20)       ru_majflt
PAGE_FAULTS_MINOR int(20)       ru_minflt
SWAPS int(20)                   ru_nswap

The possibly interesting values are PAGE_FAULTS_MAJOR (which tells you if the operating system had to do a disk read because it didn’t have a copy of the disk block in its memory), CONTEXT_VOLUNTARY (which may help you guess the number of “waits” i.e. how many times a thread had to wait for something), and BLOCK_OPS_IN + BLOCK_OPS_OUT (which give a good hint about how active the disk is).

Some things may need emphasis for a MySQL context.

1 It does not work on Windows!
There is a Windows-API function to get the number of page faults and some memory information: GetProcessMemoryInfo (http://msdn.microsoft.com/en-us/library/ms683219(VS.85).aspx). And GetProcessTimes might give something like ru_utime + ru_stime. So we could get at least some of the getrusage() data. But we don’t.

2 It’s per process not per thread!
Since MySQL tends to use a thread-per-connection model, it might be most interesting to know “how long did this process take for this thread” rather than “how long did it take for the whole process (which includes other threads)”. This way, if some other job interrupts, you’ll see the other job’s time rather than your own. Linux is moving towards per-thread accounting (my Brazilian colleague informed me it’s already working fine with the 2.6.25 kernel), but on a busy system your times may be misleading.

3 It gives estimates that are useless for short measurements!
Usually getrusage() is getting information that the operating system deposited the last time there was a ‘timer tick’, which could be as much as 1/100 seconds ago. Some operating systems tick 1000 times per second instead of 100 times per second, and Solaris has “microstate accounting” which means it’s precise. But as a general rule you can’t trust getrusage() results for high precision.

All of these observations are okay, given the purpose of the profile feature. But I thought you should know about them.

The future

If we decide to remove a feature, we always give plenty of warning with a “deprecation notice”. At time of writing there is no deprecation notice for the profiling feature.

Posted in MySQL 5.1 Features | 3 Comments »

Online Backup: Encryption

October 12th, 2008

By Peter Gulutzan

A new addition inside the new “MySQL Online Backup” feature is encryption. Australian Senior Software Engineer Stewart Smith has already mentioned it a few months ago in his blog. There’s a preview that you can download on the online backup pages. It’s probably going to get architecture-reviewer approval next week. Probably it will be part of MySQL-6.0 source downloads soon. I extract some paragraphs from what I wrote in a non-public worklog task, WL#4271.

For BACKUP:
BACKUP DATABASE database_name TO ‘image-file-name’
ENCRYPTION_ALGORITHM = { AES | 3DES }
[ ENCRYPTION_KEYSIZE = { 128 | 192 | 256 } ]
PASSWORD = ‘password’;

For RESTORE:
RESTORE FROM ‘image-file-name’
ENCRYPTION_ALGORITHM = { AES | 3DES }
[ ENCRYPTION_KEYSIZE = { 128 | 192 | 256 } ]
PASSWORD = ‘password’;

According to WL#4037 “Online backup: Use zlib compression to reduce backup file size”, there might be compression clauses, for example “WITH COMPRESSION COMPRESSION_ALGORITHM = gzip”. The order does not matter, that is, this worklog task does not attempt to specify the relative positions of the compression and encryption clauses. But in any case compression occurs before encryption.

If ENCRYPTION_ALGORITHM = AES then ENCRYPTION_KEYSIZE may be specified, or it may be omitted (default = 128). If ENCRYPTION_ALGORITHM = 3DES then ENCRYPTION_KEYSIZE may not be specified.

The minimum length of password is zero. The maximum length of password is implementor-defined.

There are no new privilege requirements.

Example:

BACKUP DATABASE d TO ‘1′ ENCRYPTION_ALGORITHM=AES ENCRYPTION_KEYSIZE=128
PASSWORD=”;
RESTORE FROM ‘1′ ENCRYPTION_ALGORITHM=3DES PASSWORD=”;

Effect

MySQL encrypts with the specified algorithm and password, for example something analogous to AES_ENCRYPT(,’password’);
MySQL decrypts with the specified algorithm and password, for example something analogous to AES_DECRYPT(,’password’);

Therefore the RESTORE statement will fail if the algorithm and password differ in any way from what was used in BACKUP. For example, if backup was done with BACKUP DATABASE d TO ‘1′
ENCRYPTION_ALGORITHM=AES ENCRYPTION_KEYSIZE=256 PASSWORD=”, these statements will fail:
RESTORE FROM ‘1′;
RESTORE FROM ‘1′ ENCRYPTION_ALGORITHM=AES PASSWORD=”;
RESTORE FROM ‘1′ ENCRYPTION_ALGORITHM=AES ENCRYPTION_KEYSIZE=256 PASSWORD=’a';

Encryption is of the whole image, not per-database.

Assumptions

Assume password interception won’t occur, because there is some protection from an SSL connection or due to a later worklog task, WL#1054 “Pluggable authentication support”.

Assume logs which would contain passwords are stored in secure places. The PASSWORD clause is logged, replicated, and apparent in columns like mysql.online_backup.command — but protected by privileges on such metadata.

Assume PROCESS privilege is rare so don’t worry that SHOW PROCESSLIST exposes the IDENTIFIED BY clause.

Assume AES and 3DES are all we need for now.

New Reserved words

None.

Encryption Algorithms

The encryption algorithms are in the yassl library. If mysqld is built without the yassl library, then the encryption/decryption algorithms won’t work, at least for now.

Observations on the preview

There are a few anomalies that might be bugs. Generally I don’t report bugs to bugs.mysql.com until/unless they’re in a main tree.

1. This statement causes a crash:
restore from ‘5.bak’ password=’pants’;

2. This statement causes no error although keysize should not be 0 or have a default:
backup database test to ‘5.bak’ encryption_algorithm=aes encryption_keysize=0 password=”The Rain In Spain”;

3. This statement causes a crash:
backup database test to ‘4.bak’ encryption_keysize=128;

4. This series of statements causes a crash:
use test
drop table if exists t;
create table t (s1 varchar(1000)) engine=falcon;
insert into t values (0×1);
set names utf8;
backup database test to ‘1.bak’ encryption_algorithm=aes password=’Γ’;
/* The Γ character is GREEK CAPITAL LETTER GAMMA U+0393 */

5. Nothing happens to backup_history or backup_progress.

6. This statement causes no error although AES should not be inside “s:
BACKUP DATABASE test TO ‘8.bak’ ENCRYPTION_ALGORITHM=`aes` PASSWORD=”;

Posted in MySQL 6.0 New Features | 2 Comments »

Back from MySQL Developer Meeting in Latvia

October 12th, 2008

Last month I went to the Sun Microsystems / MySQL Developer Conference in Riga Latvia. Along with about 250 other people. The majority were MySQL developers, but as usual we had several attendees from MySQL support and administration, and a few non-MySQL people came to talk about Google, Drizzle, and PBXT. This was one of our annual “internal” conferences, so lots of stuff that we talked about in Riga will only be publicly announced around, say, the time of the next MySQL User Conference . I’ll just talk about the stuff that’s already leaked, and about my own involvements.

First let me give my impression of ‘morale’. It’s up, way up, since two years ago. I attribute this to two causes: removal of a problem, and a better team.

Huh? Haven’t I read non-MySQLers’ blogs about people leaving the team? Sure, and I mumble to myself ‘get a grip, get a sense of proportion’ when I read them. All organizations have comings and goings. Some of the departures are exaggerated; for example Jim Starkey hasn’t totally left us. Others won’t affect development because the departers haven’t been contributing to the code base significantly for a long time, they’ve concentrated on evangelism or visionary stuff. More people have joined than have left, and I wish it was obvious how high-quality the new ones are.
* We have several new managers from “Classic Sun” with experience and training as managers. I laud the pure-old-MySQL Team Leads who were running some things before, but they were seeing supervisory or project-administration tasks as burdens, and the ones I’ve talked to seemed glad that their trophies at last they lay down.
* We have other Sun teams helping us out with performance analysis (especially on SPARC or Solaris), and with putting together builds on exotic platforms (MySQL has several million machines available for community testing but having direct access to more machines is a plus too).
* We have new people joining MySQL development teams from other Sun departments. I’ve hesitated to say that the difference so far has been significant, but orientation time for a developer is always several months, and now that we old-guarders have met these ‘newbies’ we’re happy. I lost count of the corridor conversations I overheard that went something like “hey, so-and-so seems to be what we needed for [online backup] [optimizer] [fill-in-the-blank]“.

Other things that contributed to my personal morale were the great hotel facilities, the fact that Georg Richter (Development Manager - Connectors and Client Connectivity Engineering) made sure all the Russians got visas this time (I really missed seeing the whole gang on a previous occasion), and the sound of people actually discussing the work that I think is important. Our idea of having developers working out of home offices worldwide is great, but at least once a year we have to interact with people instead of with IRC chat lines and Skype noises. I’m less isolated than most since I’m not the only MySQLer in Edmonton Canada; however, man is a social animal.

The big theme for all-hands talk was MySQL version 5.1; don’t get the wrong impression from me; I was cheering along with the crowd when our fearless leaders exhorted us to give 5.1 all our waking thoughts. And now I’ll drop that topic since this is a MySQL-6.x blog.

The high point of the conference was a slide show by MySQL 6.x release architect Peter Gulutzan (that’s me) about what’s in the 6.x roadmap. Community Relations Manager Lenz Grimmer took a few pictures, here and here. I’d like to say it was well received, and perhaps most objective observers would say that it was. But our technical discussions do sometimes lead to what diplomats call ‘a frank exchange’. Updating from what’s been said earlier for 6.x: (1) maybe we’re using data types BINARY and VARBINARY excessively and maybe the fixing should happen before I thought it would; (2) specific references about optimizer-related tasks should now be regarded as uncertain; (3) online backup might occur with a different schedule, although this mostly means that things will happen sooner (my next blog post will discuss one of the new online-backup features that we settled in Riga); (4) Sun has ‘globalization requirements’ that we have to meet with 6.1 so that means more character-set and collation tasks for Alexander Barkov (Technical Lead of Localization); (5) there are new committees charged with commenting about new features or overriding decisions made by other committees. I should re-explain here that we often use the term ‘6.x’ for ‘MySQL 6.1, 6.2, or [if necessary] 6.3′ not 6.0.

A historic occasion: I and Senior Software Developer Marc Alff gave the first-time-ever live demo of MySQL’s SIGNAL and RESIGNAL statements, and of a bombshell project that’s so neat we will defer discussion till we know how to describe it better. A picture is here. That’s me in the background demoing to Senior QA Architect Omer BarNir. Marc is the French-looking guy with the rapt audience. One question that I had to face often was “since it works today why will you wait many months to release it?” but I fended that off with the usual that’s-not-my-department shrugging.

Another historic occasion: the “WL#148 (Implement Foreign Keys (all storage engines)” project reached Milestone 7. This project is approximately on schedule. That’s a good thing because soon we’re going to have to say what’s in ‘6.1′ as opposed to the vague ‘6.x’ designation, and I see no reason why WL#148 will be anything but a flagship feature in that release.

Posted in News | No Comments »

Short Puzzles about Names and Handlers

August 10th, 2008

Here are three short routines that raise tough questions. I’ll bet that many MySQL stored-procedures experts will fail to answer all three.

Question 1. The Ambiguous Identifier

Given one table and one stored procedure:
DELIMITER //
CREATE TABLE t (x INT)//
INSERT INTO t VALUES (1)//
CREATE PROCEDURE p ()
BEGIN
DECLARE x INT DEFAULT 2;
SELECT x FROM t;
END//
CALL p()//
Notice that x is both a column and a variable.
MySQL will return
(a) ‘1′ because that’s the value of column x.
(b) ‘2′ because that’s the value of variable x.
(c) an error message because x is ambiguous.
?

Question 2. The same-level handler

Given one table and one stored procedure:
DELIMITER //
SET @@sql_mode=”//
CREATE TABLE t (x SMALLINT)//
CREATE PROCEDURE p ()
BEGIN
DECLARE CONTINUE HANDLER FOR SQLWARNING SELECT z FROM y;
DECLARE CONTINUE HANDLER FOR SQLEXCEPTION INSERT INTO t VALUES (99999);
SELECT y FROM t;
END//
CALL p()//
Notice that t.y does not exist so the SELECT fails and the SQLEXCEPTION handler catches it.
MySQL will return:
(a) A warning, because 99999 is an “out of range” value for a SMALLINT column.
(b) An error, because the SQLWARNING handler will catch the warning for “out of range”.
(c) An infinite loop, because the SQLWARNING handler causes an exception, the SQLEXCEPTION handler catches the exception and causes a warning, the SQLWARNING handler catches the warning and causes an exception, and so ad infinitum.
?

Question 3: The note and the warning

Given no table and the following stored procedure:
DELIMITER //
SET @@sql_mode=”//
DROP TABLE IF EXISTS t//
DROP TABLE IF EXISTS t//
SHOW WARNINGS//
CREATE PROCEDURE p ()
BEGIN
DECLARE EXIT HANDLER FOR SQLWARNING SELECT * FROM t;
DROP TABLE IF EXISTS t;
END//
CALL p()//
Notice that t doesn’t exist, so SHOW WARNINGS will say:
+——-+——+——————-+
| Level | Code | Message |
+——-+——+——————-+
| Note | 1051 | Unknown table ‘t’ |
+——-+——+——————-+
1 row in set (0.00 sec)
MySQL will return:
(a) a warning, because “DROP TABLE IF EXISTS t” within the stored procedure causes an “unkown table ‘t’” warning.
(b) an error, because the SQLWARNING handler catches the warning and selects from t, which does not exist.
?

For fun, take a few moments to answer the questions in your head before continuing.

These questions might puzzle you because in the past they have puzzled us. So we’ve sometimes given mixed signals. After some comparisons with other DBMSs and some hard looks at the SQL standard and (not least!) some hints about what our users really want the answers to be, we’re going to address the puzzling behaviour in MySQL 6.x. Or even earlier, if everyone says we’re looking at bugs not features.

Question 1. The Ambiguous Identifier

The answer is (b). But we think that for MySQL 6.x it should be (c).

Let’s look first at what happens with the other major DBMS products: DB2, Oracle, SQL Server. That means we’re looking at five possibilities because DB2 splits three ways. (I’ll always insist “there is only one DB2″ but this is one of the rare exceptions where the different product lines look less than 100% the same on the outside.)

For DB2 i5/OS 5.4:
“If the tables and views specified in an SQL routine body exist at the time the routine is created, the name will first be checked as a column name. If not found as a column, it will then be checked as an SQL variable name in the compound statement, and then checked as an SQL parameter name.
If the referenced tables or views do not exist at the time the routine is created, the name will first be checked as an SQL variable name in the compound statement and then as an SQL parameter name. … If not found, it will be assumed to be a column.”
http://publib.boulder.ibm.com/infocenter/iadthelp/v7r0/index.jsp?topic=/com.ibm.etools.iseries.langref2.doc/rbafzmst895.htm

For DB2 Version 9.1 for z/OS:
“If an SQL statement contains an SQL variable or parameter and a column reference with the same name, DB2 interprets the name as an SQL variable or parameter name.”
http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db29.doc.sqlref/db2z_compoundstatement4externalsqlpl.htm

For DB2 Version 9.1 for Linux, UNIX, and Windows:
“If an SQL statement contains an identifier with the same name as an SQL variable and a column reference, DB2 interprets the identifier as a
column.”
http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/r0004239.htm

For Oracle, I tried the following statements:
create table t (s1 number);
create procedure p as
s1 number;
begin
s1 := 4;
delete from t where s1 = 4;
end;
/

insert into t values (5);
call p();
select * from t;
Result: I got one row.
Therefore Oracle gives precedence to column name.

For SQL Server, variable names must begin with @. Therefore SQL Server doesn’t matter.

For standard SQL, I had a slow and tedious task, merging the clauses of SQL/Foundation 6.6 (the rules for columns and parameters) with the clauses of SQL/PSM 6.2 (the rules for variables). I now have enough confidence to say that this supersedes any nonsense I may have said previously on this topic.
“The statement is a syntax error if x could be any two of: a parameter, a variable, a column.”

For those who have the SQL:2003 standard document handy … The justification of the above statement can be seen in SQL/Foundation 6.6 Syntax Rules, after merging with SQL/PSM 6.2. The key part follows “If N = 1″ (true since there is only one identifier in the chain) and “If the number of possible scope tags … is 1″ (false since there are 2 or 3 possible scope tags), so the phrase to watch is: “Otherwise, each possible scope tag shall be a range variable”. But that’s not the case, and “shall” means (I quote SQL/Framework) “required to be true of syntactically conforming SQL language”. Therefore, syntax error.

(I suppose one could object that there doesn’t have to be a beginning statement label. Technically true, but if there wasn’t an implicit invisible beginning label, simple statements like
CREATE PROCEDURE p () BEGIN DECLARE v INT; END
would be illegal. The oversight will probably be addressed in a Technical Corrigendum for the standard document someday.)

Summary:
DBMS Result of ambiguous identifier
—- ——————————
DB2 i5/OS If table exists: column. Else: variable.
DB2 z/OS Variable.
DB2 Linux/Windows Column.
Oracle Column.
SQL Server Not Applicable.
ANSI/ISO Syntax Error.

Given the frequency of bug reports caused by ambiguous identifiers (Bug#5967 Bug#13771 Bug#16052 Bug#16888 Bug#17878 Bug#20834 Bug#29779 Bug#30163), I think we’d be doing users a favour by causing a syntax error. Often MySQL will accept a something that’s a possible mistake and just give a warning (for example the warning that you get when you try to insert 99999 into a SMALLINT column). In this case, since people who migrate from another DBMS will have the wrong assumption, we should go with the standard. To avoid the syntax error, users will merely have to use “table_name . column_name”

Question 2. The same-level handler

The answer is (b). But we think it shouldn’t be. As one of our engineers (Marc Alff) put it: “The server gives up (without notice) when finding that an exception handler is invoked recursively.” I was a bit worried that somebody would want to claim this is appropriate behaviour, but it didn’t happen. For MySQL 6.x we’ll want to handle this correctly.

A comparison with other DBMSs is unnecessary this time, since Oracle and SQL Server have implemented SQL-procedure syntax which doesn’t offer the same capabilities as MySQL/DB2/standard handlers. Instead let’s look once again at the fine print in the standard document.

Here is the test of “existing behaviour”.
delimiter //
drop procedure if exists p//
drop table if exists t1, t2//
create table t1 (s1 smallint)//
create table t2 (s1 smallint)//
create procedure p()
begin
declare continue handler for sqlwarning
create table t2 (s1 int);
declare continue handler for sqlexception
insert into t1 values (99999);
create table t1 (s1 int);
end//
call p()//
Result: “ERROR 1050 (42S01): Table ‘t2′ already exists”.
Apparently the exception handler’s INSERT was caught by the warning handler.

Is it a bug if a handler causes a condition and it’s caught by a handler within the same handler declaration list? I’ll call this “same level” handling rather than “recursive”.

Well, it does seem to be outside the standard requirements. I think that the following statements are “logic”. You can tell because I use the word “therefore” frequently.

Quotations are from SQL:2008 9075-4 draft.

From 4 4.8 Condition Handling:
“If there is no most appropriate handler and the condition is an exception condition, then the SQL-statement raising the exception
condition is terminated with that exception condition.”
Therefore, in order to be a handler for itself or for another handler at the same level, a handler must be “most appropriate”.

From 14.1 compound statement (edited for clarity)
”
Format:
<compound statement> ::=
BEGIN
[ <local handler declaration list> ]
[ <SQL statement list> ]
END
…
Syntax Rules:
1) Let CS be the <compound statement>.
16) The scope of a <handler declaration> simply contained in a <local handler declaration list> simply contained in CS is the <SQL statement list> SSL of CS excluding every <SQL schema statement> contained in SSL.
”
Therefore if the handler declaration list contains declarations of handler H1 (a warning handler) and handler H2 (an exception handler), then statements in H1 are not in the scope of declaration of H2, and statements in H2 are not in the scope of declaration of H1.

From 14.2 handler declaration (edited for clarity)
”
Format:
<handler declaration> ::=
DECLARE <handler type> HANDLER FOR <condition value list>
<handler action>
…
Syntax Rules:
1) Let HD be the <handler declaration>.
General Rules:
1) When the handler H associated with the conditions specified by HD is created, it is the most appropriate handler for any condition CN raised during execution of any SQL-statements that are in the scope of HD that has an SQLSTATE value or condition name that is the same as an SQLSTATE value or condition name associated with this handler, until H is destroyed.
”
The important phrase is surely “that are in the scope of HD”. Since H2’s statements are not within the H1 handler declaration’s scope, H1 cannot be a most appropriate handler for H2’s statements’ conditions.

Therefore the H1 handler cannot be most appropriate for H2 statements.

Therefore the above example is a violation of the SQL standard.

Question 3: The note and the warning

The answer is (a). But we think that for MySQL 6.x it should be (b).

Another of MySQL’s engineers, Davi Arnaut, posed the puzzle thus:
“MySQL warnings are not the same beast as the SQL standard warnings. I.e. they have a different life cycle, and they work with continue handlers differently, at least now. A MySQL note could be considered a subtype of MySQL warning, only with a different severity.”

Yes. There is some evidence that was the concept of the Ur-Engineer, Monty Widenius, back in 2002. He wrote “one should get a warning (severity=note) for each database / table that didn’t exists.”

Sometimes the MySQL Reference Manual says notes are a subtype of warning:
“warnings of Note level are recorded”
“Note warnings are suppressed”
“cause Note-level warnings not to be recorded”

But as Mr Arnaut said, sometimes the manual says they’re distinct from warnings:
“the error, warning, and note messages”
“the total number of errors, warnings, and notes”

I’d like to clear this up by writing an email to our documentation department:
“Please be consistent, say note-level warnings, not notes and warnings”.

The SQL standard is clear: there are only types of “completion conditions” success (class ‘00′) and warning (class ‘01′).
If we finally accept that notes are warnings, then the appropriate class is ‘01′, and it’s a bug if any note isn’t ‘01′.
So we’ll have to make several adjustments because at the moment the SQLSTATE for various notes isn’t alway in class ‘01′ ‘warning’.
This is what I see in 6.0 source code when I grep for cc files
containing “…_warning…(…WARN_LEVEL_NOTE…)”:

sql/sql_trigger.cc:         ER_NO_SUCH_USER, ER_TRG_DOES_NOT_EXIST,

                            ER_OLD_FILE_FORMAT

sql/event_parse_data.cc:    ER_EVENT_CANNOT_CREATE_IN_THE_PAST,

                            ER_EVENT_EXEC_TIME_IN_THE_PAST

sql/field.cc:               WARN_DATA_TRUNCATED ('01000'),

                            ER_WARN_DATA_OUT_OF_RANGE ('22003')

sql/item_subselect.cc:      ER_SELECT_REDUCED ('01000')

sql/sql_db.cc:              ER_DB_CREATE_EXISTS, ER_DB_DROP_EXISTS,

                            ER_BAD_DB_ERROR ('42000')

sql/event_db_repository.cc: ER_EVENT_ALREADY_EXISTS,

                            ER_SP_DOES_NOT_EXIST ('42000')

sql/sql_view.cc:            ER_NO_SUCH_USER, ER_BAD_TABLE_ERROR ('42S02'),

                            ER_WARN_VIEW_WITHOUT_KEY

sql/sql_insert.cc:          ER_TABLE_EXISTS_ERROR ('42S01')

sql/table.cc:               ER_VIEW_NO_CREATION_CTX,

                            ER_VIEW_INVALID_CREATION_CTX, ER_NO_SUCH_USER

sql/set_var.cc:             ER_UPDATE_LOG_DEPRECATED_TRANSLATED ('42000'),

                            ER_UPDATE_LOG_DEPRECATED_IGNORED ('42000')

sql/sql_parse.cc:           ER_NO_SUCH_USER, ER_SP_DOES_NOT_EXIST ('42000'),

                            ER_YES

sql/sql_yacc.cc:31818:      ER_NATIVE_FCT_NAME_COLLISION

sql/sql_repl.cc:            ER_MISSING_SKIP_SLAVE, ER_UNTIL_COND_IGNORED,

                            ER_SLAVE_WAS_RUNNING, ER_SLAVE_WAS_NOT_RUNNING,

                            ER_SLAVE_IGNORED_SSL_PARAMS

sql/sql_table.cc:           ER_BAD_TABLE_ERROR ('42S02'),

                            ER_DUPLICATED_VALUE_IN_TYPE, ER_AUTO_CONVERT,

                            ER_TABLE_EXISTS_ERROR ('42S01'), ER_ILLEGAL_HA,

                            ER_WARN_USING_OTHER_HANDLER

sql/item.cc:                ER_WARN_FIELD_RESOLVED

storage/maria/ha_maria.cc:  ER_ILLEGAL_HA_CREATE_OPTION

Unless otherwise specified, the associated SQLSTATE value is ‘HY000′. Only WARN_DATA_TRUNCATED and ER_SELECT_REDUCED have SQLSTATE class ‘01′.

The fix doesn’t mean that we should get rid of the note concept. Severity levels may be a standard de facto …
Oracle has “PL/SQL Warning Categories”
http://www.stanford.edu/dept/itss/docs/oracle/10g/appdev.101/b10807/07_errs.htm
… and DB2 has “information”, “warning”, “error”, “severe”
… and SQL Server has a severity level
http://blogs.msdn.com/sqlprogrammability/archive/2006/04/03/567550.aspx
… and NDB has severity levels for events: INFO, WARNING, ALERT, ERROR
http://dev.mysql.com/doc/refman/6.0/en/mysql-cluster-log-events.html

If the general idea is to ignore anything below a certain severity level, a continue handler intercept note-level warnings if and only if SQL_NOTES <> 0. Or maybe a syntax extension, e.g.
DECLARE x CONTINUE HANDLER FOR SQLWARNING SEVERITY_LEVEL > 1 …;
But that’s for after version 6.x, when MySQL decides what severity levels to have.

Posted in Tiny Tweaks | No Comments »

SQL Standards, ANSI committees, and Sun

July 13th, 2008

The two big developments in the SQL world this summer are the pending arrival of the SQL:2008 ANSI standard, and my (Peter Gulutzan’s) attendance on Sun’s behalf at an ANSI meeting. Okay, I concede that the events have unequal importance, but sometimes I slip in a bit about personal involvements.

What’s a standard?

Review: ANSI (which stands for “INCITS Technical Committee H2 on Database”), and its counterparts in many other countries, produce official standards for many things, including SQL. The international co-ordinating body is ISO (International Organization for Standards), so there’s an ANSI standard and an ISO standard, and they’re practically the same. There’s a partially-restricted site containing ISO papers here.

The current edition is ANSI/ISO SQL:2003. But SQL:2008 is only a few days or weeks away. I’m taking a risk here by jumping the gun and saying something about it this early, but I must take the risk, there’s a time constraint that forces me to state this week what we Sun folk know about SQL:2008 and what we’re doing about it.

Differences between SQL:2003 and SQL:2008

SQL is stable. We can expect new editions to have only small changes. As with any product, the changes are of two types: fixes and new features.

FIXES. Quite a few bugs are now corrected or at least some clarification has happened. They’re of no great importance now that they’re dead, but I’d like to (ahem) just observe that a small number of them were found by MySQL AB personnel. I’m proud that we played a part, however trivial, and glad that the standard editor took an “open source” attitude when deciding whether to incorporate the things I sent him.

NEW FEATURES. There’s a section at the end of Part II called Appendix F SQL feature taxonomy. By comparing the SQL:2003 and SQL:2008 files I was able to see that around 60 small non-core optional features are new. Everybody will have a different idea which ones matter. My own choice is the ones that look relevant to MySQL.

F123 all diagnostics. Allows “GET DIAGNOSTICS target = ALL”. Support for GET DIAGNOSTICS isn’t in our plans for real soon now, but it’s inevitable. The mysqlforge task description is WL#2111 Stored Procedures: Implement GET DIAGNOSTICS.
F200 TRUNCATE TABLE statement. See the earlier article “TRUNCATE now and unto ages of ages”.
F382 alter column data type. Allows “ALTER … SET DATA TYPE data type”. MySQL can already alter data type, but with non-standard syntax “ALTER … MODIFY …”.
F763 CURRENT_SCHEMA. Allows “SELECT CURRENT_SCHEMA …”. MySQL can already select this, but with non-standard syntax “SELECT DATABASE() …”. See the feature request in MySQL’s bugs database, “Please implement a CURRENT_DATABASE()/CURRENT_SCHEMA() function”.
F852 Top-level order by clause in views. Allows “CREATE VIEW … ORDER BY …”. MySQL can already do this.
F857 Top-level fetch first clause in query expression; F865 offset row count in result offset clause. Allows picking how many rows to select, and where to start, with “SELECT … OFFSET … FIRST | NEXT”. MySQL can already pick, but with non-standard syntax “SELECT … LIMIT …”.
T021 BINARY and VARBINARY data types. See the earlier article: “Tiny Tweak: BINARY and VARBINARY”.
T023 Compound binary literals. Allows concatenation of hexits in binary string literals. This isn’t in MySQL’s plans, though we’ve talked about it.
T213 INSTEAD OF triggers. If you design an INSTEAD OF trigger on a view, then delete from the view, the rows aren’t deleted but the trigger is activated — and the trigger might delete the rows that you want. We’ve had a customer request for this.
T285 Enhanced derived column names. This clears up some rules for automatic assignment of column names in a select list, when there’s no AS clause.

I’ll add as well that there’s a reference to the “current” Unicode standard. That’s not a change in the wording, but the meaning of “current” is now Unicode 5.1. Well, we’ve just added support for Unicode 5.0 in MySQL 6.0. Keeping up with other standards can muss our scheduling, but we’ve added a worklog task for Unicode 5.1 and we’ll see what happens to it. See the earlier article “MySQL 6.0 Feature #1: Supplementary Characters”.

Joining the committee

The work on the standard will continue, so that there will be fixes and feature changes for a new standard in a few years. The interesting question now facing us is: Will Sun Microsystems join ANSI?

A few weeks ago some committee members were kind enough to invite me to sit in as an observer for one of the regular ANSI meetings. I learned a lot about the mechanics and interrelationships (though I have no scoops about new happenings that aren’t public). It was gently noted a few times that here was a gathering of all the behemoths of the database industry, except one: Sun.

Well, it was natural for MySQL to stay out. (Time to confess now: we weren’t a behemoth, more a micro-behemoth.) Before the acquisition, we were a relatively small outfit and relatively cloutless. Sun is a different sort of organization, though. And Sun does join other committees, for example it plays a big role with Unicode and (duh!) Java standards. Equally importantly, Sun is serious about standards to an extent that some MySQL old-timers weren’t. Probably this has a lot to do with the influence of big enterprise customers in the new field that we’re playing in.

This attitude will I hope be useful in other points of MySQL 6.x development, and I’ll keep you posted about “MySQL and progress toward the SQL standard” in this blog. But I’m going to emphasize that the attitude has an effect on a lot more than MySQL. Sun does JavaDB. Sun even does PostgreSQL … it would be interesting to join the committee and act as a conduit for another open-source DBMS’s suggestions under Sun’s aegis. But any of Sun’s many PostgreSQL experts could do that better than a MySQL expert could.

I’m mildly warm to the concept of Sun joining. But now it’s time for a boilerplate disclaimer: “Any opinions in this blog do not represent the opinions of anyone official at Sun Microsystems.” The only thing I’m saying is that we’re thinking about it, and gathering opinions.

Posted in News | 4 Comments »

First days as a Sun Microsystems employee

July 8th, 2008

As of July 1 all Canadian MySQLers are employees of Sun Microsystems. I was expecting that we would be the rear guard, I was going to greet the great day singing “a broken man on an Edmonton pier / the last of Monty’s engineers” (with thanks to the writer of “Barrett’s privateers”), or perhaps “And I, the last, go forth companionless … Among new men, strange faces, other minds.” (with thanks to Mr Tennyson). But as things turned out, the Ukrainians and the French were delayed. So Trudy Pelzer and I were the last Canadian engineers to come on board, but not quite last of all.

And it’s been wonderful. A free lunch, a tour of Sun’s Edmonton office, a SafeWord security doohickey, a carrying case, and an alternative address = peter.gulutzan@sun.com. More seriously, I’ve gained several thousand new co-workers while keeping the old ones (almost none of the MySQL engineers didn’t join Sun, and I know of none that intend to leave now that they’ve joined Sun). This is one of those friendly acquisitions where everything has just worked, and my hat is off to the integration teams who’ve kept us all in a high-morale gung-ho cloud. The new company is an even better friend of open source than the old, the number of intelligent people I’ve encountered is even greater, and the European level of politeness — always one of MySQL AB’s best points — hasn’t been polluted at all, touch wood. I highly recommend a Sun job for connoisseurs of tasteful engineering positions.

Posted in Personal Opinion | 3 Comments »

Bazaar Branches of MySQL 6.0

June 29th, 2008

Bazaar Branches of MySQL 6.0

Now that MySQL has switched to using bazaar for source code repositories, an in place to go for 6.0 public code is Canonical’s “launchpad” site, specifically “Bazaar Branches related to Sun/MySQL Engineering” at https://code.launchpad.net/~mysql. The Bazaar Branches page has a list. Each row in the list is for a source-code package with all the MySQL files necessary to build the server, as described in the MySQL Reference Manual chapter MySQL Installation Using a Source Distribution. For 6.0 lovers, the interesting entries in the Bazaar Branches list are mysql-6.0, mysql-6.0-backup, mysql-6.0-falcon, mysql-6.0-rpl, mysql-6.0-wl3220, mysql-maria.

how to read the list
The Bazaar Branches page’s instructions look good enough to me. But comments in the “Last commit” column might have MySQL-specific abbreviations that I’d like to warn about. Specifically:
* A comment like “Fix to Bug#37679 …” will not refer to the bugs in Canonical’s launchpad.net site, but to MySQL’s own database of bugs: http://bugs.mysql.com.
* A comment containing the initials “WL”, for example “WL#4398 fix compile error …” or “WL3220: Loose index scan …”, is referring to a WorkLog task number. The public worklog tasks are on http://forge.mysql.com. Not all worklog tasks are public.

mysql-6.0
We call mysql-6.0 the “main” tree. When 6.0 becomes generally available, every feature will be in mysql-6.0. Probably this is the only branch you’ll need, unless you are a fanatical follower of a particular team’s progress on a particular day. The code here might be more recent, and more experimental, than what’s in the tar file on the “MySQL 6.0 Downloads” page.

mysql-6.0-backup
I had a long article on BACKUP and RESTORE a few weeks ago. A few changes since then: (a) some people from “Sun Classic” have joined the team (yay, Sun integration!); (b) the progress data is now in true “logs” so there are some options for viewing them as tables or files, turning them on and off, etc., the same way that it’s done with query log or general log.

mysql-6.0-rpl
The letters “rpl” stand for “replication”. Don’t ask me why, maybe this is the way that Swedes do abbreviations. (MySQL AB was originally a Swedish company, and two of the prominent replication people, Mats Kindahl and Lars Thalmann, are from places near Stockholm and Uppsala.)

mysql-6.0-falcon
Elsewhere I’ve seen statements like “Falcon is currently only supported within a special fork of the MySQL release, termed MySQL-6.0-falcon.” That’s false. All mysql-6.0 packages include Falcon. The distinguishing feature of mysql-6.0-falcon is that it has recent changes done by Falcon team members which will be merged with the other trees after a delay of perhaps a few weeks.

mysql-6.0-wl3220
The suffix ‘wl3220″ means “this branch contains the experimental code for worklog task 3220″. I mentioned before that some worklog tasks are public. You can find the description for this one at this location:
http://forge.mysql.com/worklog/task.php?id=3220.
The task is to speed up COUNT(DISTINCT column_name) and similar functions. At the moment that I’m writing this, it isn’t working yet.

mysql-maria
I included this in a list of 6.0 “sites worth seeing”, even though, if you say
SELECT VERSION();
you’ll see “5.1.25-maria-alpha”. I’m not sure that the maria storage engine is to be released along with version 5.1, it now seems more likely that it will be in 6.0 or 6.1. But it’s a little early to state what the Maria storage engine will be part of.

Other branches
Other branches exist internally, for example
mysql-6.0-wl2110 SIGNAL/RESIGNAL (all parts working now)
mysql-6.1-fk Foreign keys (parser and metadata storage working now)
Eventually these branches will either appear on the Bazaar Branches page, or get merged into one of the other branches before becoming public. Keep watching the page. You’ll probably see frequent changes.

Posted in MySQL 6.0 New Features | No Comments »

Extended comments

June 29th, 2008

You’ve probably seen the MySQL Reference Manual page “What’s New in MySQL 6.0″ and so you’ve seen this description of a new feature: “Support for extended comments for tables, columns, and indexes.” Here’s what “extended” means.

You can do a little more with COMMENTs in ALTER or CREATE statements. The easy way to see the difference is by running this statement in both MySQL 5.1 and MySQL 6.0:

SELECT table_schema, table_name, column_name, character_maximum_length

FROM information_schema.columns

WHERE column_name LIKE '%comment';

From that, you’ll see that the differences are:

TABLE_NAME     COLUMN_NAME      CHARACTER_MAXIMUM_LENGTH CHARACTER_MAXIMUM_LENGTH

                                in MySQL 5.1             in MySQL 6.0

COLUMNS        COLUMN_COMMENT   255                      1024

PARTITIONS     PARTITION_COMMENT 80                        80

TABLES         TABLE_COMMENT     80                      2048

STATISTICS     INDEX_COMMENT    n/a                      1024

In other words, the maximum length of a column or table comment is muchly increased, and it’s now possible to have index comments too. Partition comment length is changed too, but you can’t see it yet (Bug#37728). This change was for a special request from one of MySQL’s partners.

Posted in Tiny Tweaks | 1 Comment »

TRUNCATE now and unto ages of ages

June 9th, 2008

MySQL’s TRUNCATE statement will evolve over the next few years, as we fix bugs or strange behaviours, and as we try to make storage engines more consistent with each other. Equally interesting are the aspects that will remain the same, as we confirm that our behaviour conforms to standard requirements.

Define TRUNCATE

TRUNCATE T has two definitions:
(1) it means “delete all rows in T”, as the MySQL Reference Manual says in section “TRUNCATE Syntax”. (2) it means “drop T, then create it again”, as the Reference Manual says later on. The better definition is (2). I have proofs.

Proof #1: The required privilege for TRUNCATE is DROP.

Proof #2: TRUNCATE does not cause activation of DELETE triggers. For example:
CREATE TABLE t (s1 INT);
CREATE TRIGGER t_ad AFTER DELETE ON t
FOR EACH ROW SET @a = @a + 1;
SET @a = 0;
INSERT INTO t VALUES (1),(2),(3);
TRUNCATE TABLE t;
Result: @a = 0, which shows that trigger t_ad was never activated.

Proof #3: TRUNCATE ignores locks, DELETE doesn’t. For example:
On Connection#1, say:
set @@autocommit=0;
create table t1 (s1 int, s2 blob) engine=innodb;
insert into t1 values (1,’a'),(2,’b'),(3,’c');
commit;
truncate table t1;
On Connection #2, say:
set @@autocommit=0;
update t1 set s2 = ”;
Result: the truncation succeeds, it is not blocked by the INSERT statement. If Connection#1 had said DELETE instead of TRUNCATE, there would be a lock. This is actually a danger point with TRUNCATE, by the way: you cannot expect locking to happen the same way that it would happen for DELETE statements.

Proof #4: TRUNCATE is much faster than DELETE if there are many rows in the table. For example:
CREATE TABLE t (s1 INT);
/* insert many rows */
DELETE FROM t;
/* insert many rows */
TRUNCATE TABLE t;
I timed the DELETE, it took several seconds. I timed the TRUNCATE, it took less than one second. This speed difference is what I would expect of a statement that affects only metadata, not rows.

There is no interesting side effect that’s due to TRUNCATE’s “drop and recreate” method, namely freeing of space in an underlying file. For example:
USE test
CREATE TABLE t (s1 INT);
/* insert many rows */
/* Look at size of underlying file in datadir/test */
DELETE FROM t;
DROP TABLE t;
CREATE TABLE t (1 INT);
/* insert many rows */
/* Look at size of underlying file in datadir/test */
TRUNCATE TABLE t;
Result: DELETE and TRUNCATE have the same effect on the underlying file. If the table was made with MyISAM, the file size becomes 0, either way. If the table was made with InnoDB and Per-Table Tablespaces, the file size is unchanged, either way. But in theory this could change, giving another “efficiency” reason to use TRUNCATE.

Against the advantages, one has to lay the things that are a little peculiar. They’re documented, but frankly the MySQL Reference Manual is getting a bit clogged with all the documentations of unintuitive non-obviosities.

The SQL:2008 Standard

In December 2007 the SQL standard committee published a draft document for the next standard, which I’m tentatively calling SQL:2008 in case it comes out in 2008. The draft document has new sections which describe non-core (that is, optional) features F200 and F202, which amount to support of the TRUNCATE statement.

TRUNCATE TABLE base_table_name
[ CONTINUE IDENTITY | RESTART IDENTITY ];

Differences between SQL:2008 and current MySQL implementation:

1. SQL:2008 allows
[ CONTINUE IDENTITY | RESTART IDENTITY ]
The default is CONTINUE IDENTITY.
MySQL has no such clause, and it nowadays (after Bugs 1514, 5033, 11946, 14945, 15754) handles auto_increment by sayng that TRUNCATE TABLE causes “restart”. It’s still not working correctly with partitions, see Bug#35111 ‘Truncate a MyISAM partitioned table does not reset the auto_increment value”.

2. SQL:2008 says TRUNCATE TABLE’s function is to “delete all rows” and TRUNCATE TABLE is a data-change statement, not a schema-change statement. MySQL says TRUNCATE TABLE is “DDL” because there’s an implied DROP TABLE, which means we treat it as a schema change.

There’s actually no problem about the main issue: The standard says “… all rows are deleted from T” but it doesn’t refer to sections that would cause execution of triggers, which means the effect is not what you’d expect with DELETE, but what you’d expect from drop + create. However, the fact that MySQL thinks TRUNCATE is DDL (vaguely like the standard’s “schema change” classification), and the tendency to autocommit, are certainly differences.

3. SQL:2008 says TRUNCATE TABLE is illegal on a referenced table of a foreign key. MySQL allows it for InnoDB, and WL#148 (the worklog task for foreign keys in MySQL version 6.1) disallows it for all tables where foreign keys are involved.

So right now we’re allowing something that we shouldn’t, and in future we won’t allow something that we should.

4. SQL:2008 says TRUNCATE TABLE should return “no data” if there were no rows in the table. MYSQL returns “okay” just as it does for DELETE. This is WL#4194 “Standard handling of ‘no data’”.

This relates as well to MySQL’s display of the number of affected rows.

5. SQL:2008 says TRUNCATE TABLE table_name. MySQL says TRUNCATE [TABLE] table_name.

6. SQL:2008 says TRUNCATE is a reserved word. MySQL says it’s not.

By the way, you’ll generally find that words that begin MySQL-server statements are reserved words, but besides TRUNCATE there are other exceptions: BACKUP BEGIN CACHE CHECKSUM COMMIT DEALLOCATE DO EXECUTE FLUSH HANDLER HELP INSTALL PREPARE REPAIR RESET RESTORE SAVEPOINT SLAVE START STOP UNINSTALL XA. I’ve seen a suggestion that TRUNCATE should be special, see Bug#12074 “TRUNCATE keyword not hilighted like SELECT, INSERT, etc.”.

7. SQL:2008 says the required privilege is “be the schema owner”. MySQL says the required privilege is DROP.

8. SQL:2008 says there is no trigger action. MySQL usually has no trigger action, but there’s a bug, Bug##27935 “TRUNCATE fires DELETE trigger if InnoDB table is referenced by a foreign key”.

All your standards are belong to us

I’ve just noticed that my future boss, Sun Microsystems CEO Jonathan Schwartz, has started off one of his blog posts by saying
“De facto standards are the only ones that matter.”
This is my opportunity to say “Brilliantly put, boss.” I’ll only add that the de facto standard (what all the SQL vendors do) and de jure standard (what SQL:2008 says) tend to converge.

I can see that because, in this wonderful age when every company puts its documentation online, I can act like a grocery-store price checker and wander among the competitor’s shelves. I can find TRUNCATE’s description in the Oracle 11g manual, the SQL Server 2008 manual, the Informix manual, the PostgreSQL 8.3 manual, and the DB2 Version 9.1 for z/OS manual. Of course, I find that there are details where the other vendors don’t follow the standard, just as MySQL doesn’t, see above. But in general, and in many of the details, I find that de facto = de jure.

I’ve seen this so often, and probably will devote a whole column to the subject someady … Convergence happens. Sometimes convergence happens because individual vendors commit themselves to adhering to an established standard feature. Sometimes the standard committee tries to reflect what goes on already, which is what’s happening here. In any case, the reason is rarely “academic”. TRUNCATE is hardly the sort of feature that one would add because of orthogonality or relational theory.

Even when the new standard becomes official, I doubt that the material on TRUNCATE will muchly affect MySQL unless users make requests. If any of the above are really burning issues for your installation, you can fill in a feature request form on bugs.mysql.com. Some of the TRUNCATE-related feature requests that I’ve seen already are
Bug#5507 TRUNCATE should work with views
Bug#17092 allowing TRUNCATE TABLE while LOCKing done
Bug#19405 No way to truncate a partition.
We have in-development plans for other tasks that mention TRUNCATE adjustments:
WL#3241 Truncate table in Maria
WL#2418 DDL Triggers (including perhaps a TRUNCATE trigger)
WL#3821 TRUNCATE for VIEWs (including perhaps a TRUNCATE privilege)
WL#3129 Consistent clauses in CREATE and DROP (including perhaps TRUNCATE IF EXISTS table1, table2 …)

Posted in Personal Opinion | 1 Comment »

« Previous Entries

New Features In MySQL 6.x is proudly powered by WordPress MU running on Blogs.mysql.com.
Entries (RSS) and Comments (RSS).