Home > Drizzle, GSoC 2010 > I just finalized a first version of the …

I just finalized a first version of the …

I just finalized a first version of the query cache plugin for Drizzle. The good news is that the version is functional (YES you can try it out) but I guess it’s just a starting point for finding bugs and adding the new stuff in , in short … make it really functional.

In a previous article I have exposed the plugin query cache interface and hooks in the Drizzle kernel. Few modifications were made to that part and I will talk about that and all the technical details in a next article, for now I am just going to give a “how-to” to get you started using the plugin.

First we need to get the branch where the latest code is pushed from my launchpad directory with the command:

$ bzr branch lp:~dedzone/drizzle/query-cache-gsoc

The plugin has two dependencies, Memcached and libmemcached, so make sure have these two installed before going further otherwise the compilation would fail. Then compile the branch and specify the query cache plugin (it’s not by default):

$ ./config/autorun.sh
$ ./configure --with-memcached-query-cache-plugin
$ ./make -j2 && make install

If you wish you can run the test plugin’s test suite (try to have a memcached instance runing on localhost:11211, if the test poses problems)

$./tests/dtr --suite=memcached_query_cache

After that you would have an operational Drizzle server, and since we are going to use the memcached query cache plugin, we will also need a runing instance of memcached (it can be local or remote). Note that the option –add-plugin would cause our plugin to get loaded, again it’s not loaded by default, and don’t forget to create a directory for your data exp: /home/user/db

$ memcached -l 127.0.0.1 -p 11211 -d
$ drizzled --datadir=/home/user/db/ --add-plugin=memcached_query_cache

Now that we are all set, lets try out the memcached query cache plugin by starting a new client (drizzle c++ client or boots) and connect to drizzle. At this point the plugin is loaded and pointing to a default memcached server localhost:11211, if you have a different configuration you can change it dinamically with

drizzle> set global query_cache_servers="localhost:11311 remotehost:11211"

You can also set the expiration time of a memcached entry (default is 0, never), that is a global parameter that everybody will be bound to, but can also be changed dynamically.

drizzle> set global query_cache_expiry= 1000;

So, where is the fun part? I have launched a discussion on Drizzle mailing list to decide on the syntax to cache or not a query. My very first orientation was to extend the sql syntax to and SQL_CACHE and SQL_NO_CACHE hints, but (thanks to all the contributor)  switched to a server command that will switch the on or off the cache on demand and per Session.

drizzle>set query_cache_enable= on;

Now your next Select statement will be cached and if you execute the exact same command (including spacing) the results you will get will come out from memcached \o/. Well, not totally true, a Select statement to be cached has to be cacheable, i.e. time functions, random, user, database .. are not deterministic, thus must not be cached. Also user defined functions are not supported (we just don’t know what you’ve done out there), and queries referring to Data_dictionary schema, that will make the server stuck in a time point … not good :)

But, How would you know if your query is really cached! there is in fact a local cache structure that stores the current cached queries that you can freely query

drizzle> select * from emp;
+----------+------+
| name     | age  |
+----------+------+
| ded      |   28 |
| ded2     |   27 |
| vjsamuel |   19 |
+----------+------+
3 rows in set (0 sec)

drizzle> select * from emp where name like "ded";
+------+------+
| name | age  |
+------+------+
| ded  |   28 |
+------+------+
1 row in set (0 sec)

drizzle> select * from emp where age < 25;
+----------+------+
| name     | age  |
+----------+------+
| vjsamuel |   19 |
+----------+------+
1 row in set (0 sec)

drizzle> select sysdate() from emp where age < 25;
+---------------------+
| sysdate()           |
+---------------------+
| 2010-08-16 03:19:52 |
+---------------------+
1 row in set (0 sec)

drizzle> select * from data_dictionary.query_cache_entries;
+----------------------------------+--------+-----------------------------------------+
| key                              | schema | sql                                     |
+----------------------------------+--------+-----------------------------------------+
| 376555ba10dae3d09ca6df52a0839be6 | cool   | select * from emp where name like "ded" |
| b1a551e4aa3f717798380e41dcee0960 | cool   | select * from emp                       |
| e2e2402c2b29ac0c2fad1769a31d2d0a | cool   | select * from emp where age < 25        |
+----------------------------------+--------+-----------------------------------------+
3 rows in set (0.01 sec)

Here we issued 5 select statement and the 3 of them is in the cache except the ones calling sysdate(), or the data_dictionary. The hash key that you see is an md5 of your query, it’s the sole identification of your query cache in memcached and locally. One can play and look at the local cache content (It is a lovely protobuff message) with the following function:

drizzle> select print_query_cache_meta("e2e2402c2b29ac0c2fad1769a31d2d0a");

key: "e2e2402c2b29ac0c2fad1769a31d2d0a"
schema: "cool"
sql: "select * from emp where age < 25"
select_header {
table_meta {
schema_name: "cool"
table_name: "emp"
}
field_meta {
field_name: "name"
field_alias: "name"
table_name: "emp"
table_alias: "emp"
schema_name: "cool"
}
field_meta {
field_name: "age"
field_alias: "age"
table_name: "emp"
table_alias: "emp"
schema_name: "cool"
}
}

That’s nice, but what happen if the data cached gets tempered with (DML, DDL) ! don’t worry, the invalidation will take that in charge, in fact the plugin will keep a track of all the tables referenced in a query and will build a table similar to query_cache_entries, but will contain the reverse information, thus the table and all the entries of the cache making reference to it, that will avoid the burden of iterating overall the cache.

drizzle> select * from data_dictionary.query_cached_tables;
+---------+--------------------------------------------------------------------------------------------------------+
| Table   | Cache_Keys                                                                                             |
+---------+--------------------------------------------------------------------------------------------------------+
| coolemp | ::b1a551e4aa3f717798380e41dcee0960::376555ba10dae3d09ca6df52a0839be6::e2e2402c2b29ac0c2fad1769a31d2d0a |
+---------+--------------------------------------------------------------------------------------------------------+
1 row in set (0 sec)

Next, if one will change (or drop) the content of the table emp in the schema cool the replication system will cause the invalidation of all the cache entries  of the cache:

drizzle> delete from emp where name like "ded1";
Query OK, 1 row affected (0 sec)
drizzle> select * from data_dictionary.query_cached_tables;
Empty set (0 sec)
drizzle> select * from data_dictionary.query_cache_entries;
Empty set (0 sec)

Final capability is the query_cache_flush() function, it will basically wipe out the whole content the local cache and the content of the memcached instances registred (make sure your instances are not used for another purpose)

drizzle> select query_cache_flush();
+---------------------+
| query_cache_flush() |
+---------------------+
|                   1 |
+---------------------+
1 row in set (0 sec)
drizzle> select * from data_dictionary.query_cached_tables;
Empty set (0 sec)
drizzle> select * from data_dictionary.query_cache_entries;
Empty set (0 sec)

Conclusion:

That was a small kinda manual to get you started with the query cache plugin, I hope to extend the functionalities to support fine grained invalidation in the near future. But for now there is still many work to be done to get rid of all the withstanding problems.

About these ads
Categories: Drizzle, GSoC 2010
  1. August 16, 2010 at 6:22 pm | #1

    Wonderful work so far, Djellel!

    Great to see the prototype in action :)

    -jay

  2. August 19, 2010 at 10:02 am | #2

    This is beautiful and powerful.
    Great Job!

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: