Release of OLTP-Bench
After several months of development, I am happy to announce the official release of OLTP-Bench, an extensible “batteries included” DBMS benchmarking testbed ! This project is ought to be an aggregator of popular and research oriented OLTP benchmarks. It provides a portable framework for workload generation and an API for integrating benchmark queries. Besides it uses JDBC API which allows you to connect to any DBMS systems with a proper driver.
OLTP-Bench has modular architecture for hooking new benchmarks, hopefully I’ll write a detailed how-to guide but for now you can already have a look at the implemented benchmarks to have an idea on how to write your queries and use the workload generator. We ported several popular and interesting benchmarks with varying complexity and domain application, it includes: TPCC-like, TATP, SEATS, AuctionMark, YCSB, Wikipedia, Twitter, JPAB, Epinions and Resource Stresser. More information on each benchmark is available here.
The workload generator is driven by an XML configuration file; users have to define phases of execution composed of a target rate (expressed in transactions per seconds), the duration to apply the rate and also the weight of each procedure (or query) of the benchmark. By combining phases one can simulate very complex situations to stress and test the database system. Doing so we have conducting hundreds of experiments on different systems and configuration more details are available here.
Hopefully this will get the database community excited as our goal was not to write “Yet another benchmark” but rather engage everyone to share their configuration and results.
Memory leaks in Java
Programming in Java relieves from a certain burden of self managing memory allocation/de-allocation, however, that doesn’t mean completely overlooking this aspect at the risk of creating memory ogres instead of programs.
The JVM will try to allocate memory up to the max memory heap you specify, this pool of memory is reclaimed by the Garbage Collector once your instances are dereferenced. Obviously if you keep on creating objects you’ll fill up your memory and the JVM will throw an exception!
java.lang.OutOfMemoryError: Java heap space
Such errors are quite easy to spot with a debugger; The real fun starts when you’ve got a memory leak in your program (or a bug in a 3rd party library you use) which will only appear under some heavy load and long runs. Finding the error might be tedious because that involves trying to reproduce it at first, analyzing memory maps, sort what is normal and what’s not; fortunately there is a plenty of tools out there to help.
Monitor your application with Jconsole: This monitoring utility could be very useful to spot patterns of memory growth in your application. Fist pass a couple of options to your JVM to activate JMX remote monitoring, for example here I open the 7777 port and disable all security (you might want to activate it .. though ![]()
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=7777 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
Then from a remote machine launch Jconsole client using jconsole host:port
Produce memory heaps by passing the following parameter to the JVM:
-XX:+HeapDumpOnOutOfMemoryError
Use Jmap to create a dump on the fly i.e: while the application is running.
Look at the content of the Heap: In order to examine the heap files you produce you need to read them with a specific tool. Jhat is a small utility that reads the heap file and launces a web browser from which you can browse its content, you’ll find: current objects, their references, size, hiarchy and so on. A histogram summary might be the first thing you want to look at since it gives the list of classes ordered by number of instances and memory usage.
So far all those tools are shipped with latest versions of JVM, now if you want a more detailed insight or automatic leak suspect report, there is some free tools like: Eclipse Memory Analyzer tool and Netbeans profiler.
All the tools listed here were usually enough for me to find memory leaks. Although a bit tedious, my strategy is to create heap dump files on the fly at different moments and then examine the differential.
This week we’ve released two of our current…
This week we’ve released two of our current projects as open sources, YeY! The first is TSQL Parser, an extension of JSQLParser to support temporal SQL extension and hence, query temporal databases. The parser also supports query rewriting, ie: translating TSQL into standard SQL with added valid transaction timestamps [tstart-tend]. The other related project is Temporal JDBC Proxy, which basically wraps your driver’s connection and seamlessly make it support TSQL. Current supported dbms are MySQL and Postgres!
KMeans on MAHOUT
Lately I’ve been mainly working on parallel computing, especially that I have access to this powerful LONI supercomputers with hundred of nodes and lots of memory. Specifically I wanted to compare MPI, openmp and Hadoop, both in terms of efficiency and ease of use.
K-means being my favorite easy to implement clustering algorithm, I first tried to modify my C++ code and exploit the Hadoop Pipes library to implement a map reduce program. I confess it was the first time for me writing some Hadoop code, but when you read about it the word “easy” pops a lot. If your program differs from wordcount, don’t get fooled. Rethinking an algorithm in M/R is essentially the first thing to do on paper, however those little chunk of data and their movement, quickly becomes a nightmare to implement, especially that the only documentation is word count. How about having multiple types of inputs files? specific format? and algorithm that iterates taking the output from a previous iteration?
Well definitely I dropped the idea of using the C++ Pipes, and jumped to a machine learning library Mahout that implements a number of clustering algorithms including K-Means. I cracked open the Mahout’s code and started to discover what is really needed to implement M/R code on Hadoop. While I am not going to cover that here, a word of advice, if you are a beginner, stick to Java and get you a book.
Using Mahout saves you the time to write the desired algorithm in M/R but still requires that you write some code to transform your input file into something it can understand, namely SequenceFile, and this is the input preparation phase. I really cannot elaborate more about that for now because in my case I found a work around! so basically if you use command line, and have a valid transformed sequence file as your input for both your data points and a self generated clusters (unless you specify k):
$ bin/mahout kmeans -i input -c clusters -o output -k num_clusters -dm measure -x maxIter -cd convergence_delta
Mahout provides some examples to run the corresponding implementation on a predefined raw datasets. For Kmeans, it has this example of syntheticcontrol data, where the input is space delimited. Now, depending on what you want to run you can use the provided examples but make sure the data format used corresponds to yours !
In the Kmeans example case you should have this kind of input: (space delimited vectors)
35.5351 41.7067 39.1705 48.3964 .. 38.6103
24.2104 41.7679 45.2228 43.7762 .. 48.8175
You still need to push this data to the HDFS under input directory for example:
$ $MAHOUT_HOME/bin/mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job -i input -o output -t1 .. -t2 .. -cd .. -x 10
Lets examine the options: -cd is the convergence delta, -x is the max iteration. Like me you’d legitimately ask “but, where is kmeans’ K?” well it seems that this example runs the Canopy algorithm using the t1 and t2 distances and generates the initial cluster … I don’t want that! what I need is to provide k (integer) as an input and randomly select K points from my clusters.
Remember in the command line mode you have the K parameter, so there should be a Random Generator code somewhere! After reading the corresponding piece of code I have finally understood where it is and how to use it
. I figured I can push my code to mahout so that others can use it directly, but basically I created my own Job class based on the syntheticcontrol and called it generic, then I’ve made the following modification to the source code:
$MAHOUT_HOME/examples/src/main/java/org/apache/mahout/clustering/generic/kmeans/Job.java
Path directoryContainingConvertedInput = new Path(output, DIRECTORY_CONTAINING_CONVERTED_INPUT);
log.info("Preparing Input");
InputDriver.runJob(input, directoryContainingConvertedInput, "org.apache.mahout.math.RandomAccessSparseVector");
// log.info("Running Canopy to get initial clusters"); // Old code using CanopyDriver
// CanopyDriver.run(conf, directoryContainingConvertedInput, output, measure, t1, t2, false, false); // Old Code calling the Canopy Driver
log.info("Running random seed to get initial clusters");
Path clusters= new Path(output, Cluster.INITIAL_CLUSTERS_DIR);
clusters = RandomSeedGenerator.buildRandom(directoryContainingConvertedInput, clusters, k, measure);
log.info("Running KMeans");
KMeansDriver.run(conf,
directoryContainingConvertedInput,
clusters,
output,
measure,
convergenceDelta,
maxIterations,
true,
false);
If you are interested you can check the code of the InputDriver that convert the space separated file into a SequenceFile, and the RandomSeedGenerator that takes the converted input and extract k random vectors and put them into clusters directory.
Finally, if you want to run Kmeans using Mahout and you have a space separated input file you can use my example code, (waiting for the merge here is the patch: patch ):
$MAHOUT_HOME/bin/mahout org.apache.mahout.clustering.generic.kmeans.Job -i input -o output -k .. -cd .. -x 10
Bottom line, you really need to read some to write some useful Hadoop M/R code in order to grasp the idea of sequencefiles, what interfaces your classes should implement and what not! a good place to start are the example packages and Mahout.
The last weekend I have received my vers…
The last weekend I have received my version of “Effective C++” 3rd edition by Scott Myers. That book was recommended to me by Jay and Toru. It offers a number of best practices in C++ programming (55 items) so far it’s very informative and the author really addresses each point from all perspectives and explains why things should be the way he says so.
One point, however, I regret not finding THE section I was looking for, the dark side of C++ “pointers”, the author talks a lot about smart pointers as the best way to avoid memory leaks. To be honest I’ve never used them, but my real concern was to learn aspects of allocating memory from the heap, when to do it and what are the best practice, not pointers as a “variable” so to speak.
In general I am satisfied with the book and in fact I need to stop reading and start refactoring some of my code
I just finalized a first version of the …
I just finalized a first version of the query cache plugin for Drizzle. The good news is that the version is functional (YES you can try it out) but I guess it’s just a starting point for finding bugs and adding the new stuff in , in short … make it really functional.
In a previous article I have exposed the plugin query cache interface and hooks in the Drizzle kernel. Few modifications were made to that part and I will talk about that and all the technical details in a next article, for now I am just going to give a “how-to” to get you started using the plugin.
First we need to get the branch where the latest code is pushed from my launchpad directory with the command:
$ bzr branch lp:~dedzone/drizzle/query-cache-gsoc
The plugin has two dependencies, Memcached and libmemcached, so make sure have these two installed before going further otherwise the compilation would fail. Then compile the branch and specify the query cache plugin (it’s not by default):
$ ./config/autorun.sh
$ ./configure --with-memcached-query-cache-plugin
$ ./make -j2 && make install
If you wish you can run the test plugin’s test suite (try to have a memcached instance runing on localhost:11211, if the test poses problems)
$./tests/dtr --suite=memcached_query_cache
After that you would have an operational Drizzle server, and since we are going to use the memcached query cache plugin, we will also need a runing instance of memcached (it can be local or remote). Note that the option –add-plugin would cause our plugin to get loaded, again it’s not loaded by default, and don’t forget to create a directory for your data exp: /home/user/db
$ memcached -l 127.0.0.1 -p 11211 -d
$ drizzled --datadir=/home/user/db/ --add-plugin=memcached_query_cache
Now that we are all set, lets try out the memcached query cache plugin by starting a new client (drizzle c++ client or boots) and connect to drizzle. At this point the plugin is loaded and pointing to a default memcached server localhost:11211, if you have a different configuration you can change it dinamically with
drizzle> set global query_cache_servers="localhost:11311 remotehost:11211"
You can also set the expiration time of a memcached entry (default is 0, never), that is a global parameter that everybody will be bound to, but can also be changed dynamically.
drizzle> set global query_cache_expiry= 1000;
So, where is the fun part? I have launched a discussion on Drizzle mailing list to decide on the syntax to cache or not a query. My very first orientation was to extend the sql syntax to and SQL_CACHE and SQL_NO_CACHE hints, but (thanks to all the contributor) switched to a server command that will switch the on or off the cache on demand and per Session.
drizzle>set query_cache_enable= on;
Now your next Select statement will be cached and if you execute the exact same command (including spacing) the results you will get will come out from memcached \o/. Well, not totally true, a Select statement to be cached has to be cacheable, i.e. time functions, random, user, database .. are not deterministic, thus must not be cached. Also user defined functions are not supported (we just don’t know what you’ve done out there), and queries referring to Data_dictionary schema, that will make the server stuck in a time point … not good
But, How would you know if your query is really cached! there is in fact a local cache structure that stores the current cached queries that you can freely query
drizzle> select * from emp;
+----------+------+
| name | age |
+----------+------+
| ded | 28 |
| ded2 | 27 |
| vjsamuel | 19 |
+----------+------+
3 rows in set (0 sec)
drizzle> select * from emp where name like "ded";
+------+------+
| name | age |
+------+------+
| ded | 28 |
+------+------+
1 row in set (0 sec)
drizzle> select * from emp where age < 25;
+----------+------+
| name | age |
+----------+------+
| vjsamuel | 19 |
+----------+------+
1 row in set (0 sec)
drizzle> select sysdate() from emp where age < 25;
+---------------------+
| sysdate() |
+---------------------+
| 2010-08-16 03:19:52 |
+---------------------+
1 row in set (0 sec)
drizzle> select * from data_dictionary.query_cache_entries;
+----------------------------------+--------+-----------------------------------------+
| key | schema | sql |
+----------------------------------+--------+-----------------------------------------+
| 376555ba10dae3d09ca6df52a0839be6 | cool | select * from emp where name like "ded" |
| b1a551e4aa3f717798380e41dcee0960 | cool | select * from emp |
| e2e2402c2b29ac0c2fad1769a31d2d0a | cool | select * from emp where age < 25 |
+----------------------------------+--------+-----------------------------------------+
3 rows in set (0.01 sec)
Here we issued 5 select statement and the 3 of them is in the cache except the ones calling sysdate(), or the data_dictionary. The hash key that you see is an md5 of your query, it’s the sole identification of your query cache in memcached and locally. One can play and look at the local cache content (It is a lovely protobuff message) with the following function:
drizzle> select print_query_cache_meta("e2e2402c2b29ac0c2fad1769a31d2d0a");
key: "e2e2402c2b29ac0c2fad1769a31d2d0a"
schema: "cool"
sql: "select * from emp where age < 25"
select_header {
table_meta {
schema_name: "cool"
table_name: "emp"
}
field_meta {
field_name: "name"
field_alias: "name"
table_name: "emp"
table_alias: "emp"
schema_name: "cool"
}
field_meta {
field_name: "age"
field_alias: "age"
table_name: "emp"
table_alias: "emp"
schema_name: "cool"
}
}
That’s nice, but what happen if the data cached gets tempered with (DML, DDL) ! don’t worry, the invalidation will take that in charge, in fact the plugin will keep a track of all the tables referenced in a query and will build a table similar to query_cache_entries, but will contain the reverse information, thus the table and all the entries of the cache making reference to it, that will avoid the burden of iterating overall the cache.
drizzle> select * from data_dictionary.query_cached_tables;
+---------+--------------------------------------------------------------------------------------------------------+
| Table | Cache_Keys |
+---------+--------------------------------------------------------------------------------------------------------+
| coolemp | ::b1a551e4aa3f717798380e41dcee0960::376555ba10dae3d09ca6df52a0839be6::e2e2402c2b29ac0c2fad1769a31d2d0a |
+---------+--------------------------------------------------------------------------------------------------------+
1 row in set (0 sec)
Next, if one will change (or drop) the content of the table emp in the schema cool the replication system will cause the invalidation of all the cache entries of the cache:
drizzle> delete from emp where name like "ded1";
Query OK, 1 row affected (0 sec)
drizzle> select * from data_dictionary.query_cached_tables;
Empty set (0 sec)
drizzle> select * from data_dictionary.query_cache_entries;
Empty set (0 sec)
Final capability is the query_cache_flush() function, it will basically wipe out the whole content the local cache and the content of the memcached instances registred (make sure your instances are not used for another purpose)
drizzle> select query_cache_flush();
+---------------------+
| query_cache_flush() |
+---------------------+
| 1 |
+---------------------+
1 row in set (0 sec)
drizzle> select * from data_dictionary.query_cached_tables;
Empty set (0 sec)
drizzle> select * from data_dictionary.query_cache_entries;
Empty set (0 sec)
Conclusion:
That was a small kinda manual to get you started with the query cache plugin, I hope to extend the functionalities to support fine grained invalidation in the near future. But for now there is still many work to be done to get rid of all the withstanding problems.
Drizzle QueryCache plugin [Plugin Interface]
It has been a while since I didn’t wrote on my GS0C project “Query Cache”, though many advancements have been done. So far I can say that I have a much clearer idea of the query processing flow and the different data structures used. I have been able to write a first plugin that can actually cache a query into memcached and keep some metadata on a local map object. Another cool option is to use data dictionary views to check int real time the cache content (thanks to the very easy interface). The next step from here is to write the retrieval functionality, this is probably the most difficult part for me cuz I still need to understand the functioning of the client interface.
Before making any further step, here is wiki I have posted on the of drizzle’s wiki page. It is basically an explanation of the query cache plugin interface and the hooks, this has yet to be approved and merged into drizzle trunk.
Principle of the Query Cache
The Query Cache permits to return the results of a query from a cache repository if the query has been already cached; or, to add its results to a cache repository if it’s not cached yet. Therefore the simplest capability that a Query Cache plugin interface needs to provide is the ability to say if the received query is cached or not.
Then:
If the query is cached:
- Retrieve the resultset from the cache and skip the query parsing/processing
If the query is not cached:
- Initiate the Query Cache plugin, where the specific plugin will:
- Check if the query is cacheable
- Reserve a temp area and return a pointer to the session
- Build the hash key
- etc
- Add the results to the current temp area.
- Push the the temp variable and the respective cache key to the cache.
The Query Cache plugin Interface design
In the current design we have introduced the following functions to the interface (drizzled/plugin/query_cache.h)
static bool isCached(Session *session);
This method is responsible to probe the used cache system for an entry corresponding to the received query.
”Note: The plugin is responsible to check if the query is a Select.”
Static bool sendCachedResultset(Session *session);
This method is called directly if the query is cached and skip the lexical and query processing.
static bool prepareResultset(Session *session);
This method is called before the query handling (see: handle_select() method) in order to prepare the resultset that will receive the generated data. An opened session will have its own resultset variable and current query key, the plugin will have to initiate these two variables added to the Class Session.
static bool insertRecord(Session *session, List<Item> &item);
The result rows generated will ultimately go though the client interface, the query cache system intercept these calls and populate the resultset with the data being send to the client (see: select_send() method)
static bool setResultset(Session *session);
Once the query processing is terminated this method will send the resultset variable to the cache repository and finalize the environment variables.
The query resultset being stored and populated in the session is of type Google Protobuf:
drizzled/message/resultset.proto
Plugin Hooks
The following branch contains the plugin interface hooks in the kernel drizzle:
[lp:~dedzone/drizzle/query-cache-hook/changes]
The query interception
Upon reception of a query, (sql_parse.cc:717) in the mysql_parse method, and before even proceeding to the lexical and query processing, a call to ”isCached(session)” is made, a session pointer is passed for the query text, schema, and may be other parameters (for future use).
If isCached returns ”’true”’ then we’ll try to retrieve the results from the cache with ”sendCachedResultset”. If the retrieving operation succeeds (returns false) then we skip the remaining steps.
In the contrary – ie:
- If fetching the results fails or
- If the query is not cached
Then => proceed normally with the query parsing and processing
void mysql_parse(Session *session, const char *inBuf, uint32_t length)
{
lex_start(session);
session->reset_for_next_command();
/* Check if the Query is Cached if and return true if yes
* TODO the plugin has to make sure that the query is cacheble
* by setting the query_safe_cache param to TRUE
*/
bool res= true;
if (plugin::QueryCache::isCached(session))
{
res= plugin::QueryCache::sendCachedResultset(session);
}
if (not res)
{
#if defined(DEBUG)
errmsg_printf(ERRMSG_LVL_DBUG,_("Results retrieved from cache"));
#endif /* DEBUG */
return;
}
LEX *lex= session->lex;
Lex_input_stream lip(session, inBuf, length);
bool err= parse_sql(session, &lip);
........
........
lex->unit.cleanup();
session->set_proc_info("freeing items");
session->end_statement();
session->cleanup_after_query();
}
The Cache preparation and reception of the Resulset
In our journey executing a query, if the query is a Select will land in the (sql_parse.cc:494) execute_sqlcom_select method, this is basically the entry point of the query processing because it prepares to launch the handle_select method. before making this call will insert our prepareResultset hook, which will instruct the plugin implementation to:
- Prepare a cache to receive the resultset, and link it to the session::resultset pointer.
- create a hash key for our query, and link it to the session::query_cache_key member.
At this point the Session will be ‘aware’ that there is a plugin trying to cache the results of the query.
Immediatly after executing the handle_select, where all the resultset were generated, send to the client and added to the current resultset (As we shall see in the next session), we make a setResultset call, this will:
- Push the resultset to the cache area initiated by the plugin
- Reset the session members, resultset and query_cache_key.
bool execute_sqlcom_select(Session *session, TableList *all_tables)
{
LEX *lex= session->lex;
select_result *result=lex->result;
bool res= false;
.....
.....
if (!result && !(result= new select_send()))
return true;
/* Init the Query Cache plugin */
plugin::QueryCache::prepareResultset(session);
res= handle_select(session, lex, result, 0);
/* Send the Resultset to the cache */
plugin::QueryCache::setResultset(session);
if (result != lex->result)
delete result;
}
}
return res;
}
Populating the resultset
The idea behind adding data to the resultset is to intercept the send_data() method in (select_send.h:87) When sending a row to the client interface. So the hook will just pass the row (items) to the query cache plugin using insertResultset(session, items).
/* Send data to client. Returns 0 if ok */
bool send_data(List<Item> &items)
{
.......
.......
while ((item=li++))
{
if (item->send(session->client, &buffer))
{
my_message(ER_OUT_OF_RESOURCES, ER(ER_OUT_OF_RESOURCES), MYF(0));
break;
}
}
/* Insert this record to the Resultset into the cache */
if (session->query_cache_key != "" && session->getResultsetMessage() != NULL)
plugin::QueryCache::insertRecord(session, items);
session->sent_row_count++;
......
}
