SQL Master: Part 1 – Divide a job

This is one of few posts in which we will go trough problematics of writing SQL queries for complex high

This is one of few posts in which we will go trough problematics of writing SQL queries for complex high user rate systems on MySQL DBMS

To write a good query one must first think about system on which it is going to work.
Let’s start simple. If we want to extract data from multiple tables it is often recommended to use JOIN statement so that data could be fetched in one query. This is very good practice for smaler DBs.

But if you have multiple tables with great amount of data it is often question how to write good query.

Problem

For example:
You need to extract data from 8 tables sort them by timestamp descending and limit the result to 25 rows. This can all be done in single query. In that way you will give all job to DBMS and the rest of your system will be “free”. That kind of query would have about 1000 lines. It will do all the job and application should then just present data (on site or trough some service). And this is “great”. Your system is free (except DBMS), your app is not complex and it all works good.

That is the greatest mistake you could do.

Let we explain why. First reason would be bad resource usage. While your DBMS is working at full rest of your system is practically doing nothing. Second reason is concerned about amount of data. On smaller system with low user rate and small DB one query could do the job very good. But if you have a big system with high user rate and big amount of data one query could do more damage than good. When we tested this on system with higher user rate one query never ended. Actually not that it never ended but the DBMS crashed.

With proper job disposal you could get a thousand time faster response on high user rate and big amount of data

Solution

And we started again. Now that we knew that giving all the job to DBMS is bad, but also making a lot of queries to DBMS is also bad we had to find a compromis to make it working. So we let DBMS to do some job and application to do the rest. Result was 3 queries with little over 250 lines all together. Of course complexity of an application increases but the complexity of a query decreases. In paralel testing of these solutions second solution on higher user rate was 1000 time faster than the first one.

Now you can see the importance of proper job disposal.

That would be all from us in this intro post. Next time we will show you some ideas how to collect as less as possible data from system to make things go.

Leave a Reply

Your email address will not be published. Required fields are marked *