Learn MySQL Internals and Query Performance Optimization Techniques

Learn MySQL Internals and Query Performance Optimization Techniques

Over the course of the past few months, I have been publishing articles summarizing a really nice book called High Performance MySQL. I have only posted about the first 7 chapters which I think are the most essential ones for anyone learning relational databases and especially MySQL.

In this article, I will give:

  • A quick overview of the book.
  • An overview of what each chapter will teach you.
  • My review of the book.

Quick Overview

High performance MySQL takes you on a journey of learning advanced techniques for everything from designing schemas, indexes, and queries to tuning your MySQL server, operating system, and hardware to their fullest potential.

The book goes over a lot of concepts when it comes to database internals as you will get to learn a lot about how different components are designed which will give you a pretty good introduction that's very important to understanding any discussed optimization techniques.

Chapters Overview

The series I have published discusses only the first 7 chapters which are in my opinion that most essentials ones in the book. It goes as follows:

1. MySQL Architecture and History

This chapter provides a high-level overview of the MySQL server architecture. It starts by a discussion of MySQL's logical server architecture, it proceeds to discuss topics such as connection management, query optimization and execution, concurrency control, transactions, isolation levels and finally a discussion of the different MySQL storage engines. Many of those topics have dedicated chapters to them so they will only be discussed briefly in this chapter.

2. Benchmarking MySQL

This chapter teaches you how to methodically benchmark your MySQL instance performance. It starts by a discussion of what benchmarking means, the different possible benchmarking strategies and the available benchmarking tools that you can utilize.

3. Profiling Server Performance

This chapter teaches how to profile your MySQL server performance, it starts by a discussion of profiling means, then it moves to discuss topics like application instrumentation, profiling the server workload, profiling single queries, useful tips on how to profile and detect slow queries and it ends with a detailed case study of how profiling can help detect server problems.

4. Optimizing Schema and Data Types

This chapters teaches you schema design for good performance. The chapter starts with a discussion of available data types in MySQL and how to choose the optimal data type for your use case, then it goes into a discussion of some choices when it comes to the logical design of a schema such as normalized and denormalized schema designs, then it discusses how to mix both design approaches to speed up queries by using techniques such as cache and summary tables and since no schema stays the same, the chapter ends with some advice on the use of ALTER TABLE statements.

5. Indexing for High Performance

This is, in my opinion, the most important chapter in the series because designing a good and an efficient schema is not the whole story. Having indexes is critical for good performance specially as the database grows. Indexing optimization is the most powerful way to improve query performance as choosing the correct indexes can improve performance by orders of magnitude in many cases, so it's essential to have a good understanding of indexes and how they work and how to employ them to your benefit.

This chapter starts by discussing the basics of indexes, their types, their internal structure and their benefits. Then it goes on to discuss indexing strategies, such as, choosing the indexed columns, choosing the columns order, clustered indexes, covering indexes and how performance is affected by redundant, duplicate and unused indexes. It ends with a case study on indexes that puts everything together in a toy project.

6.1 Query Performance Optimization| Part 1

This chapter starts by some general considerations for query performance, a discussion of slow queries and the things to consider first when a query is not performing as well as expected. It then discuss data access types and how to improve the access type for queries in order to optimize them. It then discusses some of the most common mistakes that cause performance degradation and it ends with some suggestions on how to restructure queries and re-write them to get around those mistakes.

6.2 Query Performance Optimization| Part 2

This part of the chapter discusses the basics of query execution as it introduces components like the pre-processor, the query parser and the query optimizer. It then goes into an extended discussion of the query optimizer, gives examples of the different types of optimizations that it performs and it ends with a discussion of the MySQL execution plan.

7. Advanced MySQL Features

This chapter goes over some of the advanced features that MySQL has to offer and see how we can leverage such features to improve performance even more. Some of the features discussed throughout this chapter include: partitioned tables, views, triggers, events, etc.

Book Review

In short, I really enjoyed reading this book and it actually taught me a lot of stuff that I started using day to day. This book is not really beginner friendly, so if it's your first time learning about databases, this book is probably not for you because it gets really detailed really fast. In my opinion, this book is perfect for those who has already learned about relational databases and are comfortable writing SQL (your typical databases 101 level).

It being of an intermediate difficulty, it takes you from being a database beginner and solidifies your knowledge and adds a lot of new idea to your toolbox. I think it's essential for every software engineer working with databases to have the knowledge outlined in the first few chapters as it will really help in making more informed decisions when it comes to database and query design which is an important part of every software engineer's workflow.

One thing to note though, is that the book is somewhat old, even the latest edition is old. So you should be really careful when it discuss MySQL specifics and be sure to always refer to the latest docs.

Overall, it's a very nice read that I highly recommend. The series I have published summarizes the book for those who don't have a lot of time at hand. I think it would be enough to read it in most cases. It would be a good idea to check the case study at the end of every chapter (sometimes they are not part of the published articles to keep them short).

I hope you will enjoy it!