This article gives some general guidelines on performance tuning. It can be a huge subject, but there are some basics to consider that will take you a very long way. The last paragraph covers MySQL Cluster (ndbcluster) specifics, and the first part covers innodb engine specifics.
The thing that affects performance the most (in order) is:
- Queries + Schema --> nothing can save you from bad queries and wrong indexes.
Information about this you can find in the Query Monitor.
Do you have full table scans?
Are the correct indexes used?
Do you read (rows examined) many rows, but none or only one or two is actually sent to the application (rows_sent) ?
If Yes on any of the above you should add indexes or potentially try to express the query differently.
- innodb_buffer_pool --> it is a cache, so ideally the data you need most frequently should be there. E.g, you may have a 100GB database, but 90GB consists of a history logs, which aren't needed to be queried all the time (perhaps for reporting once a week or something). In that case it does not make sense to have a buffer pool of 100GB just to be able to very quickly generate the report once a week, but rather have perhaps 8-16GB (the latter to accomodate data growth) to account for the _active_ data set, that is used all the time . In the Health Report, look at the Innodb Buffer Pool Hit Ratio. It should be 998/1000 or more.
The impact of a too small innodb buffer pool is that you will end up with having more disk seeks and reads to bring your data up to the application. Disk i/o is a lot slower than fetching data from a cache (buffer pool). Disk reads can be really slow if you don't have a fast disk sub system, like SSD.
- For larger data sets ( data set >> server RAM ), then the Innodb Redo log can be increased (it is not so easy to do a Galera Cluster.. more about this later), and then if you have a large amount of System Memory, and have an innodb_buffer_pool > 8GB, it may make sense to set innodb_buffer_pool_instances to 4 or 8. This will reduce mutex contention inside the MySQL Server (innodb engine) and speed up queries. Especially, if you see "spiky" performance in the graphs for Select, Update, Insert, Deletes) then it may be because the buffer pool is flushed to disk. Increasing the redo log, and increasing the number of innodb_buffer_pool_instances will help you then.
But at the end of the day, Query and Schema optimization makes the most bang for the buck, ask the Facebook Db Team about this :) . If you have full table scans etc, thats going to bite you really hard and the application will never scale with the number of requests or users.
Table cache and thread cache are important too, check the Health Report page we generate. it is the most common rules.
Setting up a Test / QA Cluster mimicking the Production environment for your developers to try their code and queries on will save you a lot of time. Most often the queries are tested on the developer's laptop with a very limited data set. Anything will run fast then.
For MySQL Cluster, there are no parameters that will give you better performance, except:
- ndb_cluster_connection_pool (2x the number of cores on the mysql server is a good starting point)
- ndb_autoincrement_prefetch_sz (if you use autoincrements and inserts a lot of data)
- thread cache / table cache