Chapter 9: Scaling and High Availability
Scalability is the ability to add capacity as needed without reducing performance.
Scaling MySQL
Planning for Scalability
Buying Time Before Scaling
Optimize Performance:
If you’re facing performance limitations now, one of the first things you should do is enable and analyze the slow query log, and then see which queries you can optimize.
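For instance, you can turn the slow query log on at runtime without restarting the server. A minimal sketch, assuming MySQL 5.1 or newer (where these variables are dynamic), the mysql-connector-python package, and placeholder credentials:

    import mysql.connector  # assumes the mysql-connector-python package

    conn = mysql.connector.connect(host="db.example.com",  # placeholder host
                                   user="admin", password="secret")
    cur = conn.cursor()
    cur.execute("SET GLOBAL slow_query_log = 'ON'")  # requires SUPER privilege
    cur.execute("SET GLOBAL long_query_time = 1")    # log queries over 1 second
    conn.close()

Once the log has collected data, a tool such as mysqldumpslow (bundled with MySQL) or mk-query-digest can aggregate it and show which queries are worth optimizing first.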
Buy more powerful hardware
Scaling Up
MySQL doesn’t tend to scale vertically very well, because it’s hard to get it to use many CPUs and disks effectively.
Even if your master server can use many CPUs effectively, there’s little chance that you’ll be able to build a slave server that’s powerful enough to keep up. A heavily loaded master can easily do more work than a slave server with the same hardware can handle, because the slave’s replication thread can’t use multiple CPUs and disks efficiently.
Single-server applications usually run into read limits first, especially if they run complicated read queries. Such queries are single-threaded inside MySQL, so they’ll use only one CPU.
Scaling Out
The simplest and most common way to scale out is to distribute your data across several servers with replication, and then use the slaves for read queries.
The other common way to scale out is to partition your workload across multiple “nodes”. A node is the functional unit in your MySQL architecture.
Functional partitioning
Functional partitioning, or division of duties, means dedicating different nodes to different tasks.
Functional partitioning usually dedicates individual servers or nodes to different applications, so each contains only the data its particular application needs.
Another possible functional partitioning approach is to split a single application’s data by determining sets of tables you never join to each other.
You still can’t scale functional partitioning indefinitely, because each functional area must scale vertically if it is tied to a single MySQL node.
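As an illustration, functional partitioning often amounts to a simple routing table in the application that maps each functional area to its dedicated node. All of the names below are hypothetical:

    # Hypothetical mapping from functional area to the MySQL node that owns it.
    FUNCTIONAL_NODES = {
        "forums":  "forums-db.example.com",
        "news":    "news-db.example.com",
        "support": "support-db.example.com",
    }

    def host_for(area):
        """Return the host of the node dedicated to this functional area."""
        return FUNCTIONAL_NODES[area]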
Data sharding
Data sharding is the most common and successful approach for scaling today’s very large MySQL applications. You shard the data by splitting it into smaller pieces, or shards, and storing them on different nodes.
Sharding works well when combined with some type of functional partitioning. Most sharded systems also have some “global” data that isn’t sharded at all. This global data is usually stored on a single node, often behind a cache such as memcached.
Sharded applications often have a database abstraction library that eases the communication between the application and the sharded data store.
A sharded data store may feel like an elegant solution, but it’s hard to build.
Choosing a partitioning key
The most important challenge with sharding is finding and retrieving data. How you find data depends on how you shard it.
The goal is to make your most important and frequent queries touch as few shards as possible. The most important part of that process is choosing a partitioning key (or keys) for your data.
Multiple partitioning keys: complicated data models sometimes need to be sharded by more than one key, so that different important queries can each find their data on a single shard.
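For example, if user_id is the partitioning key, a query filtered on user_id can be sent to exactly one shard, while a query on any other column must be scattered to every shard and its results aggregated in the application. A sketch, where shard_for() and all_shards() are assumed helpers from a hypothetical abstraction library:

    def fetch_user_posts(user_id):
        # Filtered on the partitioning key: one shard does all the work.
        shard = shard_for(user_id)  # hypothetical shard lookup
        return shard.query("SELECT * FROM posts WHERE user_id = %s", user_id)

    def fetch_posts_by_tag(tag):
        # Not filtered on the partitioning key: every shard must be queried.
        results = []
        for shard in all_shards():  # hypothetical list of shard connections
            results.extend(shard.query("SELECT * FROM posts WHERE tag = %s", tag))
        return results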
Querying across shards
Querying across shards isn’t the only thing that’s harder with sharding: maintaining data consistency is also difficult. Foreign keys won’t work across shards, so the normal solution is to check referential integrity as needed in the application.
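A hedged sketch of such an application-level check standing in for a foreign key, using the same hypothetical shard helpers as above:

    def add_comment(post_id, author_id, text):
        # Foreign keys can't span shards, so verify the parent row by hand.
        post_shard = shard_for(post_id)  # hypothetical shard lookup
        if not post_shard.query("SELECT 1 FROM posts WHERE id = %s", post_id):
            raise ValueError("post %s does not exist" % post_id)
        post_shard.execute(
            "INSERT INTO comments (post_id, author_id, body) VALUES (%s, %s, %s)",
            post_id, author_id, text)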
Allocating data, shards, and nodes
Shards and nodes don’t have to have a one-to-one relationship. It’s often a good idea to make a shard’s size much smaller than a node’s capacity, so you can store multiple shards on a single node.
Fixed allocation
There are two main ways to allocate data to the shards: the fixed and dynamic allocation strategies. Both require a partitioning function that takes a row’s partitioning key as input and returns the shard that holds the row.
Fixed allocation uses a partitioning function that depends only on the partitioning key’s value.
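A typical fixed partitioning function is just a hash or modulus of the key’s value. A sketch using CRC32, with an assumed shard count:

    import zlib

    NUM_SHARDS = 100  # assumed shard count

    def fixed_shard_for(key):
        # The shard is fully computable from the key's value; no lookup needed.
        return zlib.crc32(str(key).encode()) % NUM_SHARDS

That simplicity is also fixed allocation’s weakness: changing the number of shards changes where existing rows belong, so rebalancing means moving data.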
Dynamic allocation
The alternative to fixed allocation is dynamic allocation, where you store the key-to-shard mapping separately, as a per-unit-of-sharding lookup, typically in a directory table on the global node.
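A sketch of such a directory lookup, assuming a shard_directory table on the global node and a connection handle with a query() helper:

    def dynamic_shard_for(user_id, global_db):
        # The mapping lives in data rather than code, so individual users
        # can be moved to another shard by updating a single row.
        row = global_db.query(
            "SELECT shard_id FROM shard_directory WHERE user_id = %s", user_id)
        return row["shard_id"]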
Explicit allocation
A third allocation strategy is to let the application choose each row’s desired shard explicitly when it creates the row.
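One way to make the choice explicit and permanent is to encode the chosen shard number into the row’s ID itself, so later lookups can decode the shard without consulting a directory. A hypothetical encoding:

    SHARD_BITS = 8  # assumption: the low 8 bits of every ID hold the shard number

    def make_row_id(sequence_value, shard_id):
        # Pack the shard number into the ID when the row is created.
        return (sequence_value << SHARD_BITS) | shard_id

    def shard_from_id(row_id):
        # Any later lookup can decode the shard straight from the ID.
        return row_id & ((1 << SHARD_BITS) - 1)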
Rebalancing shards
Generating globally unique IDs
- Use auto_increment_increment and auto_increment_offset
- Create a table in the global node (see the ticket-table sketch after this list)
- Use memcached: the memcached API has an incr() function that can atomically increment a number and return the result
- Allocate numbers in batches
- Use a combination of values
- Use dual-column AUTO_INCREMENT keys (this works only for MyISAM tables)
- Use GUID values
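As a sketch of the global-table approach, the classic MySQL idiom is a one-row “ticket” table whose AUTO_INCREMENT value is bumped with REPLACE and read back with LAST_INSERT_ID(), which is per-connection and therefore safe under concurrency. The table name and db helper are assumptions:

    # Assumed schema on the global node:
    #   CREATE TABLE ticket64 (
    #       id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    #       stub CHAR(1) NOT NULL UNIQUE
    #   );
    def next_global_id(global_db):
        # REPLACE bumps the AUTO_INCREMENT counter; LAST_INSERT_ID() returns
        # the value generated on this connection, so clients never collide.
        global_db.execute("REPLACE INTO ticket64 (stub) VALUES ('a')")
        return global_db.query("SELECT LAST_INSERT_ID() AS id")["id"]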
Tools for sharding
One existing database abstraction layer with sharding support is Hibernate Shards. Hibernate Shards is a good way to store and retrieve data across many servers transparently, but it doesn’t provide some features, such as rebalancing shards and aggregating query results. It uses a fixed allocation strategy to allocate data to the shards.
Another sharding system is HiveDB, an open source framework for sharding MySQL that tries to implement sharding’s core ideas in a clear and concise way.
Scaling Back
One of the simpler ways to deal with an increasing data size and workload is to archive and purge unneeded data.
Keeping active data separate
Splitting tables into parts
MySQL partitioning
Time-based data partitioning
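As an illustration of time-based partitioning, MySQL 5.1’s built-in partitioning can split a table into date ranges, which keeps active data physically separate and makes purging old data a cheap metadata operation. The schema below is hypothetical:

    # Hypothetical DDL, to be executed once against the server:
    PARTITIONED_DDL = """
    CREATE TABLE events (
        id      BIGINT NOT NULL,
        created DATE   NOT NULL
    )
    PARTITION BY RANGE (TO_DAYS(created)) (
        PARTITION p2007 VALUES LESS THAN (TO_DAYS('2008-01-01')),
        PARTITION p2008 VALUES LESS THAN (TO_DAYS('2009-01-01')),
        PARTITION pmax  VALUES LESS THAN MAXVALUE
    )
    """
    # Purging a year of stale data is then a quick operation:
    #   ALTER TABLE events DROP PARTITION p2007;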
Scaling by Clustering
Clustering
MySQL’s NDB Cluster storage engine is a distributed, in-memory, shared-nothing storage engine with synchronous replication and automatic data partitioning across the nodes.
Federation
In the database world, federation generally means accessing one server’s data from another server.
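MySQL’s FEDERATED storage engine is one example: a local table definition that transparently forwards reads and writes to a table on a remote server. The connection string values below are placeholders:

    # Hypothetical DDL; the CONNECTION string points at the remote table.
    FEDERATED_DDL = """
    CREATE TABLE remote_users (
        id   INT NOT NULL,
        name VARCHAR(64)
    ) ENGINE=FEDERATED
      CONNECTION='mysql://app_user:app_pass@remote-host:3306/appdb/users'
    """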
Load Balancing
Load balancing has five common goals:
- Scalability
- Efficiency
- Availability
- Transparency
- Consistency
Connecting Directly
Splitting reads and writes in replication
MySQL replication gives you multiple copies of your data and lets you choose whether to run a query on the master or a slave.
The primary difficulty is how to handle stale data on the slave.
Some of the most common methods of splitting reads and writes are as follows:
- Query-based split: the simplest split is to direct all writes, and any reads that can never tolerate stale data, to the active or master server, and send all other reads to the slaves
- Stale-data split: if the application can tolerate some staleness, check how far behind the slave is and read from it only when its data is fresh enough for the query at hand
- Session-based split: decide based on whether the user has changed any data. The user doesn’t have to see the most up-to-date data from other users, but should see his or her own changes. You can implement this at the session level by flagging that the session has made a change and directing the user’s read queries to the master for a certain period of time after that (see the sketch after this list)
- Version-based split: track a version number or timestamp per object, increment it on write, and read an object from a slave only after the slave has replicated that version
- Global version/session split: when the application performs a write, it notes the master’s binary log coordinates (SHOW MASTER STATUS) in the session, and reads from a slave only once the slave has replicated to those coordinates
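A minimal sketch of the session-based split described above, assuming a dict-like session store and prebuilt master and slave connection handles:

    import time

    STICKY_SECONDS = 5  # assumed window; tune it to your typical replication lag

    def mark_write(session):
        # Record that this user just changed data.
        session["last_write"] = time.time()

    def connection_for_read(session, master, slave):
        # Recent writers read from the master so they see their own changes;
        # everyone else can tolerate slightly stale data from a slave.
        if time.time() - session.get("last_write", 0) < STICKY_SECONDS:
            return master
        return slave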
Changing the application configuration
Changing DNS names
Moving IP addresses
Introducing a Middleman
Load balancers
Load-balancing algorithms
- Random
- Round-robin
- Least connections (see the sketch after this list)
- Fastest response
- Hashed
- Weighted
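As a sketch of the least-connections policy mentioned above, the balancer just tracks how many open connections each server holds and picks the smallest. The counts are assumed to be maintained on connect and disconnect:

    def pick_least_connections(pool):
        # pool maps server name -> current number of open connections.
        return min(pool, key=pool.get)

    pool = {"slave1": 12, "slave2": 7, "slave3": 9}
    print(pick_least_connections(pool))  # -> "slave2"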
Adding and removing servers in the pool
You should configure the servers in the connection pool so that there is enough unused capacity to let you take servers out for maintenance, or to handle the load when servers fail.
Load Balancing with a Master and Multiple Slaves
- Functional partitioning
- Filtering and data partitioning
- Moving parts of writes to a slave
- Guaranteeing a slave is caught up (see the MASTER_POS_WAIT() sketch after this list)
- Write synchronization
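For guaranteeing a slave is caught up, MySQL provides MASTER_POS_WAIT(), which blocks until the slave has replayed past given master binlog coordinates. A sketch, assuming connection handles whose query() helper returns one row as a dict:

    def wait_for_slave(master, slave, timeout=5):
        # Capture the master's current binlog coordinates...
        status = master.query("SHOW MASTER STATUS")
        # ...then block on the slave until it has replayed past them.
        result = slave.query(
            "SELECT MASTER_POS_WAIT(%s, %s, %s) AS waited",
            status["File"], status["Position"], timeout)
        # NULL means the slave isn't running; -1 means the wait timed out.
        return result["waited"] is not None and result["waited"] >= 0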
High Availability
Planning for High Availability
The most important principle of high availability is to find and eliminate single points of failure in your system. Any point in your system that isn’t redundant is a single point of failure.
Adding Redundancy
Shared-storage architectures
Shared storage is a way to remove some single points of failure, usually with a SAN.
Replicated-disk architectures
The disk replication most commonly used for MySQL is DRBD, in combination with the tools from the Linux-HA project.
DRBD is synchronous, block-level replication implemented as a Linux kernel module. It copies every block from a primary device over a network card to another server’s block device, and writes it there before committing the block on the primary device.
Our favorite way to use DRBD is to replicate only the device that holds the binary logs. If the active node fails, you can start a log server on the passive node and use the recovered binary logs to bring all of the failed master’s slaves up to the latest binary log position.
Synchronous MySQL replication
Failover and Failback
Failover is the process of removing a failed server and using another server instead. The most important part of failover is failback. If you can’t switch back and forth between servers at will, failover is a dead end and only postpones downtime.
Promoting a slave or switching roles
Virtual IP addresses or IP takeover
You can achieve high availability by assigning a logical IP address to a MySQL instance that you expect to perform certain services. If the MySQL instance fails, you can move the IP address to a different MySQL server.
Floating IP addresses and IP takeover work well for failover between machines that are local to each other -- that is, on the same subnet.
The MySQL Master-Master Replication Manager
Middleman solutions
You can use proxies, port forwarding, network address translation (NAT), and hardware load balancers for failover and failback.
Handling failover in the application