MariaDB Parallel Replication

MariaDB Parallel Replication

Official document:

Starting from version 10.0.5, MariaDB began to support parallel replication

MariaDB 10.0's slave server can execute query and copy operations in parallel. This article will explain how it is implemented and the tuning you can do.

Note: The MariaDB version on the master-slave server must be 10.0.5 and later versions 10.0.5 to enable parallel replication

Parallel replication overview - Parallel replication overview

MariaDB replication is completed in three steps:

1. Read the change event from the service IO thread to the main library, and put the read event sequence in the relay log

2. Read one event in the relay log at a time from the SQL thread of the server

3. The SQL thread sequentially applies the events in the relay log

In versions prior to MariaDB 10, the third step was executed through SQL threads, which meant that only one event could be executed at a time, and replication was essentially single-threaded.

In versions after MariaDB 10, the third step can be through a set of independent replication threads, because multiple events can be replicated in parallel at a time, which improves replication performance.

How to enable parallel slave - How to enable parallel replication

Specify slave-parallel-threads = n in my.cnf as a parameter to pass to MySQL.

Parallel replication can also be used for multi-source replication connection setting "@@connection_name.slave-parallel-mode" is set to none, that is, set @@connection_name.slave-parallel-mode = none

The value of slave_parallel_threads determines how many worker threads in the pool will be created to execute parallel replication events on the slave server

If the value is 0, the old replication method is used by default (that is, the slave SQL thread replays the event event)

Generally speaking, if the value is non-zero, it should be set to at least twice the number of connections to the multi-source primary server. It does not make much sense to use one worker thread for a connection, because it will not only increase the burden of internal thread communication (that is, the communication between SQL threads and worker threads), but also when a connection uses one worker thread, events cannot be replicated in parallel.

The value of slave_parallel_threads can be dynamically adjusted, that is, it can be changed without restarting MySQL, but when it is changed, all slave library connections must be stopped.

What can be run in parallel - How to run parallel replication

What are the ways to replicate in parallel?

In-order ordered and out-of-order unordered two ways

Orderly way

Transactions are executed in parallel, but the commit order is sorted to ensure that the transaction commit on the slave database is consistent with the transaction commit sequence on the main database. Only those transactions that are automatically confirmed as not causing conflicts will be executed in parallel, that is, parallel replication is completely transparent to the application.

Disorderly way

Disorder means that the order of executing and committing transactions on the slave database may be inconsistent with the master database. In other words, the application must be able to tolerate the inconsistency of the data update sequence on the master and slave servers, and this method requires the application to ensure that the transaction There is no conflict between. The unordered method is only used when the GTID mode and application explicitly specify the use of disorder, and the replication domain is part of gtid.

Conservative mode of in-order parallel - optimistic

Optimistic mode is supported from MariaDB 10.1.3 version.

This model provides a large number of parallel application slaves, while still retaining precise transaction semantics from the application point of view.

Enable the use of the configuration option --slave-parallel-mode=optimistic.

Any transaction DML (insert/update/delete) can run in parallel, up to the limit @@slave_domain_parallel_threads. This may cause a conflict in the slave. If two transactions try to modify the same row, any such conflict is detected, and the latter of the two transactions is rolled back, allowing the former to continue. Once the former has been completed, the latter's transaction is retried.

Optimistic mode is suitable for this approach, because the server will have some conflicting optimistic assumptions. This extra work is spent on the rollback and re-transaction conflicts to obtain reasonable parallelism from running most of the transactions.

There are some enlightenments to try to avoid unnecessary conflicts if a row lock wait is executed on the master. It will no longer run slaves in parallel,

Transactions can also be clearly identified as potential conflicts on the master by setting the variable @@skip_parallel_replication. This heuristic may be added to the MariaDB version in the future (not yet supported)

This is the next option--slave-parallel-mode called aggressive "aggressive" mode. When these heuristics are disabled, more transactions are allowed to be applied in parallel.

Non-transactional DML and DDL are not safe for optimistic applications in parallel, because it cannot be rolled back in case of conflict. Therefore, in an optimistic mode, non-transactional (such as MyISAM) updates are not suitable for parallel or early events. (However, it may be applicable to paralleling a MyISAM update after it is updated), DDL statements are not applicable to any other transactions, sooner or later.

Different transaction types can be determined in mysqlbinlog output, for example:

#150324 13:06:26 server id 1  end_log_pos 6881 GTID 0-1-42 ddl


#150324 13:06:26 server id 1  end_log_pos 7816 GTID 0-1-47


#150324 13:06:26 server id 1  end_log_pos 8177  GTID 0-1-49 trans

/*!100101 SET @@session.skip_parallel_replication=1*//*!*/;


#150324 13:06:26 server id 1  end_log_pos 9836 GTID 0-1-59 trans waited

gtid 0-1-42 indicates DDL. gtid 0-1-47 are marked as non-transactional DML. And gtid 0-1-49 is transactional DML ("trans" keyword). In addition, gtid 0-1-49 runs @@ skip_parallel_replication on the master, and gtid 0-1-59 is transactional, on the master There is a row waiting for DML locks. ("waited" keyword).

Conservative mode of in-order parallel - conservative mode of parallel replication (conservative)

The default is conservative mode. The only mode available in version 10.0 can be turned on with --slave-parallel-mode=conservative.

In the conservative mode, parallel replication uses "group submission" on the master to discover potential slaves for parallel application events.

If two things are submitted in a group on the mater, they are all written into the same commit id, and the binlogs of such events will not conflict with each other, and they can be scheduled to be copied to run in different threads.

Two transactions submitted separately on the master may conflict (for example, modify a row of the same table). Therefore, the worker applying the second transaction will not start immediately.

But after the first transaction has started the commit step, at this point, it is safe to start the second transaction because it can no longer disrupt the execution of the first one.

This is an example of mysqlbinlog output, showing that gtid events are marked with commit id, gtid 0-1-47 does not have a commit id, and cannot be run in parallel.

gtid 0-1-48 and 0-1-49 have the same commit id 630, which can be replicated in parallel on another slave.

The binlog information is as follows:

#150324 12:54:24 server id 1  end_log_pos 20052 GTID 0-1-47 trans


#150324 12:54:24 server id 1  end_log_pos 20212 GTID 0-1-48 cid=630 trans


#150324 12:54:24 server id 1  end_log_pos 20372 GTID 0-1-49 cid=630 trans

In either case, when two transactions reach the low-level commit point and the commit order is determined, the two commits occur on the mater in the same order, so that the operation is transparent to the application.

If more transactions are committed in a group commit, the chance of parallel replication for slaves can be highly increased.

This can be adjusted using the binlog_commit_wait_count and binlog_commit_wait_usec variables. For example, if the application can tolerate an additional delay of 50 milliseconds in the master, you can set binlog_commit_wait_usec = 50000 and binlog_commit_wait_count = 20. In an available time, parallel replication can obtain up to 20 transaction processing.

But be careful not to set binlog_commit_wait_usec too high, because this may cause the application to slow down and run a large number of serial small transactions one after another.

Please note: Even if there is no parallelism of group submission on the master, there is still an opportunity to increase the parallel speed.

Because the actual commit steps of different transactions can be run. Especially effective on slave, binlog is enabled (log_slave_updates=1)

Even if the slave is configured for crash repair security (sync_binlog=1 and innodb_flush_log_at_trx_commit=1), this group submission may be on the slave

In --slave-parallel-mode=minimal mode, only the transaction commit step is applied in parallel. All other transactional replication occurs continuously.

Out-of-order parallel - Out-of-order parallel replication

Unordered parallel replication is only used in GTID mode. When different replication domain GTIDs are used, the replication domain is the variable gtid_domain_id used in the DBA/application.

In parallel replication, there are two transaction gtids. Different domain_ids are scheduled on different threads, and execution is allowed to be completely independent of this. It is the responsibility of the application to set different domain_ids only for those truly independent transactions.

It is guaranteed that there will be no conflicts with each other, and the application must work normally, even if there are different domain_id in the transaction, it is regarded as a different order, between slave and master, or in different slaves.

Because the application can explicitly provide more opportunities to run transactions in parallel. Rather than the server can automatically determine, due to the order of parallel replication of transactions, parallel replication may automatically determine itself

A simple but effective usage is to run long-running statements in a separate replication domain, such as "change tables". This allows replication of other transactions to continue uninterrupted:

SET SESSION gtid_domain_id=1


SET SESSION gtid_domain_id=0

Usually, a long-running ALTER TABLE statement or other query will stop all subsequent transactions. It will cause the slave to lag behind the master at least a long time to execute this long-running query.

Through out-of-order parallel replication, set the replication domain id. It can be avoided. When running ALTER TABLE, the DBA/application must ensure that no conflicting transactions will be replicated.

Another common condition for out-of-order parallel replication is related to multi-source replication. Suppose we have two different masters M1 and M2, we use multi-source replication S1 as a slave of M1 and M2, S1 will receive events from M2 and M1 will receive events.

If we now have one-third of the slaves and copy the master from S1 to S2, we hope that S2 can also apply the event originated from M1 and M2 parallel event originated from. This can be achieved from sequential parallel replication by setting the gtid_domain_id in M1 and M2 to be different.

Please note that there are no special restrictions, you can use unordered parallel replication, such operations can be in the same database/schema, or even in the same table, the only restriction is that the operations must not conflict, which is that they must be able to Apply in any order, and finally get the same result.

When unordered parallel replication is used,

The current slave position is the master position

In the master binlog at any one time, it becomes multi-dimensional-each replication domain can reach a different point,

The current position can be seen in the gtid_slave_pos variable. When the slave is stopped, restarted, or switched replication to use CHANGE MASTER from a different master, MariaDB automatically handles restarting each replication domain at the appropriate point in the binlog.

When --slave-parallel-mode=minimal (or none) mode, unordered parallel replication is turned off.

Checking worker thread status in SHOW PROCESSLIST - Check the thread status of the display list

In the show processlist, the worker threads will list "system users" and their status will show the queries they are currently making, or it can show one of them:

"Waiting for work from main SQL threads". "Waiting for work from main SQL threads".

This means that the worker thread is idle and there is no work available for it at the moment.

"Waiting for prior transaction to start commit before starting next transaction". "Waiting for prior transaction to start commit before starting next transaction".

This means that the first batch of transactions submitted by the master must be completed first. This worker thread can start working in the next batch before waiting for it to happen.

This worker thread is waiting for that to happen before it can start working on the following batch.

"Waiting for prior transaction to commit". "Waiting for prior transaction to commit".

This means that the transaction has been executed by the worker thread. In order to ensure that the order is submitted, the worker thread waits for submission, knowing that the last transaction is ready to be submitted before it.

Expected performance gain - An article about performance expectations testing

Here is an article about the improved performance up to 10 times when using parallel replication

Article address:

Slave-parallel-max-queued configuration parameters:

The variable slave_parallel_max_queued is the only meaningful, provided that the parallel replication technology is used (when the slave_parallel_threads parameter is greater than 0), when parallel replication is used, the SQL thread will pre-read the event in the relay-log, and the queue time will be stored in the memory in parallel While copying, look for opportunities to execute events in parallel. The @@slave_parallel_max_queued variable is set to how many memory-limited logs the SQL thread will read ahead, looking for such opportunities. Each thread is limited, so the value of pre-reading this time is worth setting the @@slave_parallel_threads variable

If this value is set too high and the slave is too far behind the master (such as gigabytes of binlog), then the SQL thread can read quickly, and binlog events that fill a large amount of memory will consume faster than worker threads

On the other hand, if the value is set too low, the SQL thread may not have enough space to keep enough event queue busy threads, which may reduce performance.

Note that @@slave_parallel_max_queued is not a hard limit, because the current execution of binlog events always needs to be stored in memory.

For example, at least two events per worker thread can always be queued in memory, no matter how many slave_parallel_threads are

Generally, slave_parallel_threads should be set large enough. The SQL thread can use all possible parallelism to read binlog far enough. In normal operation, the slave will not want to fall too far behind, so it will not put a large amount of data in the queue as necessary. In memory. So slave_parallel_threads may be set quite high (such as several hundred KB) without limiting the throughput. It should be set low enough that slave_parallel_threads * slave_parallel_max_queued will not cause insufficient server memory.

slave_domain_parallel_threads configuration parameters:

The pool of replication worker threads is shared among all multi-source primary connections and among all replication domains that can be used in parallel for sequential replication.

If a main connection or a replication domain is currently executing a long-running query, it may allocate all the worker threads in the pool and can only wait for the completion of the long-running query.

Delay any master main library connection or replication domain, which will have to wait for the replication domain worker thread to become idle.

The value set by the slave_domain_parallel_threads variable can be used to avoid being lower than the value set by slave_parallel_threads.

When the setting is not equal to 0, each replication domain in a primary connection can retain multiple threads for at most one time period.

The rest reach the value of the slave_parallel_threads parameter, and there is no other primary connection or replication domain to use in parallel replication.

The slave_domain_parallel_threads variable can be modified dynamically, and can be changed without restarting the service. All slaves must stop making changes.

Reprinted at:

Reference :