Bobcares

AWS RDS Read Replica lag

by | Oct 30, 2021

AWS RDS Read Replica lag occurs because it uses asynchronous replication and cannot keep up with the primary DB instance

Here, at Bobcares, we assist our customers with several AWS queries as part of our AWS Support Services.

Today let us find the cause of replica lag when using Amazon RDS for MySQL.

 

AWS RDS Read Replica lag

With Amazon RDS for MySQL read replica with binary log file position-based replication, we can monitor replication lag.

Generally, MySQL replication works with three threads: the Binlog Dump thread, the IO_THREAD, and the SQL_THREAD.

If there is a delay in replication, we can check if the replica IO_THREAD or the replica SQL_THREAD causes the lag.

Then, we can identify the root cause of the lag.

To identify the replication thread that lags, our Support Techs shows the following examples.

1. Initially, on the primary DB instance, we run the following command and review the output:

mysql> SHOW MASTER STATUS;
+----------------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+----------------------------+----------+--------------+------------------+-------------------+
| mysql-bin-changelog.066552| 521 | | | |
+----------------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

2. Then we run the SHOW SLAVE STATUS command on the replica DB instance and review the output:

Example 1:

mysql> SHOW SLAVE STATUS\G;
*************************** 1. row ***************************
Master_Log_File: mysql-bin.066548
Read_Master_Log_Pos: 10050480
Relay_Master_Log_File: mysql-bin.066548
Exec_Master_Log_Pos: 10050300
Slave_IO_Running: Yes
Slave_SQL_Running: Yes

Example 2:

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Master_Log_File: mysql-bin.066552
Read_Master_Log_Pos: 430
Relay_Master_Log_File: mysql-bin.066530
Exec_Master_Log_Pos: 50360
Slave_IO_Running: Yes
Slave_SQL_Running: Yes

Normally, IO_THREAD doesn’t cause large replication delays.

However, if the replica SQL_THREAD is the source of replication delays, the cause could be the following:

  • Long-running queries on the primary DB instance

Long-running queries that take an equal amount of time to run on the replica DB instance can increase seconds_behind_master.

For instance, say we initiate a change. If it takes an hour to run then the lag is one hour. Since the change takes the same time to complete on the replica, the total lag is approximately two hours.

This is a common delay. However, if we monitor the slow query log on the primary instance we can minimize this lag.

In addition, we can identify and break long-running statements to reduce lag.

  • Insufficient DB instance class size or storage

If the storage configuration or the replica DB instance class is lower than the primary, then the replica might struggle with insufficient resources.

So, we need to ensure that the DB instance type of the replica is the same or higher than the primary DB instance.

  • Parallel queries run on the primary DB instance

When a high volume of writes occurs to the source DB instance in parallel, the writes to the read replica are serialized using a single SQL_THREAD.

This can create a lag between them both.

  • Binary logs synced to the disk on the replica DB instance

Enabling automatic backups results in overhead to sync the binary logs to the disk on the replica.

By default, the value of the parameter sync_binlog is set to 1. Changing this value to 0 disables the synchronization of the binary log to the disk by the MySQL server.

  • Binlog_format is set to ROW

There can be instances when the binlog_format is set to ROW. The source table may also miss a primary key.

To resolve this in the short term, we can change the search algorithm to INDEX_SCAN,HASH_SCAN.

For the long-term resolution, our Support Techs recommend adding an explicit primary key to each table.

  • Replica creation lag

Amazon RDS creates a read replica of a MySQL primary instance by taking a DB snapshot. Then restores it to create a new DB instance (replica) and establishes replication between the two.

Most often, when a replication establishes, there is a lag to create a backup of the primary.

To minimize this lag, before we call for the replica action we can create a manual backup. Thus, the snapshot taken by the replica creation process is an incremental backup, which is faster.

We can also enable the InnoDB cache warming feature. This will provide performance gains by saving the current state of the buffer pool.

[Found it useful? Here is a lot more to explore]

 

Conclusion

In short, we saw how our Support Techs go about AWS RDS Read Replica lag.

PREVENT YOUR SERVER FROM CRASHING!

Never again lose customers to poor server speed! Let us help you.

Our server experts will monitor & maintain your server 24/7 so that it remains lightning fast and secure.

GET STARTED

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *

Never again lose customers to poor
server speed! Let us help you.

Privacy Preference Center

Necessary

Necessary cookies help make a website usable by enabling basic functions like page navigation and access to secure areas of the website. The website cannot function properly without these cookies.

PHPSESSID - Preserves user session state across page requests.

gdpr[consent_types] - Used to store user consents.

gdpr[allowed_cookies] - Used to store user allowed cookies.

PHPSESSID, gdpr[consent_types], gdpr[allowed_cookies]
PHPSESSID
WHMCSpKDlPzh2chML

Statistics

Statistic cookies help website owners to understand how visitors interact with websites by collecting and reporting information anonymously.

_ga - Preserves user session state across page requests.

_gat - Used by Google Analytics to throttle request rate

_gid - Registers a unique ID that is used to generate statistical data on how you use the website.

smartlookCookie - Used to collect user device and location information of the site visitors to improve the websites User Experience.

_ga, _gat, _gid
_ga, _gat, _gid
smartlookCookie
_clck, _clsk, CLID, ANONCHK, MR, MUID, SM

Marketing

Marketing cookies are used to track visitors across websites. The intention is to display ads that are relevant and engaging for the individual user and thereby more valuable for publishers and third party advertisers.

IDE - Used by Google DoubleClick to register and report the website user's actions after viewing or clicking one of the advertiser's ads with the purpose of measuring the efficacy of an ad and to present targeted ads to the user.

test_cookie - Used to check if the user's browser supports cookies.

1P_JAR - Google cookie. These cookies are used to collect website statistics and track conversion rates.

NID - Registers a unique ID that identifies a returning user's device. The ID is used for serving ads that are most relevant to the user.

DV - Google ad personalisation

_reb2bgeo - The visitor's geographical location

_reb2bloaded - Whether or not the script loaded for the visitor

_reb2bref - The referring URL for the visit

_reb2bsessionID - The visitor's RB2B session ID

_reb2buid - The visitor's RB2B user ID

IDE, test_cookie, 1P_JAR, NID, DV, NID
IDE, test_cookie
1P_JAR, NID, DV
NID
hblid
_reb2bgeo, _reb2bloaded, _reb2bref, _reb2bsessionID, _reb2buid

Security

These are essential site cookies, used by the google reCAPTCHA. These cookies use an unique identifier to verify if a visitor is human or a bot.

SID, APISID, HSID, NID, PREF
SID, APISID, HSID, NID, PREF