Bobcares

AWS DMS ERROR: Cancelling statement due to statement timeout

by | Aug 3, 2021

AWS DMS ERROR: Cancelling statement due to statement timeout occurs when we migrate data to or from the on-premises PostgreSQL database using AWS DMS.

Here, at Bobcares, we assist our customers with several AWS queries as part of our AWS Support Services.

Today, let us see how to troubleshoot and resolve these errors.

 

AWS DMS ERROR: Cancelling statement due to statement timeout

AWS DMS uses the default timeout setting of 60 seconds while executing commands to either get data from the source or put data in the target.

Suppose, the source or target is heavily loaded or there are locks in the tables. Then it can’t finish executing within 60 seconds.

As a result, the task fails with an error and we see the following entry in the log:

“]E: wal_slot_create(…) – Unable to create slot ‘xxxxxxxxxxxxxxxx_00016391_c4a70947_84c9_4a55_8d54_ff63f2f69a52’ (on execute(…) phase) [1020101] (postgres_endpoint_wal_utils.c:3215)”

 

How to Troubleshoot and Resolve?

To troubleshoot and resolve this, our Support Techs recommend the steps below.

  • Identify the cause of long execution times for commands

Initially, we review the AWS DMS task log and the table statistics section of the task to find the command that failed.

In addition, we can find it in the PostgreSQL error log file if the parameter log_min_error_statement is set to ERROR or a lower severity.

Once we identify the command, we can find the failed table names.

For instance, the below is an example error message from the PostgreSQL error log:

ERROR: canceling statement due to statement timeout
STATEMENT: <The statement executed>"

Then, to find locks on the associated tables, we run the following command in the source or target:

SELECT blocked_locks.pid AS blocked_pid,
blocked_activity.usename AS blocked_user,
blocking_locks.pid AS blocking_pid,
blocking_activity.usename AS blocking_user, 
blocked_activity.query AS blocked_statement,
blocking_activity.query AS current_statement_in_blocking_process
FROM pg_catalog.pg_locks blocked_locks 
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks 
ON blocking_locks.locktype = blocked_locks.locktype 
AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE
AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
AND blocking_locks.pid != blocked_locks.pid 
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.GRANTED;

Suppose we found PIDs that are blocked. Then we need to stop or “kill” it using:

SELECT pg_terminate_backend(blocking_pid);

Since the dead rows or “tuples” can increase SELECT time, we check for large numbers of dead rows in the source tables:

select * from pg_stat_user_tables where relname= 'table_name';

Later we check to see whether or not the failed target table has primary keys or unique indexes.

  • Increase the timeout value

As we know, the default value for executeTimeout is 60 seconds. Let us now see how our Support Techs increase this value.

  1. Initially, we open the AWS DMS console.
  2. Then from the navigation pane, we select Endpoints > PostgreSQL endpoint > Actions > Modify.
  3. We expand the Endpoint-specific settings section.
  4. In the field for Extra connection attributes, we enter:
    executeTimeout=3600;
  5. Then we select, Save.
  6. From the Endpoints pane, we select the name of the PostgreSQL endpoint.
  7. From the Connections section, the Status of the endpoint will change from Testing to Successful.
  • Troubleshoot slot creation issues

Suppose the timeout occurs when we create the replication slot in the PostgreSQL database. Then we will see log entries similar to the following:

“]E: wal_slot_create(…) – Unable to create slot ‘xxxxxxxxxxxxxxxx_00016391_c4a70947_84c9_4a55_8d54_ff63f2f69a52’ (on execute(…) phase) [1020101] (postgres_endpoint_wal_utils.c:3215)”

To resolve this issue,  we use version 3.1.4 for which the default timeout for this command is 600 seconds.

We can also increase this timeout. To do so, we need to configure the TransactionConsistencyTimeout parameter in the Task settings section.

If there are any active locks in the database user tables, PostgreSQL can’t create the replication slot.

To check for locks, we run:

select * from pg_locks;

Then, we test whether the error has been resolved.

To do so, we run the following command to manually create the replication slot in the source PostgreSQL database:

select xlog_position FROM pg_create_logical_replication_slot('<Slot name as per
the task log>', 'test_decoding');

If the command is successful, we delete the test slot:

select pg_drop_replication_slot(‘<slot name>');

Finally, we restart the migration task.

If the command still fails, then we need to work with a PostgreSQL DBA to identify and configure the database.

[Need help with the AWS DMS ERROR? We’d be happy to assist]

 

Conclusion

In short, we saw how our Support Techs fix this error for our customers.

PREVENT YOUR SERVER FROM CRASHING!

Never again lose customers to poor server speed! Let us help you.

Our server experts will monitor & maintain your server 24/7 so that it remains lightning fast and secure.

GET STARTED

var google_conversion_label = "owonCMyG5nEQ0aD71QM";

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *

Never again lose customers to poor
server speed! Let us help you.