Wondering how to perform a classic resize on an Amazon Redshift cluster? We can help you.
At Bobcares, we assist our customers with several AWS queries as part of our AWS Support Services for AWS users and online service providers.
Today, let us see how our Support Techs assist with this Redshift query.
Classic resize on an Amazon Redshift cluster
Today, let us discuss the three ways our Support Techs resize an Amazon Redshift cluster.
Elastic resize
If it’s available as an option, use elastic resize to change the node type, number of nodes, or both.
Please note that when you change only the number of nodes, queries are temporarily paused and connections are kept open.
An elastic resize typically takes 10 to 15 minutes.
During a resize operation, the cluster is read-only.
Elastic resize is the fastest method to resize an Amazon Redshift cluster.
Classic resize
Use classic resize to change the node type, number of nodes, or both.
Choose this option when you are resizing to a configuration that isn’t available through elastic resize.
With the classic resize operation, your data is copied in parallel from the compute node or nodes in your source cluster to the compute node or nodes in the target cluster.
The time that it takes to resize depends on the amount of data and the number of nodes in the smaller cluster.
The duration of a classic resize varies based on several factors, including:
- The workload on the source cluster.
- The number and size of the tables being transferred.
- How evenly data is distributed across the compute nodes and slices.
- The node configuration in the source and target clusters.
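As a minimal sketch of choosing between the two resize types, the helper below builds the arguments for the boto3 Redshift call `resize_cluster`. It assumes that the options returned by `describe_node_configuration_options` for the `resize-cluster` action type are the configurations that elastic resize can reach, and it falls back to a classic resize otherwise. The function name and the multi-node assumption are ours, not part of any AWS API.

```python
def choose_resize_request(redshift, cluster_id, node_type, number_of_nodes):
    """Build resize_cluster arguments, preferring elastic resize when offered.

    `redshift` is expected to behave like boto3.client("redshift").
    Assumption: the target is a multi-node cluster and the option list
    fits in one response page (no Marker handling).
    """
    # Ask Redshift which target configurations it offers for a resize.
    response = redshift.describe_node_configuration_options(
        ActionType="resize-cluster",
        ClusterIdentifier=cluster_id,
    )
    elastic_targets = {
        (option["NodeType"], option["NumberOfNodes"])
        for option in response.get("NodeConfigurationOptionList", [])
    }

    # Fall back to classic resize when the target is not in the offered list.
    use_classic = (node_type, number_of_nodes) not in elastic_targets
    return {
        "ClusterIdentifier": cluster_id,
        "NodeType": node_type,
        "NumberOfNodes": number_of_nodes,
        "Classic": use_classic,
    }
```

With boto3 installed and credentials configured, the result could be passed on as `client.resize_cluster(**request)`, where `client = boto3.client("redshift")`.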
When you start the resize operation, Amazon Redshift puts the existing cluster into read-only mode until the resize finishes.
During this time, you can only run queries that read from the database.
If you have enabled audit logging in your source cluster, you can continue to access the logs in Amazon S3.
You can keep or delete these logs as your data policies specify.
After Amazon Redshift puts the source cluster into read-only mode, it provisions a new cluster, the target cluster.
It does so using the information that you specify for the node type, cluster type, and number of nodes.
Then, Amazon Redshift copies the data from the source cluster to the target cluster.
When this is complete, all connections switch to use the target cluster.
You can view the resize progress on the Amazon Redshift console.
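Progress can also be read programmatically. The sketch below summarizes the response of the boto3 Redshift call `describe_resize`; the function name and the shape of the returned summary are our own choices for illustration.

```python
def resize_progress(redshift, cluster_id):
    """Summarize the current resize of a cluster.

    `redshift` is expected to behave like boto3.client("redshift").
    """
    status = redshift.describe_resize(ClusterIdentifier=cluster_id)

    # Both fields can be absent early in a resize, so default to 0.
    total_mb = status.get("TotalResizeDataInMegaBytes") or 0
    done_mb = status.get("ProgressInMegaBytes") or 0

    # Percentage is only meaningful once the total size is known.
    percent = round(100.0 * done_mb / total_mb, 1) if total_mb else None

    return {
        "status": status.get("Status"),
        "percent_complete": percent,
        "seconds_remaining": status.get("EstimatedTimeToCompletionInSeconds"),
    }
```

Calling this in a loop with a pause between calls gives a simple way to wait until the status is no longer `IN_PROGRESS`.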
Amazon Redshift doesn’t sort tables during a resize operation, so the existing sort order is maintained.
When you resize a cluster, Amazon Redshift distributes the database tables to the new nodes based on their distribution styles and runs an ANALYZE command to update statistics.
Rows that are marked for deletion aren’t transferred, so you need to run a VACUUM command only if your tables need to be resorted.
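One way to decide which tables need resorting is to look at the `unsorted` percentage that Redshift reports in the SVV_TABLE_INFO system view. The sketch below turns such rows into `VACUUM SORT ONLY` statements. The 10 percent threshold and the function names are assumptions for illustration, not recommendations from AWS.

```python
# Query to fetch candidate rows; run it with the SQL client of your choice.
UNSORTED_TABLES_SQL = 'SELECT "schema", "table", unsorted FROM svv_table_info;'


def quote_identifier(name):
    """Double-quote an identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def vacuum_statements(rows, threshold_pct=10.0):
    """Return VACUUM SORT ONLY statements for tables above the threshold.

    `rows` is an iterable of (schema, table, unsorted_pct) tuples.
    Rows whose unsorted value is None (for example, no sort key) are skipped.
    """
    statements = []
    for schema, table, unsorted_pct in rows:
        if unsorted_pct is not None and unsorted_pct > threshold_pct:
            target = f"{quote_identifier(schema)}.{quote_identifier(table)}"
            statements.append(f"VACUUM SORT ONLY {target};")
    return statements
```

Because deleted rows are not carried over by the resize, only the sort phase is requested here rather than a full vacuum.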
You can cancel a classic resize operation before it completes by choosing Cancel resize from the cluster details in the Amazon Redshift console.
If the resize operation is in the final stage, you can’t cancel the operation.
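The cancellation can also be requested through the API. This sketch uses the boto3 Redshift calls `describe_resize` and `cancel_resize`, and only sends the request when a resize is reported as in progress. The function name is hypothetical, and errors from the service (for example, when the resize is already in its final stage) are left to propagate to the caller.

```python
def cancel_resize_if_running(redshift, cluster_id):
    """Request cancellation only when a resize is reported as in progress.

    `redshift` is expected to behave like boto3.client("redshift").
    Returns True if a cancel request was sent, False otherwise.
    """
    status = redshift.describe_resize(ClusterIdentifier=cluster_id).get("Status")
    if status != "IN_PROGRESS":
        return False

    # The service may still reject this call; this sketch does not catch that.
    redshift.cancel_resize(ClusterIdentifier=cluster_id)
    return True
```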
Snapshot, restore, and resize
The time it takes to resize a cluster with the classic resize operation depends heavily on the amount of data in the cluster, and the cluster is read-only for that entire time.
The snapshot, restore, and resize approach minimizes the amount of time that you can’t write to the database.
However, it requires that any data written to the source cluster after the snapshot is taken be copied manually to the target cluster after the switch.
Depending on how long the copy takes, you might need to repeat this several times until you have the same data in both clusters.
Then you can make the switch to the target cluster.
The snapshot, restore, and classic resize approach uses the following process:
1. Take a snapshot of your existing cluster. The existing cluster is the source cluster.
2. Note the time that the snapshot was taken.
Doing this means that you can later identify the point when you need to rerun extract, transform, load (ETL) processes to load any post-snapshot data into the target database.
3. Restore the snapshot into a new cluster. This new cluster is the target cluster.
4. Resize the target cluster. Choose the new node type, number of nodes, and other settings for the target cluster.
5. Review the loads from your ETL processes that occurred after you took a snapshot of the source cluster.
6. Stop all queries running on the source cluster.
To do this, you can reboot the cluster, or you can log on as a superuser and use the PG_CANCEL_BACKEND and PG_TERMINATE_BACKEND commands.
7. Rename the source cluster. For example, rename it from examplecluster to examplecluster-source.
8. Rename the target cluster to use the name of the source cluster before the rename.
9. Delete the source cluster after you switch to the target cluster and verify that all processes work as expected.
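The API side of these steps can be sketched as an ordered plan of boto3 Redshift calls. The helper below only builds the plan and does not contact AWS. The `-target`, `-source`, and `-final` suffixes, the snapshot naming scheme, and both function names are assumptions for illustration. Steps 5 and 6 (reviewing ETL loads and stopping queries) happen outside this plan, and each call needs the previous one to finish before it runs.

```python
from datetime import datetime, timezone


def plan_snapshot_restore_resize(source_id, node_type, number_of_nodes, now=None):
    """Return (method_name, kwargs) pairs for steps 1-4 and 7-9, in order."""
    now = now or datetime.now(timezone.utc)
    # The timestamp in the snapshot name doubles as the note from step 2.
    snapshot_id = f"{source_id}-resize-{now.strftime('%Y%m%d%H%M%S')}"
    target_id = f"{source_id}-target"
    renamed_source_id = f"{source_id}-source"
    return [
        # Step 1: snapshot the source cluster.
        ("create_cluster_snapshot",
         {"SnapshotIdentifier": snapshot_id, "ClusterIdentifier": source_id}),
        # Step 3: restore the snapshot into the target cluster.
        ("restore_from_cluster_snapshot",
         {"ClusterIdentifier": target_id, "SnapshotIdentifier": snapshot_id}),
        # Step 4: classic resize of the target cluster.
        ("resize_cluster",
         {"ClusterIdentifier": target_id, "NodeType": node_type,
          "NumberOfNodes": number_of_nodes, "Classic": True}),
        # Steps 7 and 8: swap the cluster names.
        ("modify_cluster",
         {"ClusterIdentifier": source_id, "NewClusterIdentifier": renamed_source_id}),
        ("modify_cluster",
         {"ClusterIdentifier": target_id, "NewClusterIdentifier": source_id}),
        # Step 9: delete the old cluster, keeping a final snapshot. Run only after verification.
        ("delete_cluster",
         {"ClusterIdentifier": renamed_source_id, "SkipFinalClusterSnapshot": False,
          "FinalClusterSnapshotIdentifier": f"{renamed_source_id}-final"}),
    ]


def terminate_statements(pids):
    """Step 6: build PG_TERMINATE_BACKEND calls for the given process IDs."""
    return [f"SELECT PG_TERMINATE_BACKEND({int(pid)});" for pid in pids]
```

Each entry could then be executed as `getattr(client, method)(**kwargs)` against `boto3.client("redshift")`, with a wait for the cluster or snapshot to become available between steps. The process IDs for `terminate_statements` would come from a system table such as STV_SESSIONS.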
Conclusion
In short, today we saw how our Support Techs assisted with this Redshift query.