Wondering how to COPY or UNLOAD data from Amazon Redshift to Amazon S3 bucket? We can help you!
Here, at Bobcares, we often handle similar requests from our AWS customers as a part of our AWS Support Services.
Today, let’s see how our Support Engineers help our customers to COPY or UNLOAD data from Amazon Redshift to Amazon S3 bucket.
COPY or UNLOAD data from Amazon Redshift to Amazon S3 bucket
We can access Amazon S3 data that resides in a different AWS account from the one that holds our Amazon Redshift cluster.
Our Support Engineers follow the steps below to perform this task:
1. First, we create an IAM role in the Amazon S3 account (RoleX).
2. Then, we create an IAM role in the Amazon Redshift account (RoleY).
3. Finally, we test the cross-account access between RoleX and RoleY.
Creating an IAM role in the S3 account (RoleX)
Let’s see the steps followed by our support techs to create an IAM role in the AWS S3 account.
1. Log in to the AWS Management Console and open the IAM console.
2. Select Policies and then click Create Policy.
3. Click the JSON tab.
4. Then add the following policy document:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt",
        "kms:Encrypt",
        "kms:GenerateDataKey"
      ],
      "Resource": [
        "<KMS_KEY_ARN_A_Used_for_S3_encryption>"
      ]
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:Get*",
        "s3:List*"
      ],
      "Resource": [
        "arn:aws:s3:::<bucketname>",
        "arn:aws:s3:::<bucketname>/*"
      ]
    }
  ]
}
Here, replace <bucketname> with the actual bucket name, and <KMS_KEY_ARN_A_Used_for_S3_encryption> with the ARN of the AWS KMS key used to encrypt the bucket.
5. Then click Review policy.
6. Enter the policy name (e.g. RoleX_access)
7. Then, click Create policy to create the policy.
8. From the dashboard, select IAM and then select Roles.
9. Select Create role.
10. Select Another AWS account as the trusted entity type.
11. Then, enter the AWS account ID of the account that’s using Amazon Redshift.
12. Select Next: Permissions, and then select the policy that we created.
13. Select Next: Tags, and then select Next: Review.
14. Enter a role name (RoleX).
15. Click Create role.
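For readers who prefer scripting these steps, the two JSON documents behind RoleX — the permissions policy from step 4 and the trust policy that the console's Another AWS account option generates — can be built programmatically. A minimal Python sketch; the bucket name, KMS key ARN, and account ID below are placeholders, not real resources:

```python
import json

# Placeholder values -- replace with your own (these are NOT real resources)
BUCKET_NAME = "my-example-bucket"
KMS_KEY_ARN = "arn:aws:kms:us-east-1:111111111111:key/example-key-id"
REDSHIFT_ACCOUNT_ID = "222222222222"  # account that runs the Redshift cluster

# Permissions policy attached to RoleX (step 4 above)
rolex_permissions = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": ["kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey"],
            "Resource": [KMS_KEY_ARN],
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:Get*", "s3:List*"],
            "Resource": [
                f"arn:aws:s3:::{BUCKET_NAME}",
                f"arn:aws:s3:::{BUCKET_NAME}/*",
            ],
        },
    ],
}

# Trust policy that the "Another AWS account" option creates (steps 10-11):
# it lets principals in the Redshift account assume RoleX
rolex_trust = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{REDSHIFT_ACCOUNT_ID}:root"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(rolex_permissions, indent=2))
print(json.dumps(rolex_trust, indent=2))
```

The printed JSON can then be supplied to the `aws iam create-policy` and `aws iam create-role --assume-role-policy-document` CLI commands instead of clicking through the console.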
Creating an IAM role in the Amazon Redshift account (RoleY)
Now, let’s see the steps to create an IAM role in the Amazon Redshift account with permissions to assume RoleX.
1. Log in to the AWS Management Console and open the IAM console.
2. Select Policies and then click Create Policy.
3. Click the JSON tab.
4. Then add the following policy document:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CrossAccountPolicy",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "<S3AccountRoleARN>"
    }
  ]
}
Here, replace <S3AccountRoleARN> with the ARN of RoleX (the role we created in the S3 account).
5. Then click Review policy.
6. Enter the policy name (e.g. RoleY_access)
7. Then, click Create policy to create the policy.
8. From the dashboard, select IAM and then select Roles.
9. Select Create role.
10. Select AWS service as the trusted entity type.
11. Select Redshift and then select Redshift – Customizable.
12. Select Next: Permissions, and then select the policy that we created.
13. Select Next: Tags, and then select Next: Review.
14. Enter a role name (RoleY).
15. Click Create role.
16. Then we should associate the IAM role (RoleY) with the Amazon Redshift cluster.
To associate the IAM role (RoleY) with the Amazon Redshift cluster:
1. Log in to the AWS Management Console and open the Amazon Redshift console.
2. Select CLUSTERS, then select the name of the cluster.
3. Select Actions and then select Manage IAM roles.
4. Then we can either select Enter ARN and enter the ARN of the IAM role, or select an IAM role from the list.
5. Select Add IAM role to add it to the list of Attached IAM roles.
6. Click Done to associate the IAM role with the cluster.
This chaining of IAM roles (RoleY assuming RoleX) is what gives Amazon Redshift access to the Amazon S3 bucket in the other account.
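RoleY's two documents can be sketched the same way — the assume-role permissions policy from step 4 and the trust policy that selecting Redshift – Customizable sets up. The RoleX ARN below is a placeholder, not a real resource:

```python
import json

# Placeholder ARN for RoleX in the S3 account -- NOT a real resource
ROLEX_ARN = "arn:aws:iam::111111111111:role/RoleX"

# Permissions policy attached to RoleY: allow it to assume RoleX (step 4 above)
roley_permissions = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CrossAccountPolicy",
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": ROLEX_ARN,
        }
    ],
}

# Trust policy that "Redshift - Customizable" sets up (steps 10-11):
# it lets the Redshift service assume RoleY on the cluster's behalf
roley_trust = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "redshift.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(roley_permissions, indent=2))
print(json.dumps(roley_trust, indent=2))
```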
Test the cross-account access between RoleX and RoleY
Now let’s see how to test the cross-account access between RoleX and RoleY.
1. First, we run the COPY command to bring the data from the Amazon S3 bucket into Amazon Redshift:
copy table_name from 's3://<bucketname>/crosscopy1.csv' iam_role 'arn:aws:iam::Redshift_Account_ID:role/RoleY,arn:aws:iam::S3_Account_ID:role/RoleX' delimiter ',' removequotes;
Replace the below values in the COPY command:
- table_name: the Redshift table that we want to copy the Amazon S3 data into.
- s3://<bucketname>/crosscopy1.csv: the S3 path of the file that we want to copy the data from.
- Redshift_Account_ID: the AWS account ID for the Redshift account.
- RoleY: the second IAM role we created.
- RoleX: the first IAM role we created.
- S3_Account_ID: the AWS account ID for the S3 account.
2. Then, we should run the UNLOAD command to unload the data from Amazon Redshift to the AWS S3 bucket.
unload ('select * from table_name') to 's3://<bucketname>/folder/table_name_' iam_role 'arn:aws:iam::Redshift_Account_ID:role/RoleY,arn:aws:iam::S3_Account_ID:role/RoleX' KMS_KEY_ID 'ARN_KMS_KEY_ID' ENCRYPTED;
Replace the below values in the UNLOAD command:
- table_name: the Redshift table that we want to unload to the Amazon S3 bucket.
- s3://<bucketname>: the S3 path to unload the Redshift data to.
- Redshift_Account_ID: the AWS account ID for the Redshift account.
- RoleY: the second IAM role we created.
- RoleX: the first IAM role we created.
- S3_Account_ID: the AWS account ID for the S3 account.
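The role-chaining syntax used in both commands can be assembled programmatically as well. A minimal Python sketch with placeholder account IDs, bucket, and table names:

```python
# Placeholder identifiers -- replace with real account IDs, bucket, and table
REDSHIFT_ACCOUNT_ID = "222222222222"
S3_ACCOUNT_ID = "111111111111"
BUCKET = "my-example-bucket"
TABLE = "table_name"

# Role chaining: the cluster's attached role (RoleY) comes first, the
# S3-account role it assumes (RoleX) second, comma-separated with no spaces.
iam_role_chain = (
    f"arn:aws:iam::{REDSHIFT_ACCOUNT_ID}:role/RoleY,"
    f"arn:aws:iam::{S3_ACCOUNT_ID}:role/RoleX"
)

# COPY statement as shown above
copy_sql = (
    f"copy {TABLE} from 's3://{BUCKET}/crosscopy1.csv' "
    f"iam_role '{iam_role_chain}' delimiter ',' removequotes;"
)

# UNLOAD statement as shown above (KMS key ARN left as a placeholder)
unload_sql = (
    f"unload ('select * from {TABLE}') to 's3://{BUCKET}/folder/{TABLE}_' "
    f"iam_role '{iam_role_chain}' kms_key_id '<ARN_KMS_KEY_ID>' encrypted;"
)

print(copy_sql)
print(unload_sql)
```

The resulting strings can be run against the cluster with any Redshift SQL client.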
Important Notes:
- The steps above work in all cases, but the exact COPY or UNLOAD syntax may vary with the data format and options used.
- The above steps assume that the Redshift cluster and the Amazon S3 bucket are in the same AWS Region. If not, we must add the REGION parameter to the COPY or UNLOAD command.
- The AWS KMS permissions are not needed if the Amazon S3 bucket is not encrypted with an AWS KMS key.
[Need help with more AWS queries? We’d be happy to assist]
Conclusion
To conclude, today we discussed how our Support Engineers help our customers COPY or UNLOAD data from Amazon Redshift to an Amazon S3 bucket.