
AzCopy to S3: Data transfer tutorial

Nov 1, 2021

AzCopy to S3 has made copying files, directories, and buckets an easy task.

As a part of our AWS Support Service, we offer solutions to every query that comes our way.

Let’s take a look at this in-depth tutorial about AzCopy by our skilled Support Engineers.

AzCopy to S3

AzCopy is a CLI utility that we use to copy files or blobs to or from a storage account. Today, we are going to take a look at how to use this handy tool to copy buckets, directories, and objects from AWS S3 to Azure Blob Storage with the help of AzCopy.

Before we begin, we have to provide the following authorization credentials:

  • To authorize with Azure Storage, use a SAS (Shared Access Signature) token or an Azure Active Directory (Azure AD) identity.
  • To authorize with AWS S3, use an AWS access key and a secret access key.

How to Authorize with Azure Storage

First, we will download AzCopy and choose how to provide authorization credentials to Azure Storage.

Our Support Engineers would like to point out that the examples in this tutorial assume your identity is authenticated via the AzCopy login command. This way, your Azure AD account authorizes access to data in Blob Storage.

However, we can also use a SAS token for authorization, as long as the token is appended to the resource URL in each AzCopy command.
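
For instance, either option looks like this in practice (the account, container, and SAS token values below are placeholders):

azcopy login

azcopy copy 'https://s3.amazonaws.com/mybucket/myobject' 'https://mystorageaccount.blob.core.windows.net/mycontainer/myblob?<SAS-token>'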

How to Authorize with AWS S3

Here, we will set the environment variables after getting hold of the AWS access key and secret access key:

  • macOS and Linux:
    export AWS_ACCESS_KEY_ID=<access-key>
    export AWS_SECRET_ACCESS_KEY=<secret-access-key>
  • Windows:
    set AWS_ACCESS_KEY_ID=<access-key>
    set AWS_SECRET_ACCESS_KEY=<secret-access-key>
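
If we are using PowerShell on Windows rather than cmd.exe, the equivalent assignments look like this:

$env:AWS_ACCESS_KEY_ID = '<access-key>'
$env:AWS_SECRET_ACCESS_KEY = '<secret-access-key>'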

AzCopy to S3: Copy objects, directories & buckets

AzCopy uses the Put Block From URL API to copy data directly between AWS S3 and the Azure storage servers. In other words, these copy operations don't use the network bandwidth of our machine.

Our Support Engineers would like to point out that the examples in this article enclose path arguments in single quotes ('').

In other words, we have to use single quotes in all command shells except cmd.exe. If we turn to cmd.exe, we will use double quotes ("") instead.
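
For instance, the object-copy example from the next section would look like this when run from cmd.exe, with double quotes instead:

azcopy copy "https://s3.amazonaws.com/mybucket/myobject" "https://mystorageaccount.blob.core.windows.net/mycontainer/myblob"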

Copy an object

We will use the same URL syntax for accounts with a hierarchical namespace.

Syntax

azcopy copy 'https://s3.amazonaws.com/<bucket-name>/<object-name>' 'https://<storage-account-name>.blob.core.windows.net/<container-name>/<blob-name>'

For instance:

azcopy copy 'https://s3.amazonaws.com/mybucket/myobject' 'https://mystorageaccount.blob.core.windows.net/mycontainer/myblob'

Copy a directory

We will use the same URL syntax for accounts with a hierarchical namespace.

Syntax

azcopy copy
'https://s3.amazonaws.com/<bucket-name>/<directory-name>'
'https://<storage-account-name>.blob.core.windows.net/<container-name>/<directory-name>'
--recursive=true

For instance:

azcopy copy 'https://s3.amazonaws.com/mybucket/mydirectory' 'https://mystorageaccount.blob.core.windows.net/mycontainer/mydirectory' --recursive=true

Copy contents of a directory

We can copy the contents of a directory without copying the directory itself by using the wildcard symbol (*).

Syntax

azcopy copy
'https://s3.amazonaws.com/<bucket-name>/<directory-name>/*'
'https://<storage-account-name>.blob.core.windows.net/<container-name>/<directory-name>'
--recursive=true

For instance:

azcopy copy 'https://s3.amazonaws.com/mybucket/mydirectory/*' 'https://mystorageaccount.blob.core.windows.net/mycontainer/mydirectory' --recursive=true

Copy a bucket

We will use the same URL syntax for accounts with a hierarchical namespace.

Syntax

azcopy copy 'https://s3.amazonaws.com/<bucket-name>'
'https://<storage-account-name>.blob.core.windows.net/<container-name>'
--recursive=true

For instance:

azcopy copy 'https://s3.amazonaws.com/mybucket' 'https://mystorageaccount.blob.core.windows.net/mycontainer' --recursive=true

Copy all buckets in a specific S3 region

We will use the same URL syntax for accounts with a hierarchical namespace.

Syntax

azcopy copy 'https://s3-<region-name>.amazonaws.com/'
'https://<storage-account-name>.blob.core.windows.net'
--recursive=true

For instance:

azcopy copy 'https://s3-eu-north-1.amazonaws.com' 'https://mystorageaccount.blob.core.windows.net' --recursive=true

Copy all buckets in all regions

We will use the same URL syntax for accounts with a hierarchical namespace.

Syntax

azcopy copy 'https://s3.amazonaws.com/' 'https://<storage-account-name>.blob.core.windows.net' --recursive=true

For instance:

azcopy copy 'https://s3.amazonaws.com' 'https://mystorageaccount.blob.core.windows.net' --recursive=true

AzCopy to S3: Handling differences in object naming rules

Did you know that the naming conventions for AWS S3 buckets differ from those for Azure blob containers?

Fortunately, AzCopy takes care of this issue. In other words, AzCopy checks for naming collisions and attempts to resolve them while copying files.

For instance, AzCopy renames AWS S3 buckets whose names contain periods or consecutive hyphens: periods are replaced with hyphens, and a run of consecutive hyphens is replaced with a number indicating how many hyphens were in the run (so a bucket named my----bucket would become a container named my-4-bucket).

AzCopy to S3: Handling differences in object metadata

Another key point to look out for while copying files is that AWS S3 and Azure permit different sets of characters in metadata key names.

Fortunately, the AzCopy copy command resolves this with the --s2s-handle-invalid-metadata flag. The value of the flag indicates how incompatible metadata key names have to be handled. These are the supported values:

  • FailIfInvalid:
    Objects are not copied. AzCopy logs an error and includes it in the failed count in the transfer summary.
  • ExcludeIfInvalid:
    This is the default option. The invalid metadata is not included in the transferred object, and AzCopy logs a warning.
  • RenameIfInvalid:
    AzCopy resolves the invalid metadata key and copies the object using the resolved metadata key-value pair. If AzCopy is unable to rename the key, the object is not copied.
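
For instance, to fail the transfer of any object whose metadata keys are invalid on Azure, rather than silently dropping the metadata, we can run the following (the account and bucket names are placeholders):

azcopy copy 'https://s3.amazonaws.com/mybucket' 'https://mystorageaccount.blob.core.windows.net/mycontainer' --recursive=true --s2s-handle-invalid-metadata=FailIfInvalid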

AzCopy to S3: How to rename metadata keys

AzCopy performs the following steps to rename invalid metadata keys:

  1. First, it replaces all invalid characters with '_'.
  2. Then, it adds rename_ to the start of a new valid key. This key holds the original metadata value.
  3. After that, it adds rename_key_ to the start of a new valid key and uses it to save the original, invalid metadata key name.
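
As an illustration, suppose an object carries the invalid metadata key my key (it contains a space) with the value v1. Following the steps above, AzCopy would store two valid keys instead:

rename_my_key=v1
rename_key_my_key=my key

Here rename_my_key holds the original value, and rename_key_my_key preserves the original, invalid key name. (The key and value here are hypothetical examples, not output from a real transfer.)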

[Need a hand with Server Management? We are here to help.]

Conclusion

In brief, this tutorial by the skilled Support Engineers at Bobcares demonstrated how to use AzCopy to copy files from AWS S3 to Azure Blob Storage.
