DataSync fully automates the data transfer. It comes with retry and network resiliency mechanisms, network optimizations, built-in task scheduling, monitoring via the DataSync API and Console, and CloudWatch metrics, events and logs that provide granular visibility into the transfer process. DataSync performs data integrity verification both during the transfer and at the end of the transfer.
DataSync provides end-to-end security and integrates directly with AWS storage services. All data transferred between the source and destination is encrypted via TLS, and access to your AWS storage is enabled via built-in AWS security mechanisms such as IAM roles. DataSync can also be used with VPC endpoints, so that data transferred between an organization and AWS does not traverse the public internet, further increasing the security of data as it is copied over the network.
You will need to define a source account and bucket, and a destination account and bucket. In these examples our source will be s3://source and the destination will be s3://destination.
In the source AWS account, create an IAM role that allows DataSync to access the S3 bucket in your AWS destination account. Give the role the following permissions policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::destination"
    },
    {
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListMultipartUploadParts",
        "s3:PutObject",
        "s3:GetObjectTagging",
        "s3:PutObjectTagging"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::destination/*"
    }
  ]
}
Also add a trust policy to this role so that the DataSync service can assume it:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "datasync.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Note the ARN of the role you just created, because you will need it later. For this example it is arn:aws:iam::1234567890:role/Datasync-destination-role
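If you would rather create the role from the terminal than click through the IAM console, something like the following should work. It is only a sketch: it assumes your CLI profile points at the source account and that you saved the two JSON documents above as destination-access.json and datasync-trust.json (both file names are made up for this example). As a safety net it just prints the commands unless you set APPLY=1.

```shell
#!/usr/bin/env bash
ROLE_NAME="Datasync-destination-role"         # matches the example ARN in the text
ACCESS_POLICY_FILE="destination-access.json"  # hypothetical file: the S3 permissions policy
TRUST_POLICY_FILE="datasync-trust.json"       # hypothetical file: the DataSync trust policy

# Safety guard: print each command instead of running it unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# Create the role with the trust policy that lets DataSync assume it.
run aws iam create-role \
  --role-name "$ROLE_NAME" \
  --assume-role-policy-document "file://$TRUST_POLICY_FILE"

# Attach the S3 permissions to the role as an inline policy.
run aws iam put-role-policy \
  --role-name "$ROLE_NAME" \
  --policy-name "${ROLE_NAME}-s3-access" \
  --policy-document "file://$ACCESS_POLICY_FILE"

# Print the role ARN so you can copy it for the later steps.
run aws iam get-role --role-name "$ROLE_NAME" --query 'Role.Arn' --output text
```

Run it once as-is to review what would be executed, then again with APPLY=1 once the commands look right.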
Next, log in to your AWS destination account and go to the S3 bucket (s3://destination) that you want to land the data in. Open Permissions and, under Object Ownership, choose Edit and select ACLs disabled (recommended).
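The same Object Ownership setting can be applied from the CLI, where "ACLs disabled" corresponds to the BucketOwnerEnforced value. A rough sketch, assuming a CLI profile for the destination account named destination-account (a made-up name); it only prints the commands unless APPLY=1 is set.

```shell
#!/usr/bin/env bash
DEST_BUCKET="destination"
DEST_PROFILE="destination-account"   # hypothetical AWS CLI profile for the destination account

# Safety guard: print each command instead of running it unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# Disable ACLs so the bucket owner owns every object written to the bucket.
run aws s3api put-bucket-ownership-controls \
  --profile "$DEST_PROFILE" \
  --bucket "$DEST_BUCKET" \
  --ownership-controls 'Rules=[{ObjectOwnership=BucketOwnerEnforced}]'

# Read the setting back to confirm it took effect.
run aws s3api get-bucket-ownership-controls \
  --profile "$DEST_PROFILE" \
  --bucket "$DEST_BUCKET"
```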
Still on your destination S3 bucket (s3://destination), apply the following bucket policy. Note that the Principal in the first statement is the role you created in the AWS source account, and the Principal in the second statement is the source account itself.
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "DataSyncCreateS3LocationAndTaskAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::1234567890:role/Datasync-destination-role"
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListMultipartUploadParts",
        "s3:PutObject",
        "s3:GetObjectTagging",
        "s3:PutObjectTagging"
      ],
      "Resource": [
        "arn:aws:s3:::destination",
        "arn:aws:s3:::destination/*"
      ]
    },
    {
      "Sid": "DataSyncCreateS3Location",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::1234567890:root"
      },
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::destination"
    }
  ]
}
Save this.
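If you keep the bucket policy in a file, it can also be applied from the CLI instead of pasting it into the console. A minimal sketch, assuming the policy above is saved as destination-bucket-policy.json and that destination-account is a CLI profile for the destination account (both names are invented here); commands are only printed unless APPLY=1 is set.

```shell
#!/usr/bin/env bash
DEST_BUCKET="destination"
DEST_PROFILE="destination-account"                  # hypothetical AWS CLI profile
BUCKET_POLICY_FILE="destination-bucket-policy.json" # hypothetical file holding the policy above

# Safety guard: print each command instead of running it unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# Replace the bucket policy with the one stored in the file.
run aws s3api put-bucket-policy \
  --profile "$DEST_PROFILE" \
  --bucket "$DEST_BUCKET" \
  --policy "file://$BUCKET_POLICY_FILE"

# Print the policy that is now attached to the bucket.
run aws s3api get-bucket-policy \
  --profile "$DEST_PROFILE" \
  --bucket "$DEST_BUCKET" \
  --query Policy --output text
```

Keep in mind that put-bucket-policy replaces whatever policy is already on the bucket, so merge in any existing statements first.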
In your terminal, make sure that your AWS CLI profile is for the AWS source account. This is where you need the role ARN that you noted earlier.
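One quick way to check which account a profile resolves to is sts get-caller-identity. The sketch below assumes a profile called source-account (a made-up name) and takes the expected account ID from the example role ARN; it prints the command rather than running it unless APPLY=1 is set.

```shell
#!/usr/bin/env bash
SOURCE_ACCOUNT_ID="1234567890"    # account ID taken from the example role ARN
SOURCE_PROFILE="source-account"   # hypothetical AWS CLI profile for the source account

# Safety guard: print each command instead of running it unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# Prints the account ID the profile is authenticated against.
run aws sts get-caller-identity \
  --profile "$SOURCE_PROFILE" \
  --query Account --output text

# The value printed above should match this one before you continue.
echo "expected account: $SOURCE_ACCOUNT_ID"
```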
In your terminal run:
aws datasync create-location-s3 --s3-bucket-arn arn:aws:s3:::destination --s3-config '{"BucketAccessRoleArn":"arn:aws:iam::1234567890:role/Datasync-destination-role"}'
This creates a DataSync S3 location for your destination bucket and configures it to use the role ARN that you noted earlier. (Annoyingly, this step has to be done from the CLI; we should just be able to do it from the console, but oh well.)
If successful you should see something like:
{
  "LocationArn": "arn:aws:datasync:us-east-2:1234567890:location/loc-0b72deadbeefe2d4d3752"
}
Now that you have created the location for the destination bucket, log in to the source account in the console and select the Region that the source bucket resides in. Go to DataSync and create a new location for the source bucket, selecting Autogenerate to create the IAM role for this location.
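If you prefer to stay in the terminal, the create-location-s3 command used for the destination works for the source bucket too. The CLI does not autogenerate anything, so this sketch assumes a role already exists in the source account that lets DataSync read s3://source; the role name Datasync-source-role is hypothetical. Commands are only printed unless APPLY=1 is set.

```shell
#!/usr/bin/env bash
SOURCE_BUCKET_ARN="arn:aws:s3:::source"
# Hypothetical role in the source account that grants DataSync access to the source bucket.
SOURCE_ROLE_ARN="arn:aws:iam::1234567890:role/Datasync-source-role"
# Same JSON shape as the --s3-config value used for the destination location.
S3_CONFIG="{\"BucketAccessRoleArn\":\"$SOURCE_ROLE_ARN\"}"

# Safety guard: print each command instead of running it unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# Register the source bucket as a DataSync location.
run aws datasync create-location-s3 \
  --s3-bucket-arn "$SOURCE_BUCKET_ARN" \
  --s3-config "$S3_CONFIG"

# List all locations in the current Region; both buckets should show up.
run aws datasync list-locations
```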
Once you have created both the source and destination locations, navigate to Tasks on the DataSync page and select Create task. First, select the source location, then select Next.
Next, select the destination location.
Provide your task with a name and configure it to your specifications. When complete, choose Next.
Lastly, review your configuration and select Create task. You’re now ready to execute your task and start copying objects from the source S3 bucket to your destination S3 bucket.
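The task can also be created and started from the CLI. This is a sketch with placeholder values: the destination location ARN is the one from the example output earlier, while the source location ARN, the task ARN and the task name are invented and need to be replaced with your own. Commands are only printed unless APPLY=1 is set.

```shell
#!/usr/bin/env bash
# Placeholder: replace with the LocationArn of your source bucket location.
SOURCE_LOCATION_ARN="arn:aws:datasync:us-east-2:1234567890:location/loc-SOURCE-PLACEHOLDER"
# Taken from the example create-location-s3 output earlier in the text.
DEST_LOCATION_ARN="arn:aws:datasync:us-east-2:1234567890:location/loc-0b72deadbeefe2d4d3752"
TASK_NAME="source-to-destination"   # hypothetical task name

# Safety guard: print each command instead of running it unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# Create the task; the response contains a TaskArn.
run aws datasync create-task \
  --source-location-arn "$SOURCE_LOCATION_ARN" \
  --destination-location-arn "$DEST_LOCATION_ARN" \
  --name "$TASK_NAME"

# Placeholder: replace with the TaskArn returned by create-task.
TASK_ARN="arn:aws:datasync:us-east-2:1234567890:task/task-PLACEHOLDER"

# Start copying, then list the executions of the task to follow progress.
run aws datasync start-task-execution --task-arn "$TASK_ARN"
run aws datasync list-task-executions --task-arn "$TASK_ARN"
```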