Migrating Data Backup to AWS: Step by Step

Gilad David Maayan
Published 05/26/2021

Data backups are critical infrastructure for any organization. Infrastructure as a service (IaaS) cloud providers offer compelling advantages for backups, but for enterprises with a large investment in legacy backup systems, migration can be challenging.

This article will explain the benefits of migrating backup to the cloud, describe the backup offering in the Amazon Web Services (AWS) cloud, and show one way to migrate large data volumes to AWS.

Why Should You Migrate Backups to the Cloud?

There are several main reasons for moving your backups to the cloud:

Reducing costs

Building and maintaining a local data backup infrastructure can be very expensive. Costs include time spent by IT staff managing backups, purchasing and maintaining redundant servers, software licenses, and storage equipment.

By migrating the entire backup operation to the cloud, organizations can eliminate local equipment such as LTO tapes, disk drives, and servers, and the time required to manage them. Cloud-based backup systems are billed per actual use, usually according to storage and data transfer volumes.

Elastic scalability

Local backups are difficult to scale as data volumes grow, because every increase in capacity means buying and installing more hardware. By contrast, cloud backup services offer effectively unlimited storage. You can add storage resources or user licenses as data volumes grow, and the provider handles the underlying scaling automatically.

Faster disaster recovery

Cloud-based data backup can be significantly faster to recover during disasters. Depending on the system, you should be able to bring up your backups within minutes or even seconds. A local backup, on the other hand, can take longer to recover. Usually, local recovery requires IT teams to deal with scripting, as well as manual administration and intervention, all of which can take a lot of time and effort.

What is AWS Backup?

AWS Backup is a fully managed, cloud-based service that centralizes and automates backups. It takes over backup tasks that were previously implemented service by service with custom scripts.

You can use AWS Backup to manage backups for cloud-native systems, or for systems you have migrated to the Amazon cloud. For systems that remain on-premises, Amazon offers the AWS Storage Gateway (backups via Storage Gateway are outside the scope of this article).

AWS Backup provides a wide range of features and capabilities, including policy-based backup, tag-based backup policies, backup activity monitoring, and lifecycle management policies.

Policy-Based Backup Solutions

AWS Backup uses backup policies, also called backup plans, to define backup specifications such as schedule and retention. You can apply a backup plan to AWS resources across the services that AWS Backup supports.

Once you have a backup strategy in place, you can create unique and separate backup plans for each business objective and regulatory compliance standard. You can implement backup plans across various applications and easily scale according to changing business needs and requirements.
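As a rough illustration of what a backup plan contains, here is a minimal Python sketch that builds a plan document in the shape accepted by the AWS Backup API. The plan name, vault name, schedule, and retention period are hypothetical values chosen for the example, and the helper function is not part of any AWS library.

```python
def build_backup_plan(plan_name, vault_name, schedule_cron, retention_days):
    """Return a backup plan document with a single scheduled rule."""
    return {
        "BackupPlanName": plan_name,
        "Rules": [
            {
                "RuleName": f"{plan_name}-rule",
                "TargetBackupVaultName": vault_name,
                # AWS cron syntax; this one means every day at 05:00 UTC.
                "ScheduleExpression": schedule_cron,
                "Lifecycle": {"DeleteAfterDays": retention_days},
            }
        ],
    }


# Hypothetical plan: daily backups kept for 35 days.
plan = build_backup_plan("daily-compliance", "Default", "cron(0 5 ? * * *)", 35)

# With boto3 installed and credentials configured, the plan could be
# submitted along these lines:
#   import boto3
#   boto3.client("backup").create_backup_plan(BackupPlan=plan)
```

Keeping one builder per business objective makes it easy to maintain separate plans, for example one per compliance standard, that differ only in schedule and retention.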

Tag-Based Backup Policies

You can use tags when implementing backup plans. Tag the relevant AWS resources, and a backup plan can then select them by tag rather than one by one. Tags also increase visibility into resources, and are useful for setting up monitoring that helps you ensure all critical applications and data are backed up.
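To make tag-based selection concrete, the following Python sketch builds a resource selection that matches every resource carrying a given tag. The tag key and value, the selection name, and the IAM role ARN are all placeholders invented for this example.

```python
def build_tag_selection(selection_name, iam_role_arn, tag_key, tag_value):
    """Return a backup selection that matches resources by a single tag."""
    return {
        "SelectionName": selection_name,
        "IamRoleArn": iam_role_arn,
        "ListOfTags": [
            {
                "ConditionType": "STRINGEQUALS",
                "ConditionKey": tag_key,
                "ConditionValue": tag_value,
            }
        ],
    }


# Hypothetical selection: everything tagged backup=daily.
selection = build_tag_selection(
    "tagged-daily",
    "arn:aws:iam::111122223333:role/ExampleBackupRole",  # placeholder ARN
    "backup",
    "daily",
)

# With boto3, the selection could be attached to an existing plan roughly so:
#   boto3.client("backup").create_backup_selection(
#       BackupPlanId="<your-plan-id>", BackupSelection=selection)
```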

Backup Activity Monitoring

AWS Backup comes with a dashboard that simplifies backup and restore activities across your AWS services. To access the dashboard, log into the AWS Backup console, where you can see the status of recent backup jobs. The console also lets you start restore jobs for your AWS services and resources.
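The same job status information is available programmatically. The Python sketch below summarizes a list of job records shaped like the entries the AWS Backup API returns when listing backup jobs. The sample records are made up for illustration, and which states count as needing attention is an assumption you may want to adjust.

```python
from collections import Counter

# Assumption: these states are the ones an operator should look at.
ATTENTION_STATES = {"FAILED", "ABORTED", "EXPIRED"}


def summarize_backup_jobs(jobs):
    """Count jobs per state and list the resources whose jobs need attention."""
    counts = Counter(job["State"] for job in jobs)
    needs_attention = [
        job["ResourceArn"] for job in jobs if job["State"] in ATTENTION_STATES
    ]
    return dict(counts), needs_attention


# Made-up sample records; real ones would come from something like
#   boto3.client("backup").list_backup_jobs()["BackupJobs"]
sample_jobs = [
    {"BackupJobId": "job-1", "State": "COMPLETED",
     "ResourceArn": "arn:aws:ec2:us-east-1:111122223333:volume/vol-aaa"},
    {"BackupJobId": "job-2", "State": "FAILED",
     "ResourceArn": "arn:aws:elasticfilesystem:us-east-1:111122223333:file-system/fs-bbb"},
    {"BackupJobId": "job-3", "State": "COMPLETED",
     "ResourceArn": "arn:aws:ec2:us-east-1:111122223333:volume/vol-ccc"},
]

counts, needs_attention = summarize_backup_jobs(sample_jobs)
print(counts)
print(needs_attention)
```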

Lifecycle Management Policies

Backup storage costs can grow quickly. To ensure you are not wasting resources on infrequently accessed backup data, AWS Backup offers lifecycle management policies. The policies let you choose the storage tier for your backup data. For example, you can set up your policies to move backup data from warm storage to a cold storage tier. You define the schedule once, and the policies then perform the transition automatically.
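A lifecycle policy boils down to two numbers: when to move a backup to cold storage and when to delete it. The Python sketch below builds such a setting and checks one constraint that AWS documents, namely that backups moved to cold storage must stay there for at least 90 days. The helper name and the sample values are hypothetical, and the 90-day figure should be checked against the current AWS Backup documentation.

```python
# Minimum time in cold storage, per AWS Backup documentation at the time of writing.
MIN_COLD_STORAGE_DAYS = 90


def build_lifecycle(move_to_cold_after_days, delete_after_days):
    """Return a lifecycle setting, rejecting retention that is too short."""
    earliest_delete = move_to_cold_after_days + MIN_COLD_STORAGE_DAYS
    if delete_after_days < earliest_delete:
        raise ValueError(
            f"DeleteAfterDays must be at least {earliest_delete} "
            f"when moving to cold storage after {move_to_cold_after_days} days"
        )
    return {
        "MoveToColdStorageAfterDays": move_to_cold_after_days,
        "DeleteAfterDays": delete_after_days,
    }


# Hypothetical policy: warm for 30 days, then cold, deleted after one year.
lifecycle = build_lifecycle(30, 365)
```

The resulting dictionary can be used as the Lifecycle value of a backup plan rule.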

What is AWS DataSync?

AWS provides a portfolio of data transfer services for migration projects. These include AWS Storage Gateway, which lets you connect and extend on-premises applications to AWS storage, and AWS Snowball, a storage appliance that lets you physically ship up to 80 TB of data per device to an Amazon data center.

This article focuses on AWS DataSync, which can transfer large data volumes from on-premises data centers to Amazon storage services, such as Amazon S3 and Amazon Elastic File System (EFS). The service can automatically handle a wide range of data transfer tasks that might otherwise slow down your migration process or decrease IT and development productivity.

AWS DataSync can, for example, automatically run your instances, handle encryption processes, manage scripts, perform network optimization, and validate data integrity. You can also leverage the service to implement continuous replication for data protection and recovery processes, by copying data over a fast Internet connection or AWS Direct Connect.

Migrating Backup Storage to AWS with DataSync: Step by Step

Once you are ready to move backup to AWS, here is how to transfer existing backup data into the Amazon cloud via the DataSync service.

1. Agent Setup

Start by deploying the on-premises DataSync agent as a virtual machine (VM), via the AWS console or CLI associated with your AWS account. See Amazon’s instructions for deploying the agent in a VMware, KVM or Hyper-V environment.

After you set up the agent, you need to activate it before it can be automatically managed and updated by AWS. The agent then becomes associated with your AWS account and gets recorded in the DataSync console and API.

Note that you should deploy the agent close to the source file system, to minimize the distance over which files travel on lower-performance protocols such as NFS. Over the WAN, the DataSync agent transfers files using Amazon’s proprietary, high-speed protocol.

2. Creating a Task for File Transfer

Once the agent is deployed, you can create a migration task using the AWS console or CLI. Configure the source location, which can be defined as an NFS file path, an SMB file path, or an S3-compatible object storage device (see the DataSync documentation).

If you are working with NFS, export the file share with the no_root_squash option so that DataSync has read access to all files. Use this command to verify that the server is exporting the required file path:

showmount -e {address-of-nfs-server}

After you create the migration task, the DataSync agent automatically mounts the specified path.
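If you script this step, it can help to confirm the export before defining the source location. The Python sketch below parses typical showmount -e output and builds the parameters for an NFS source location in the shape the DataSync API expects. The host name, export paths, and agent ARN are placeholders, and the parser assumes the common "Export list for ..." header followed by one export per line.

```python
def exported_paths(showmount_output):
    """Extract export paths from the text printed by `showmount -e`."""
    paths = []
    for line in showmount_output.splitlines():
        line = line.strip()
        if not line or line.startswith("Export list"):
            continue  # skip blank lines and the header
        paths.append(line.split()[0])  # first column is the exported path
    return paths


def build_nfs_location(server_hostname, export_path, agent_arn):
    """Return parameters for a DataSync NFS source location."""
    return {
        "ServerHostname": server_hostname,
        "Subdirectory": export_path,
        "OnPremConfig": {"AgentArns": [agent_arn]},
    }


# Made-up output for illustration.
sample_output = """Export list for nfs.example.internal:
/exports/backups 10.0.0.0/24
/exports/home    *
"""

paths = exported_paths(sample_output)
nfs_location = build_nfs_location(
    "nfs.example.internal",
    "/exports/backups",
    "arn:aws:datasync:us-east-1:111122223333:agent/agent-0123456789abcdef0",
)

# With boto3, the location could be created roughly like this:
#   boto3.client("datasync").create_location_nfs(**nfs_location)
```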

Next, configure the target for your migration. You can set this up for either Amazon S3 buckets or Amazon EFS file systems. Either way, DataSync can securely access the relevant destination location. Here are key aspects to consider:

  • During data transfers to S3: DataSync can automatically create IAM roles that allow secure access to the bucket.
  • During data transfers to EFS: DataSync mounts the file system from within your VPC using elastic network interfaces.

DataSync can create, manage and delete elastic network interfaces (ENIs) on your behalf.
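For an S3 destination, the two pieces involved are an IAM role that DataSync is allowed to assume and a location that points at the bucket through that role. The Python sketch below builds both as plain data. The bucket ARN, role ARN, and prefix are placeholders, and if you let the console create the role for you, only the location part applies.

```python
import json


def datasync_trust_policy():
    """Return a trust policy that lets the DataSync service assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "datasync.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_s3_location(bucket_arn, access_role_arn, prefix):
    """Return parameters for a DataSync S3 destination location."""
    return {
        "S3BucketArn": bucket_arn,
        "Subdirectory": prefix,
        "S3Config": {"BucketAccessRoleArn": access_role_arn},
    }


# Placeholder identifiers for illustration only.
s3_location = build_s3_location(
    "arn:aws:s3:::example-backup-bucket",
    "arn:aws:iam::111122223333:role/ExampleDataSyncS3Role",
    "legacy-backups",
)
trust_policy_json = json.dumps(datasync_trust_policy(), indent=2)

# With boto3, the location could be created roughly like this:
#   boto3.client("datasync").create_location_s3(**s3_location)
```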

3. Configuring the Task

DataSync automatically performs integrity checks on the data it transmits. Once the transfer is complete, DataSync can perform additional verification by comparing all files in the source to all files in the destination. To skip this additional check, clear the “Enable verification” option.

Your Internet connection may be shared by several workloads at the same time. To leave capacity for the others, you can limit the bandwidth DataSync uses by selecting the “set bandwidth limit” option.
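When a task is created through the API rather than the console, the verification and bandwidth settings are expressed as task options. The Python sketch below shows one plausible mapping, assuming the console's verification checkbox corresponds to the VerifyMode option and the bandwidth limit to BytesPerSecond, where -1 means no limit. The helper names and the 100 Mbps figure are illustrative only.

```python
def mbps_to_bytes_per_second(mbps):
    """Convert megabits per second to bytes per second (1 Mbps = 125,000 B/s)."""
    return int(mbps * 1_000_000 / 8)


def build_task_options(verify=True, bandwidth_mbps=None):
    """Return DataSync task options for verification and bandwidth."""
    return {
        # Compare the whole source and destination after transfer, or skip it.
        "VerifyMode": "POINT_IN_TIME_CONSISTENT" if verify else "NONE",
        # -1 tells DataSync to use all available bandwidth.
        "BytesPerSecond": (
            -1 if bandwidth_mbps is None
            else mbps_to_bytes_per_second(bandwidth_mbps)
        ),
    }


# Hypothetical setting: verify everything, cap the transfer at 100 Mbps.
options = build_task_options(verify=True, bandwidth_mbps=100)

# With boto3, the options could be passed when creating the task:
#   boto3.client("datasync").create_task(
#       SourceLocationArn="<source-arn>",
#       DestinationLocationArn="<destination-arn>",
#       Options=options)
```

Capping at 100 Mbps works out to 12,500,000 bytes per second, which is the unit the API expects.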

When done, select “Next” twice to confirm your configuration and complete the task. You can now click “Start” to run the migration task.
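The equivalent of clicking “Start” can also be scripted. The Python sketch below starts a task execution and polls until it reaches a final status. It takes the client as an argument so it can run here against a small stand-in class without AWS access. The stand-in, the task ARN, and the polling interval are assumptions made for the example, and a real run would pass a boto3 DataSync client instead.

```python
import time

# Statuses after which the execution will not change any more.
FINAL_STATUSES = {"SUCCESS", "ERROR"}


def run_task(client, task_arn, poll_seconds=30, sleep=time.sleep):
    """Start a DataSync task execution and wait for its final status."""
    execution_arn = client.start_task_execution(TaskArn=task_arn)["TaskExecutionArn"]
    while True:
        status = client.describe_task_execution(
            TaskExecutionArn=execution_arn
        )["Status"]
        if status in FINAL_STATUSES:
            return status
        sleep(poll_seconds)


class StubDataSyncClient:
    """Offline stand-in that replays a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def start_task_execution(self, TaskArn):
        return {"TaskExecutionArn": TaskArn + "/execution/exec-0123456789abcdef0"}

    def describe_task_execution(self, TaskExecutionArn):
        return {"Status": next(self._statuses)}


stub = StubDataSyncClient(["LAUNCHING", "TRANSFERRING", "VERIFYING", "SUCCESS"])
final_status = run_task(
    stub,
    "arn:aws:datasync:us-east-1:111122223333:task/task-0123456789abcdef0",
    sleep=lambda seconds: None,  # skip real waiting in this offline demo
)
print(final_status)

# Against AWS, the call would look roughly like:
#   run_task(boto3.client("datasync"), "<your-task-arn>")
```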

Conclusion

In this article I covered the advantages of migrating backup infrastructure to the cloud, explained the AWS Backup solution, which manages backups inside the Amazon cloud, and discussed how to migrate large volumes of data into Amazon using the DataSync tool. I hope this will be valuable in your organization’s journey to a resilient, secure cloud-based backup solution.

 

Author Bio: Gilad David Maayan

Gilad David Maayan is a technology writer who has worked with over 150 technology companies including SAP, Imperva, Samsung NEXT, NetApp and Ixia, producing technical and thought leadership content that elucidates technical solutions for developers and IT leadership. Today he heads Agile SEO, the leading marketing agency in the technology industry.

LinkedIn: https://www.linkedin.com/in/giladdavidmaayan/