3.9 Storage Gateway

Terms

  • On-premises software: software that is installed and runs on computers on the premises of the person or organization using it, rather than at a remote facility such as a server farm or cloud.

  • Off-premises software: software that runs at a remote facility and is accessed over a network. It is commonly called “software as a service” ("SaaS") or “cloud computing”.

  • Volumes: a volume is a logical disk that the operating system presents to you, such as the (C:), (D:), and (E:) drives on a Windows computer.

Overview

AWS Storage Gateway is a service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization's on-premises IT environment and AWS's storage infrastructure. The service enables you to securely store data in the AWS cloud for scalable and cost-effective storage.

AWS Storage Gateway's software appliance is available for download as a virtual machine image that you install on a host in your datacenter. Storage Gateway supports either VMware ESXi or Microsoft Hyper-V. Once you have installed your gateway and associated it with your AWS account through the activation process, you can use the AWS Management Console to create the storage gateway option that is right for you.

Structure

[Your Data Center (running a Storage Gateway)] ======> [asynchronously replicates your data up to] ======> [AWS's storage services (S3 or Glacier)]

Four Types of Storage Gateways (three gateways, counting the two Volume Gateway modes separately)

#1. File Gateway (NFS)

It is for flat files and object-based storage: it lets you store files as objects in S3.

#2. Volume Gateway (iSCSI)

It is for block-based storage: it presents disk volumes that can hold flat files, a database, or an operating system, and backs them up in the form of EBS snapshots. A snapshot is itself a kind of file/object, so it is ultimately stored in S3 (S3 is object-based storage, EBS is block-based storage). There are two types of Volume Gateway:

  • Stored Volumes: keep an entire copy of your data set on-premises and back it up to AWS asynchronously.

  • Cached Volumes: keep only the most recently accessed data on-premises; the rest of the data is stored in AWS.

#3. Tape Gateway (VTL)

This is a backup and archive solution. It allows you to create virtual tapes, send them to S3, and then archive those virtual tapes to Glacier for long-term retention.

In Detail

#1. File Gateway

Files are stored as objects in your S3 buckets and accessed through a Network File System (NFS) mount point. Ownership, permissions, and time-stamps are durably stored in S3 in the user-metadata of the object associated with the file. Once objects are transferred to S3, they can be managed as native S3 objects, and bucket-level features such as versioning, life-cycle management, and cross-region replication apply directly to objects stored in your bucket.

The on-premises Storage Gateway usually connects to AWS over the Internet, although Direct Connect can also be used. Alternatively, you can run the gateway inside an Amazon VPC (virtual private cloud) on an EC2 instance, in which case you do not need to run Storage Gateway on-premises at all.
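To make the metadata point concrete, here is a minimal sketch of collecting a file's ownership, permissions, and time-stamp into the kind of string key/value pairs that S3 user-metadata holds. The function name and the key names (`file-owner`, `file-group`, `file-permissions`, `file-mtime`) are assumptions made for illustration; they are not the keys File Gateway actually writes.

```python
import os
import stat
from typing import Dict


def file_to_object_metadata(path: str) -> Dict[str, str]:
    """Gather ownership, permissions, and mtime as string key/value pairs.

    The key names below are hypothetical; the real keys used by the
    gateway are not given in this section.
    """
    info = os.stat(path)
    return {
        "file-owner": str(info.st_uid),                        # numeric user id
        "file-group": str(info.st_gid),                        # numeric group id
        "file-permissions": oct(stat.S_IMODE(info.st_mode)),   # e.g. '0o640'
        "file-mtime": str(int(info.st_mtime)),                 # seconds since epoch
    }
```

All values are strings because S3 user-metadata is stored as string pairs alongside the object, which is why the file's attributes survive once it is managed as a native S3 object.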

#2. Volume Gateway

The volume interface presents your applications with disk volumes (i.e. virtual hard disks) using the iSCSI block protocol.

Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots. (Amazon Elastic Block Store, EBS, provides the virtual hard disks that we attach to EC2 instances, which are virtual machines. Because EBS is just the virtual hard disk attached to EC2, we do not treat it as yet another storage service here.)

Snapshots are incremental backups that capture only changed blocks. All snapshot storage is also compressed to minimize your storage charges.
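The idea of an incremental snapshot can be sketched in a few lines of Python. This is only an illustration of "capture only changed blocks" and not how EBS implements snapshots: the tiny block size, the use of SHA-256 digests to detect changes, and all function names are assumptions chosen to keep the example readable.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # hypothetical, unrealistically small so the example is easy to follow


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Split a volume image into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def incremental_snapshot(previous: bytes, current: bytes) -> Dict[int, bytes]:
    """Return {block_index: block_bytes} for blocks that are new or changed."""
    old_digests = [hashlib.sha256(b).hexdigest() for b in split_blocks(previous)]
    changed: Dict[int, bytes] = {}
    for index, block in enumerate(split_blocks(current)):
        digest = hashlib.sha256(block).hexdigest()
        # A block is captured if it did not exist before or its content differs.
        if index >= len(old_digests) or old_digests[index] != digest:
            changed[index] = block
    return changed
```

For example, `incremental_snapshot(b"AAAABBBBCCCC", b"AAAAXXXXCCCCDD")` captures only block 1 (rewritten) and block 3 (newly appended), leaving the two unchanged blocks out of the snapshot.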

There are two kinds of Volume Gateway. In both cases, just picture virtual hard disks that sit on-premises and are then backed up to AWS S3:

  • Stored Volumes: these let you store your primary data locally, while asynchronously backing up that data to AWS. Stored volumes provide your on-premises applications with low-latency access to their entire datasets, while providing durable, off-site backups. You can create storage volumes and mount them as iSCSI devices to your on-premises application servers (web servers, database servers etc.). Data written to your stored volumes is stored on your on-premises storage hardware. This data is asynchronously backed up to Amazon S3 in the form of Amazon EBS incremental snapshots. Stored Volumes are 1GB - 16TB in size. Note: volume storage = stored volumes.

  • Cached Volumes: these let you use S3 as your primary data storage while retaining frequently accessed data locally in your storage gateway. Cached volumes minimize the need to scale your on-premises storage infrastructure, while still providing your applications with low-latency access to their frequently accessed data. You can create storage volumes of 1GB - 32TB in size and attach to them as iSCSI devices from your on-premises application servers. The gateway stores the data written to these volumes in S3 and retains recently read data in its on-premises cache and upload buffer storage. In short: all the data that is written goes up to S3, and the most recently read data stays on-premises. Note: cache storage = cached volumes. Because S3 is object-based rather than block-based, the volume data kept in S3 is managed by the gateway service, and point-in-time backups of the volume are still taken as EBS snapshots.
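The cached-volume behaviour described above can be pictured with a toy model: every write lands in a backing store (standing in for S3), while only the most recently used blocks are kept in a small local cache (standing in for the gateway's on-premises cache disk). This is a simplified sketch under stated assumptions: the class name is hypothetical, eviction is plain least-recently-used, and writes are synchronous, whereas the real gateway uploads asynchronously through its upload buffer.

```python
from collections import OrderedDict
from typing import Dict, Tuple


class CachedVolumeModel:
    """Toy model: all data lives in the backing store, recent blocks stay local."""

    def __init__(self, cache_blocks: int) -> None:
        self.cache_blocks = cache_blocks              # local cache capacity in blocks
        self.backing_store: Dict[int, bytes] = {}     # stands in for S3
        self.local_cache: "OrderedDict[int, bytes]" = OrderedDict()  # on-premises cache

    def _remember(self, block: int, data: bytes) -> None:
        """Mark a block as most recently used and evict the oldest if over capacity."""
        self.local_cache[block] = data
        self.local_cache.move_to_end(block)
        while len(self.local_cache) > self.cache_blocks:
            self.local_cache.popitem(last=False)      # drop least recently used block

    def write(self, block: int, data: bytes) -> None:
        self.backing_store[block] = data              # primary copy goes to the cloud
        self._remember(block, data)

    def read(self, block: int) -> Tuple[bytes, bool]:
        """Return (data, served_locally)."""
        if block in self.local_cache:
            data = self.local_cache[block]
            self._remember(block, data)
            return data, True
        data = self.backing_store[block]              # cache miss: fetch from the cloud
        self._remember(block, data)
        return data, False
```

A stored volume would be the opposite arrangement: the full dataset stays on local disk and the cloud side holds only the asynchronous backup, so every read is served locally.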

#3. Tape Gateway

Tape Gateway offers a durable, cost-effective solution to archive your data in the AWS cloud. The VTL (Virtual Tape Library) interface it provides lets you leverage your existing tape-based backup application infrastructure to store data on virtual tape cartridges that you create on your tape gateway. Each tape gateway is pre-configured with a media changer and tape drives, which are available to your existing client backup applications as iSCSI devices. You add tape cartridges as you need to archive your data. It is supported by backup applications such as NetBackup, Backup Exec, and Veeam.

Exam Tips

  • File Gateway: for flat files only (text, image, video), stored directly on S3.

  • Volume Gateway: Stored Volumes - Entire dataset is stored on site and is asynchronously backed up to S3. Cached Volumes - Entire dataset is stored on S3 and the most frequently accessed data is cached on site.

  • Tape Gateway (Gateway Virtual Tape Library, VTL): used for backup and uses popular backup applications like NetBackup, Backup Exec, Veeam etc.
