4.8 RAID, Volumes & Snapshots
RAID
RAID stands for Redundant Array of Independent Disks. It combines several disks (volumes) into one logical volume that the operating system sees as a single disk. On EC2, RAID across EBS volumes is configured at the software level, inside the operating system. The common RAID levels are:
RAID 0: Striped, No Redundancy, Good Performance. Often used in gaming PCs. You have two or more disks and stripe data across them. The pro is good performance. The con is that there is no redundancy. For example, if you create a single volume across 3 or 4 disks and one of those disks fails, you lose the entire RAID array.
RAID 1: Mirrored, Redundancy. You have one disk and mirror an exact copy of it to another disk. The pro is that if one disk fails, you can keep using the array from the other disk. This is called redundancy.
RAID 5: Good for reads, bad for writes. AWS does not recommend ever putting RAID 5 on EBS, because the parity writes consume some of the IOPS available to your volumes. You have 3 or more disks; data is striped across them together with parity, which is distributed across the disks. Parity works like a checksum: if one of the disks fails, you can rebuild its contents from the remaining data and the parity.
RAID 10: Striped & Mirrored, Good Redundancy, Good Performance. RAID 10 = RAID 1 + RAID 0.
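The parity idea behind RAID 5 can be sketched with XOR. This is only an illustration, assuming one stripe of three equal-sized data blocks with made-up contents; a real RAID 5 implementation also rotates the parity block across the disks, which is not shown here. The helper name xor_blocks is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks, as if each lived on a different disk
# (hypothetical contents, chosen only for the example).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk, then rebuild its block from the
# surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

Because XOR is its own inverse, any single missing block can be recovered this way, which is why RAID 5 tolerates the loss of exactly one disk.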
RAID is useful when you are not getting the disk IO you require: you have a single volume, you have provisioned it to its maximum size, and you still need higher disk IO. The solution can be to attach multiple volumes and create a RAID array across them to get the disk IO you need. On AWS, your RAID arrays will usually be either RAID 0 or RAID 10, because good performance is always preferred. In short, RAID is a way of taking several EBS volumes and putting them together so that they act as a single volume.
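To see why striping raises disk IO, here is a minimal sketch of how a RAID 0 layout could map logical blocks onto volumes. It assumes a stripe unit of one block and round-robin placement; the function name stripe_location is hypothetical, and real software RAID uses larger, configurable stripe sizes.

```python
def stripe_location(logical_block, num_volumes):
    """Return (volume_index, block_on_volume) for a logical block in RAID 0."""
    return logical_block % num_volumes, logical_block // num_volumes


# Six consecutive logical blocks spread over three volumes.
layout = [stripe_location(block, 3) for block in range(6)]
for block, (volume, offset) in enumerate(layout):
    print(f"logical block {block} -> volume {volume}, block {offset}")
```

Consecutive blocks land on different volumes, so a large sequential read or write is served by all volumes at once instead of queuing on one.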
In the exam, a question could be: how do you improve your disk IO? The answer could be to add more EBS volumes and create a RAID array (0 or 10) that stripes across them. The question could also describe a database that is not one of the AWS database services, such as Cassandra, which you have to install on your own EC2 instance. When you want higher disk IO for it, you can use RAID.
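As a rough sizing aid for that kind of question, the sketch below estimates how many volumes a target IOPS figure would need. It assumes that striped IOPS add up linearly across volumes and that RAID 10 needs a mirror for every striped volume; it ignores instance-level limits, which also cap the total. The function name and the numbers are hypothetical.

```python
import math


def volumes_needed(target_iops, per_volume_iops, level=0):
    """Estimate the volume count for RAID 0 or RAID 10 (idealized, linear scaling)."""
    if level not in (0, 10):
        raise ValueError("only RAID 0 and RAID 10 are modelled here")
    # A stripe needs at least two members to be a RAID array at all.
    striped = max(2, math.ceil(target_iops / per_volume_iops))
    # RAID 10 mirrors every striped volume, doubling the count.
    return striped if level == 0 else striped * 2


# Hypothetical figures: 25,000 IOPS wanted, 10,000 IOPS per volume.
print(volumes_needed(25_000, 10_000, level=0))
print(volumes_needed(25_000, 10_000, level=10))
```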
Lab - use RAID on Windows instances (optional)
In the security group of your instance, add an inbound rule for RDP (Remote Desktop Protocol, TCP port 3389), which Microsoft developed to give you a graphical interface for connecting to another computer over a network. After you launch the instance, the username is Administrator; use Get Windows password to retrieve the password, then use MSTSC (Remote Desktop Connection) on your local Windows PC to connect to the instance. Once you are logged in, open Disk Management, leave the boot volume alone, and delete the existing volumes on all the attached disks. Then create a new striped volume (RAID 0) and add all the attached disks to it. This gives you a RAID 0 array that stripes across all your attached volumes.
How to take a snapshot of a RAID array
Problem: A snapshot excludes data held in cache by applications and the OS. This tends not to matter for a single volume, but with multiple volumes in a RAID array it can be a problem because of the interdependencies between the volumes in the array.
Solution: Take an application-consistent snapshot.
To do this, you need to:
Stop the application from writing to disk.
Flush all caches to the disk.
How can we do this? (Choose one of the methods below.)
Freeze the file system.
Unmount the RAID array.
Shut down the associated EC2 instance, take the snapshot, and power it up again. (easiest way)
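The "freeze the file system" method could be scripted roughly as follows. This is a sketch under several assumptions that are not in the notes above: a Linux instance with the fsfreeze utility, the AWS CLI installed and configured, and a hypothetical mount point and volume IDs. The run parameter is an illustrative hook so the command sequence can be checked without touching a real system. Stopping the application itself is left to the caller.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def frozen(mount_point, run=subprocess.run):
    """Freeze a file system (flushing pending writes), then always thaw it."""
    run(["fsfreeze", "-f", mount_point], check=True)
    try:
        yield
    finally:
        run(["fsfreeze", "-u", mount_point], check=True)


def snapshot_raid(mount_point, volume_ids, run=subprocess.run):
    """Start a snapshot of every RAID member volume while writes are frozen."""
    with frozen(mount_point, run):
        for volume_id in volume_ids:
            run(
                ["aws", "ec2", "create-snapshot",
                 "--volume-id", volume_id,
                 "--description", f"RAID member {volume_id}"],
                check=True,
            )


# Example call (hypothetical mount point and volume IDs, needs root and AWS access):
# snapshot_raid("/mnt/raid", ["vol-0aaa", "vol-0bbb"])
```

The try/finally matters here: if a snapshot command fails, the file system is still thawed, so the instance is not left with a frozen mount.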