Elastic Block Store vs. S3: AWS Storage Options
175 zettabytes. That's how much data is expected to be produced annually by 2025. To put that in perspective, 1 zettabyte is 1 trillion gigabytes. According to the same study, 59% of that will be stored in the cloud. Whether you need data stored in a relational database, a NoSQL database, or a file system, Amazon covers a wide range of business needs with several options.
While a multitude of different storage options is wonderful, it can lead new cloud developers into a state of analysis paralysis. After all, it might not be obvious which storage service is most suitable for your needs.
This article breaks down two of the more common data storage services provided by AWS, S3 and EBS, and explains which one is better suited to a given situation.
S3 and EBS: A Quick Look
S3, or Simple Storage Service, is Amazon's primary service for data storage. Along with EC2 and SQS, S3 was one of the flagship services Amazon introduced back in 2006. EBS, or Elastic Block Store, launched later, in 2008, and was created with EC2 instances in mind.
What problems were the folks at Amazon trying to solve when they created these highly durable storage services? Let's take a look at S3 first.
Why Cloud Storage: Universally Accessible Storage
All businesses need a safe place to store, retrieve, and analyze their data. However, prior to AWS, and the cloud in general, organizations had to rely on homegrown data facilities. Unfortunately, the more data a business consumes, the more unwieldy its data centers become. The organization needs to worry about a myriad of problems, such as security, disaster recovery, and maintenance.
Data analysis has become more and more critical to a business's success, which means that these organizations constantly need to hire and rely on individuals to curate and maintain their data centers. Organizations began spending too much time, money, and resources on data storage and retrieval, instead of spending that time on their core business objectives. This is the problem S3 alleviated.
Instead of managing all of this data in your own data center, you can now upload it to the cloud. Your organization will never have to worry about purchasing more server racks, dealing with faulty cables, or contending with natural disasters. S3 takes on the entire burden of managing the storage hardware.
What Should Be Stored Using S3?
Many AWS users rely on S3 to store frequently accessed static data. Static data broadly consists of photos, documents, HTML pages, and other customer data. S3 is well suited to static content because of the way it stores data.
Every object in S3 is stored as a key-value pair: a unique key within a bucket, with the object's data as the value. This makes it very easy to retrieve an object quickly once you know its key. Think of it like opening the back of a chemistry book and looking at the index. Say you need to access the information about carbon. You scan the alphabetized index and quickly find the page number with the information about carbon. S3 operates in much the same way.
This is why S3, unlike EBS, is such a good candidate for housing data you need to serve up quickly, such as photos or static web pages, especially if that data will be served to multiple locations.
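As a rough illustration of the key-value idea, the toy model below uses a plain Python dictionary to stand in for a bucket. It is not the S3 API: the class name ToyBucket, its methods, and the sample keys are all made up for this sketch.

```python
class ToyBucket:
    """In-memory stand-in for a bucket: every object lives under one unique key."""

    def __init__(self):
        self._objects = {}  # key (str) -> object body (bytes)

    def put(self, key, body):
        # Writing to an existing key replaces the whole object.
        self._objects[key] = body

    def get(self, key):
        # Retrieval is a direct lookup by key, like finding "carbon" in an index.
        return self._objects[key]


# Hypothetical keys and contents, purely for illustration.
bucket = ToyBucket()
bucket.put("chemistry/carbon.html", b"<h1>Carbon</h1>")
bucket.put("chemistry/oxygen.html", b"<h1>Oxygen</h1>")
print(bucket.get("chemistry/carbon.html"))  # b'<h1>Carbon</h1>'
```

The point of the sketch is that nothing has to be searched or traversed: knowing the key is enough to fetch the object.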
If S3 is optimal for serving static documents, what is EBS good for? Let's find out.
EBS: An EC2 Instance's Pack Horse
One huge difference between EBS (Elastic Block Store) and S3 is data accessibility. As long as an S3 bucket is open to the public, its endpoint can be reached from anywhere on the internet. An EBS volume, on the other hand, is attached to a particular EC2 instance. That may sound like a drawback, but it really isn't, because EBS was designed to house entirely different data than S3.
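To make the S3 side of that comparison concrete, here is a small sketch that builds the public web address of an object using the virtual-hosted-style URL format. The function name, bucket name, region, and key are hypothetical, and the URL only works if the object has actually been made publicly readable.

```python
from urllib.parse import quote


def public_object_url(bucket, region, key):
    """Build a virtual-hosted-style URL for an object in a publicly readable bucket."""
    # quote() percent-encodes characters such as spaces but keeps "/" in the key.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


# Hypothetical bucket, region, and key used purely for illustration.
print(public_object_url("example-photo-bucket", "us-east-1", "albums/2020/beach day.jpg"))
# https://example-photo-bucket.s3.us-east-1.amazonaws.com/albums/2020/beach%20day.jpg
```

An EBS volume has no equivalent address. Its contents are only visible from inside the instance it is attached to.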
Software applications are a prime example of data that would be stored on EBS. Let's say that you run a business that requires each EC2 instance to have Microsoft Exchange, SharePoint, and Office installed. Because an EBS volume is always attached to a particular EC2 instance, you could install all of these applications onto that volume. Then, you can replicate the volume (for example, by taking a snapshot and creating a new volume from it) and attach a copy with the exact same configuration to a different EC2 instance. Voilà! Quick and easy mass production of workspaces.
Another great advantage of EBS is its traditional method of navigation. Once an EBS volume is attached to an EC2 instance and mounted, it is navigated just like a file directory. Anyone with a Linux or Windows background would be able to find files and directories on an EBS volume. S3 operates differently and is not navigated in the same manner.
Recall that S3 does not store data in a file structure; it stores it as key-value pairs. As a result, the service does not accommodate navigation in the same fashion as EBS. After all, S3 specializes in uploading and retrieving data.
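One common way to do this kind of replication is to snapshot the configured volume, create a new volume from the snapshot, and attach it to the second instance. The sketch below only assembles the corresponding AWS CLI commands and prints them; it does not run anything. The function name, the IDs, the description text, and the device name are placeholders for illustration.

```python
import shlex


def clone_volume_commands(volume_id, availability_zone, instance_id, device="/dev/sdf"):
    """Return the AWS CLI commands, in order, for copying a volume to another instance.

    The snapshot ID and new volume ID are only known after the earlier commands
    have run, so they appear here as placeholders to fill in by hand.
    """
    return [
        ["aws", "ec2", "create-snapshot", "--volume-id", volume_id,
         "--description", "Configured workspace volume"],
        # The new volume must be created in the same Availability Zone
        # as the instance it will be attached to.
        ["aws", "ec2", "create-volume", "--snapshot-id", "<snapshot-id>",
         "--availability-zone", availability_zone],
        ["aws", "ec2", "attach-volume", "--volume-id", "<new-volume-id>",
         "--instance-id", instance_id, "--device", device],
    ]


# Hypothetical IDs; the commands are printed, not executed.
for command in clone_volume_commands("vol-0123456789abcdef0", "us-east-1a", "i-0123456789abcdef0"):
    print(shlex.join(command))
```

Repeating the last two commands for each new instance gives every workspace its own copy of the same configured volume.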
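Keys can still look like file paths. A key such as photos/cat.jpg is just one string in a flat namespace, and folder-like views are produced by listing keys that share a prefix up to a delimiter. The function below is a simplified imitation of that grouping, written against an ordinary Python list rather than the real service; the function name and the sample keys are invented for this sketch.

```python
def list_by_prefix(keys, prefix="", delimiter="/"):
    """Group flat keys the way a prefix-and-delimiter listing would.

    Returns (objects, common_prefixes): the keys directly "inside" the prefix,
    and the folder-like prefixes one level below it.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter behaves like a sub-folder.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


keys = ["index.html", "photos/cat.jpg", "photos/2020/dog.jpg", "docs/report.pdf"]
print(list_by_prefix(keys))             # (['index.html'], ['docs/', 'photos/'])
print(list_by_prefix(keys, "photos/"))  # (['photos/cat.jpg'], ['photos/2020/'])
```

No directory exists anywhere in this model. The "folders" appear only because of how the key strings are grouped at listing time.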
EBS vs. S3: How to Choose
Because an EBS volume is attached directly to an EC2 instance, you may think that EBS would read and write data faster than S3. That is generally the case. EBS offers lower read and write latency than S3 because the volume sits in the same Availability Zone as the instance and is accessed as a block device, whereas every S3 read or write is a request to a separate service over the network.
This low-latency behavior is perfect if you need to install a database on an EBS volume. A database would not be a good candidate for storage on S3: it makes constant small reads and writes, while S3 replaces a whole object on every write. The database may also be part of a software developer's environment, which is why it is tethered to a particular EC2 instance. Remember that the first S in S3 stands for Simple. A database is not simple, so it does not belong on S3.
Speaking of databases, what if you wanted to replicate an entire environment? Say, for example, you are onboarding a brand new developer to your company. Wouldn't it be great if you could just point them to a location that has everything they need for their testing environment? Well, you can with EBS. An EBS volume can be attached to their EC2 instance with everything they need to begin developing immediately, including an IDE, a database, and any other company-approved software.
Does this mean that you should never store static files on an EBS volume? Not necessarily. There may be instances where a static file doesn't need to be shared but is required by one particular EC2 instance. Retrieving that file from the volume would be quicker than fetching it from an S3 bucket. Web pages, however, are absolutely the domain of S3. This is because web pages need to be accessible from anywhere on the internet and have nothing to do with any particular EC2 instance.
The difference between EBS and S3 may seem nuanced at first, but upon further analysis it is clear they are two completely different beasts. S3 is optimized for housing static files. Remember the key word in its name: Simple Storage Service. S3 is brilliant for anything such as documents, photos, or general user data. If the data needs to be accessed on the internet, or from multiple different locations, then think of S3.
Elastic Block Store is for any data that is associated with a particular EC2 instance. If a user needs to set up a test environment for coding, EBS is the perfect candidate. If someone is talking about software installation, that should set off a red flag: this data is not for S3.
EBS and S3 are both valuable data storage tools, and each has its own benefits and drawbacks. We have only scratched the surface of these services, but hopefully you now have a better understanding of how they differ from each other.
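The rules of thumb from this article can be condensed into a few lines of Python. This is a deliberately simplified sketch: the function name and its three yes/no questions are invented here, and real workloads will involve cost, durability, and throughput considerations that it ignores.

```python
def suggest_storage(*, shared_over_internet, tied_to_one_instance, needs_software_or_database):
    """Apply this article's rules of thumb and return "S3", "EBS", or "either"."""
    if needs_software_or_database:
        # Installed applications and databases belong on a volume attached to the instance.
        return "EBS"
    if shared_over_internet:
        # Static files served to many locations are S3 territory.
        return "S3"
    if tied_to_one_instance:
        # A file only one instance needs can simply live on its volume.
        return "EBS"
    return "either"


# Static web pages served to the public.
print(suggest_storage(shared_over_internet=True, tied_to_one_instance=False,
                      needs_software_or_database=False))  # S3

# A developer environment with an IDE and a database.
print(suggest_storage(shared_over_internet=False, tied_to_one_instance=True,
                      needs_software_or_database=True))   # EBS
```

The software and database check comes first because that is the case the article treats as a hard rule, while the other two are softer preferences.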