Like many technologies these days, storage has come a long way. We miss you, floppy disks! Look no further than these four interesting and important ways cloud storage technologies have evolved in the past decade.
The trends in storage we’ll highlight are interrelated. They are all consequences, direct or indirect, of Moore’s Law: Gordon Moore’s observation that the count of transistors on a chip doubled every couple of years, on and on for decades. The exponential growth affected processor power, but also storage density.
Data grows exponentially
“Between the dawn of civilization and 2003, we only created five exabytes; now we’re creating that amount every two days. By 2020, that figure is predicted to sit at 53 zettabytes (53 trillion gigabytes) – an increase of 50 times.” — Hal Varian, Chief Economist at Google.
The volume of digital data in the world is growing exponentially. Nowadays, most human activities, and nearly all business activities, are mediated by — or at least accompanied by — computers. In the process, individuals and organizations are generating large volumes of data, and we’ve grown more and more dependent on that data.
From smartphones, to point-of-sale systems, to security cameras, to factory robots, and beyond, there are billions of devices generating and capturing data, which must be transmitted and stored. This need is in a feedback loop with the other trends in storage technology below.
Storage gets denser, cheaper, and more invisible
For many years, consumer storage devices, like the spinning-disk and solid-state storage that an ordinary person might buy, have mushroomed in capacity while becoming far cheaper.
The effects of Moore’s Law can be seen in the long-term cost trend for memory. A megabyte cost about $5.2 million in 1960, but by 1980 it was $6500, then $78 in 1990, $1 in 2000, 16 cents in 2005, 2 cents in 2010, and 0.5 cents by 2015 and today. And so thumb drives and external USB drives purchased several years ago now look pathetic in their capacity, as well as absurdly expensive.
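To make the trend concrete, here is a minimal Python sketch that turns the prices quoted above into a "halving time": how many years it took, on average, for the cost of a megabyte to drop by half. It assumes a steady exponential decline between each pair of data points, which is a simplification, and the function name is ours, chosen for illustration.

```python
import math

# Approximate cost of one megabyte in US dollars, as quoted in the text.
COST_PER_MB = {
    1960: 5_200_000.0,
    1980: 6_500.0,
    1990: 78.0,
    2000: 1.0,
    2005: 0.16,
    2010: 0.02,
    2015: 0.005,
}


def halving_time_years(year_a, cost_a, year_b, cost_b):
    """Years per price halving, assuming a steady exponential decline."""
    halvings = math.log2(cost_a / cost_b)  # how many times the price halved
    return (year_b - year_a) / halvings


# Halving time for each consecutive pair of data points.
years = sorted(COST_PER_MB)
for start, end in zip(years, years[1:]):
    t = halving_time_years(start, COST_PER_MB[start], end, COST_PER_MB[end])
    print(f"{start}-{end}: price halved roughly every {t:.1f} years")

# Halving time over the whole span.
overall = halving_time_years(1960, COST_PER_MB[1960], 2015, COST_PER_MB[2015])
print(f"1960-2015 overall: price halved roughly every {overall:.1f} years")
```

Over the full 1960 to 2015 span, these figures imply a halving time of a little under two years, which lines up with the "every couple of years" rhythm of Moore’s Law.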
While consumer storage products have been growing explosively, there’s also been prodigious growth in the storage we don’t see: the storage all around us, tucked away inside smartphones, game consoles, cars, appliances, building systems, and industrial systems, as well as in distant data centers.
Storage moves further away
You likely carry at least a few dozen gigabytes around in your pocket (for watching cat videos), but the trend has been for the largest pools of storage to move further and further away from processors, and from users.
Storage devices were originally attached directly to a server or a cluster of servers: Direct Attached Storage (DAS). In data centers, that arrangement has been supplemented by storage accessible over the LAN. Networked storage devices can be shared and used more efficiently, their capacity can be scaled, and their data can be more easily shared among servers.
Network Attached Storage (NAS) is a specially designed file server on an existing LAN. NAS devices export a filesystem view (folders and files) to their clients. Storage Area Networks (SANs) typically use dedicated networking hardware, often Fibre Channel, to provide a shared path among servers and storage devices. SAN devices usually export a block view (like disk blocks) to their clients. More recent SAN variations can also run over ordinary networking, such as high-speed Ethernet (Fibre Channel over Ethernet, FCoE) or IP (SAN over IP, for example iSCSI).
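The difference between the two views is easier to see side by side. The Python sketch below is a toy, in-memory illustration only: the class and method names are hypothetical, and real SAN and NAS systems speak network protocols rather than exposing Python objects. What it shows is the shape of each interface: numbered fixed-size blocks versus named files in folders.

```python
BLOCK_SIZE = 512  # bytes per block; an illustrative value


class BlockDevice:
    """SAN-style view: a numbered array of fixed-size blocks, no files."""

    def __init__(self, num_blocks):
        self._blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        # Pad with zero bytes so every block has the same size.
        self._blocks[index] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, index):
        return self._blocks[index]


class FileShare:
    """NAS-style view: clients name data by path; the server hides layout."""

    def __init__(self):
        self._files = {}

    def write_file(self, path, data):
        self._files[path] = data

    def read_file(self, path):
        return self._files[path]

    def list_dir(self, folder):
        # Returns every stored path under the folder, in sorted order.
        prefix = folder.rstrip("/") + "/"
        return sorted(p for p in self._files if p.startswith(prefix))


# Block view: the client addresses storage by block number.
san_volume = BlockDevice(num_blocks=8)
san_volume.write_block(3, b"raw bytes")
block = san_volume.read_block(3)

# File view: the client addresses storage by folder and file name.
nas_share = FileShare()
nas_share.write_file("/reports/q1.txt", b"quarterly numbers")
listing = nas_share.list_dir("/reports")
```

With a block view, the client (usually a server’s own filesystem or database) decides what each block means. With a file view, that bookkeeping lives on the storage device itself.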
Storage in the cloud
As the largest pools of storage move further away, transitioning to the cloud is the next step.
Cloud-based storage provides enormous volumes of storage managed and monitored by a cloud provider that guarantees a degree of availability and durability hard to achieve on your own. Sharing storage with other customers is generally a win in both cost and agility.
The cloud is founded on virtualization. Cloud storage is implemented on physical devices in multiple locations, but those devices are aggregated into pools that are invisibly carved up among customers and that hide the particulars of the underlying hardware. The cloud provider gives you a simplified management interface and an automated process to fulfill requests, so you can rapidly create and provision virtual resources from the shared pools, paying as you go.
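As a rough illustration of pooling, the Python sketch below aggregates a few physical devices and carves virtual volumes out of them. Everything here is hypothetical (the class, the device names, the first-fit placement); real providers use far more sophisticated placement, replication, and billing. The point is only that a customer asks for a size and never learns which devices hold the data.

```python
class StoragePool:
    """Aggregates physical devices and hands out virtual volumes."""

    def __init__(self, device_capacities_gb):
        # Free space left on each (hypothetical) physical device, in GB.
        self._free = dict(device_capacities_gb)
        # Volume name -> list of (device, GB taken from that device).
        self._volumes = {}

    def free_gb(self):
        return sum(self._free.values())

    def provision(self, name, size_gb):
        """Create a virtual volume, spreading it across devices as needed."""
        if name in self._volumes:
            raise ValueError(f"volume {name!r} already exists")
        if size_gb > self.free_gb():
            raise ValueError("not enough free space in the pool")
        extents = []
        remaining = size_gb
        for device in self._free:
            if remaining == 0:
                break
            take = min(self._free[device], remaining)
            if take:
                self._free[device] -= take
                extents.append((device, take))
                remaining -= take
        self._volumes[name] = extents
        return size_gb  # the customer sees only a volume of this size

    def release(self, name):
        """Delete a volume and return its space to the pool."""
        for device, taken in self._volumes.pop(name):
            self._free[device] += taken


pool = StoragePool({"disk-a": 100, "disk-b": 100, "disk-c": 100})
pool.provision("customer-1", 150)  # spans disk-a and part of disk-b
pool.provision("customer-2", 120)  # takes the rest of disk-b, part of disk-c
print(pool.free_gb())
pool.release("customer-1")
print(pool.free_gb())
```

Releasing a volume returns its capacity to the pool, which is what makes pay-as-you-go provisioning practical for the provider.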
Object-based storage is a storage technology particularly suited to the immense distributed volume of data in the cloud. Rather than appearing to be disk blocks (like SAN) or a filesystem of folders and files (like NAS), object storage exposes binary objects: blobs that contain a file’s contents, its metadata (such as owner and date), and other attributes. The objects are stored in a flat list, indexed by object IDs (OIDs), which in some systems are derived from the file’s contents and other attributes. The flat address space of OIDs works well for scalability and geographic distribution.
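A flat, ID-indexed object store can be sketched in a few lines of Python. This is a toy, in-memory model, not any vendor’s API: the class and method names are hypothetical, and hashing the contents plus metadata with SHA-256 is just one assumed way to derive an OID. Many real object stores let the client choose the key instead.

```python
import hashlib
import json


class ObjectStore:
    """Toy object store: a flat mapping from OID to (contents, metadata)."""

    def __init__(self):
        self._objects = {}  # no folders, just one flat namespace

    def put(self, data, metadata):
        # Derive the OID from the contents and metadata (assumed scheme).
        digest = hashlib.sha256()
        digest.update(data)
        digest.update(json.dumps(metadata, sort_keys=True).encode("utf-8"))
        oid = digest.hexdigest()
        self._objects[oid] = (data, dict(metadata))
        return oid

    def get(self, oid):
        return self._objects[oid]


store = ObjectStore()
oid = store.put(b"hello, cloud", {"owner": "alice", "date": "2018-01-01"})
contents, meta = store.get(oid)
print(oid[:12], meta["owner"], len(contents))
```

Because an OID carries no hint of location or folder structure, objects can be spread across many machines and sites using nothing more than the ID itself, which is why the flat address space scales so well.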
Amazon’s cloud offering, Amazon Web Services (AWS), offers a variety of storage options, and we can see the storage technologies above reflected there. Elastic Block Store (EBS) is block-oriented, like SAN. Elastic File System (EFS) is filesystem-oriented, like NAS. And Simple Storage Service (S3) is an object-based storage system.
Evolution of storage
Storage in the cloud will continue to be an occasion for innovation. Big data science and artificial intelligence both need massively parallel processing of huge volumes of data, and cloud services will need to grow to support that rapid change.
Meanwhile, with gigabytes in our pockets and the cloud accessible, we won’t miss our floppy disks… much.