Centralized Control
In the early days of computing there were few computers, and they were expensive. It only made sense to share them
across many users. The time-sharing systems of the 1970s made every user think they had the whole computer to themselves
and all the data for everyone was in one place. Perfect. It was easy to manage.
Enter the 2000s. Now everyone has at least one computer, usually more, each with its own independent storage. Now it's obvious that
no one is interested in performing a half dozen independent backup operations, so a user's data must be organized somehow for storage.
That's why we have data centers for on-line storage which are morphing into the 'Cloud'.
Now let's think about how a data center is organized. At the bottom are the hard drives: physical devices that hold blocks of data.
Variable-length 'Files' exist thanks to the disk driver abstraction that tracks the blocks of each file and provides a directory
hierarchy. So far so good. The operating system of the computer works with the disk drivers to present a set of independent drives.
The computers are organized in clusters. The clusters are organized by the site's Storage Virtualization software which tracks all of the
clusters. To get a file, the client navigates down the chain, authenticates at the site, then the cluster, then the computer, and
checks file permissions at the various directories all the way down to the file. That's manageable.
Failure Modes
Now look at how this can be broken. A client with forged authentication is able to work its way all of the way down the chain.
Typically each user is assigned rights to some top-level directory, so once the rogue client gets to that point it's possible to
obtain a directory listing and see all files at that point and below. If encryption is employed, it's likely only one encryption
attack is necessary as all the files share the same key.
Next, let's think about storage balancing. Disk drivers can only manage the storage of a single disk. In the old days, the
user was assigned a quota. If exceeded, the user would probably need to be moved to a bigger volume. Modern Storage Virtualization
systems take care of this by remapping storage on-the-fly. To do so, it's necessary to maintain a centralized storage map for the
entire facility -- a very fragile arrangement.
Let's start from scratch.
OK. Let's challenge these assumptions. Let's think of all storage hardware as addressable fixed-size block storage devices.
Where should we do the file to block mapping? If it's done on the storage device, we're no further ahead. If it's done by the
Storage Virtualization software, then it's still a potential major point of failure. Keep going. Let's do the mapping on the client.
Wait, you think, the mapping has to be done on the server side. How else can you know what blocks are available?
That's what we do differently. Let's give each low-level storage device a large addressable range of block identification numbers.
A 1TByte device might use 64-bit block ids; a larger device might need 96 bits. The 1TByte device would be able to store approximately 100 million
8KByte blocks, which would be sparsely mapped onto 2^64, or about 18 billion billion, possible block ids. We put some firmware on the device that
allows it to accept requests to store a block against a 64 bit block id and return that block when queried with the same block id.
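To make the device's job concrete, here is a minimal sketch of that firmware interface in Python. The class name BlockDevice, its method names, and the choice to overwrite on a repeated id are illustrative assumptions, not the actual firmware; the only parts taken from the text are the 8KByte block size, the 64-bit id, and the store/return-by-id behaviour.

```python
class BlockDevice:
    """Hypothetical model of a storage device: a sparse map from block ids to blocks."""

    BLOCK_SIZE = 8 * 1024   # 8KByte blocks, as in the text
    ID_BITS = 64            # width of the block id space

    def __init__(self, capacity_bytes):
        # Physical capacity limits how many blocks fit, not which ids are legal.
        self.max_blocks = capacity_bytes // self.BLOCK_SIZE
        self._blocks = {}

    def store(self, block_id, data):
        """Store one block under block_id; return False if the device is full."""
        if not 0 <= block_id < 2 ** self.ID_BITS:
            raise ValueError("block id outside the device's id range")
        if len(data) != self.BLOCK_SIZE:
            raise ValueError("blocks must be exactly BLOCK_SIZE bytes")
        if block_id not in self._blocks and len(self._blocks) >= self.max_blocks:
            return False
        # Assumption: a repeated id simply overwrites; collision handling is out of scope here.
        self._blocks[block_id] = data
        return True

    def fetch(self, block_id):
        """Return the block stored under block_id, or None if nothing is there."""
        return self._blocks.get(block_id)
```

Note that the device knows nothing about files, users or directories. With a decimal terabyte, BlockDevice(10**12).max_blocks comes out a little over 120 million, in line with the rough figure above.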
Next we use a client that takes a unique set of encryption codes (we call it the Personal Encryption Code or PEC) and combines
it with a pathname for a file to generate a schedule for the storage of blocks across a known set of storage devices (which are
identified by their own unique server id). This schedule is used by the client to directly interrogate each storage device to ask it to
store an encrypted block of that file mapped to the specific block id from that row of the schedule. The block of data is pure binary,
and there are no pointers to associate the block with the file to which it belongs. After the storage is complete, the schedule is
discarded. It is rebuilt from the same inputs some time in the future when the file is read. This time, each of the storage devices
identified by the schedule is asked to send back the block described by the block id, and the blocks are decrypted and resequenced
to reform the original file.
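The key property is that the schedule is a pure function of the PEC, the pathname and the device list, so it never has to be stored. The sketch below shows one way such a function could look. It is an assumption, not the actual algorithm: it uses HMAC-SHA256 keyed by the PEC, and the function name build_schedule and the way the digest is split are made up for illustration. Block encryption is left out.

```python
import hashlib
import hmac


def build_schedule(pec, pathname, num_blocks, server_ids):
    """Return one (server_id, block_id) row per block of the file.

    Hypothetical derivation: each row comes from HMAC-SHA256(pec, pathname + index).
    """
    schedule = []
    for index in range(num_blocks):
        message = f"{pathname}\x00{index}".encode("utf-8")
        digest = hmac.new(pec, message, hashlib.sha256).digest()
        # First 8 bytes pick the device (modulo bias is negligible for small device counts).
        server = server_ids[int.from_bytes(digest[:8], "big") % len(server_ids)]
        # Next 8 bytes become the 64-bit block id on that device.
        block_id = int.from_bytes(digest[8:16], "big")
        schedule.append((server, block_id))
    return schedule
```

Calling build_schedule twice with the same PEC, pathname and device list gives the same rows, which is what lets the client throw the schedule away after writing and rebuild it when reading.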
We use hardened FIPS-140 encryption technology to ensure that the schedules are unique despite the permutation of the PECs and file
names. Those of you who are familiar with the 'Birthday Paradox' know that there is a significant statistical probability that the
same 64 bit id may be generated for two independent schedules, but we have a clever solution to that problem. We allow many different
clients to share the collective storage device pool using independently generated PECs for each user.
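To get a feel for the size of the Birthday Paradox problem, the standard approximation can be worked out in a few lines. This is only the textbook estimate for ids drawn uniformly at random into a single id space; it says nothing about how collisions are actually resolved, and the function name is made up for illustration.

```python
import math


def collision_probability(num_ids, id_bits=64):
    """Approximate chance that at least two of num_ids random ids collide.

    Uses the usual birthday bound: p ~ 1 - exp(-n(n-1) / (2N)), with N = 2**id_bits.
    """
    space = 2 ** id_bits
    exponent = num_ids * (num_ids - 1) / (2 * space)
    return -math.expm1(-exponent)   # 1 - exp(-exponent), accurate for tiny values
```

With 64-bit ids the chance of some collision is still well under one in a thousand at 100 million ids, but it reaches roughly 39% by the time 2^32 ids share one id space.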
Why is this better?
Let's look at the advantages of this solution. Each file creates its own storage schedule that places blocks in at least a 64 bit
number space. Each file has its own encryption key to encode the blocks, and each block is essentially anonymous. That means that an
attacker is going to have to make, on average, 2^63 guesses to find each block against hundreds of storage devices to crack a
file. And yet each file must be independently attacked as the schedules are different. That's many orders of magnitude more security than
anything out there. The server side does not require a high degree of physical security -- hacking a storage device is not enough to break a file.
Now consider load balancing. The storage schedules are calculated to ensure equal probabilities of block ids and storage server ids.
Each storage device will continue to accept requests to store blocks until it is filled. This means that file storage is balanced perfectly
across all available resources -- without the need to redistribute storage.
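A quick way to see why no central storage map is needed is to count where blocks land when the device is picked from a keyed hash. The sketch below assumes, purely for illustration, that devices are chosen from an HMAC-SHA256 of the PEC, pathname and block index; the function name is hypothetical.

```python
import hashlib
import hmac
from collections import Counter


def blocks_per_server(pec, pathname, num_blocks, server_ids):
    """Count how many blocks of one file each device would receive."""
    counts = Counter({server: 0 for server in server_ids})
    for index in range(num_blocks):
        message = f"{pathname}\x00{index}".encode("utf-8")
        digest = hmac.new(pec, message, hashlib.sha256).digest()
        # Assumed selection rule: the first 8 digest bytes, reduced modulo the device count.
        chosen = server_ids[int.from_bytes(digest[:8], "big") % len(server_ids)]
        counts[chosen] += 1
    return counts
```

For a large file spread over ten devices, the counts come out within a few percent of each other, with no device having to know what the others hold.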
We store each block three times on three different devices. If a storage device fails, there are two other devices identified in the schedule
that can serve up that block instead. We have implemented a rebalancing mechanism that efficiently allows the failed server to be replaced
and populated with the blocks that were lost, ensuring that three redundant copies of each block remain. From a hardware point of view, this
costs no more than a conventional tiered solution, as we do not need a disk-to-disk or tape layer to maintain data. Removing the need for
a backup process saves on HR costs.
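On the read side, triple storage means the client just walks the three schedule entries for a block until one device answers. Here is a minimal sketch, with devices modelled as plain dictionaries and a failed device modelled as None; the function name and data shapes are assumptions for illustration only.

```python
def read_block(replicas, devices):
    """Return a block from the first replica that can serve it.

    replicas: list of (server_id, block_id) pairs for the same block.
    devices:  dict mapping server_id to a dict of block_id -> bytes,
              or to None when that device has failed (hypothetical model).
    """
    for server_id, block_id in replicas:
        device = devices.get(server_id)
        if device is None:
            continue            # device is down or unknown: try the next copy
        data = device.get(block_id)
        if data is not None:
            return data
    raise LookupError("block unavailable on all replicas")
```

As long as at least one of the three devices is healthy, the read succeeds without any central coordinator being consulted.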
Each user can possess many PECs defining many independent storage drives. We store their collection of PECs in a Credential Registry using
the same file storage technology. Key management is therefore just a matter of maintaining the credentials of this registry for each user.
We do this through a series of Challenge Questions which the user must answer to gain access to their Credential Registry, but we do this
without storing those answers. The user must hang on to those answers. We simply do not have them to release should we be approached by the courts.
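One common way to use answers without keeping them is to turn them into a key with a slow key-derivation function and store only a random salt. The sketch below uses PBKDF2 from Python's standard library. It is an assumption about how this could be done, not a description of the actual scheme; the function name, the normalisation rule and the iteration count are all illustrative.

```python
import hashlib


def derive_registry_key(answers, salt, iterations=200_000):
    """Derive a 32-byte key from the user's challenge answers.

    Only the salt needs to be stored; the answers themselves are never kept.
    """
    # Normalise so that stray spaces or capitalisation do not lock the user out.
    normalized = "\x1f".join(answer.strip().lower() for answer in answers)
    return hashlib.pbkdf2_hmac("sha256", normalized.encode("utf-8"), salt, iterations)
```

The same answers always give back the same key, and any wrong answer gives a completely different one, so there is nothing on the server side to hand over other than the salt.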
The storage devices do not need to be owned by one business entity. They can be operated collectively as long as Service Level Agreements
are in place to protect the end user from prolonged outages of a large number of devices. An obvious source of these resources is the Cloud
Computing providers. Esotera is essentially the interface between the raw and unsecured storage of the Cloud and the high-level storage service
offered to the end user.