dataSentinel. Simple. Safe. Secure.

It's all about thinking outside of the box.

Our fresh-thinking approach takes a step back to examine the shortcomings of conventional storage technology.

 
   


 
 

What's the problem with the way it's done now? If it ain't broke....

Read below for the technical details. If you would rather skip the techspeak, a more user-friendly presentation is also available. And be sure to check out our blog!

Centralized Control
In the early days of computing there were few computers, and they were expensive. It only made sense to share them across many users. The time-sharing systems of the 1970s made every user think they had the whole computer to themselves, and all the data for everyone was in one place. Perfect. It was easy to manage.

Enter the 2000s. Now everyone has at least one computer, usually more, each with its own independent storage. No one is interested in performing a half-dozen independent backup operations, so a user's data must be organized somehow for storage. That's why we have data centers for online storage, which are morphing into the 'Cloud'.

Now let's think about how a data center is organized. At the bottom are the hard drives: physical devices that hold blocks of data. Variable-length 'Files' exist thanks to the disk driver abstraction that tracks the blocks of each file and provides a directory hierarchy. So far so good. The operating system of the computer works with the disk drivers to present a set of independent drives. The computers are organized in clusters. The clusters are organized by the site's Storage Virtualization software, which tracks all of the clusters. To get a file, the client navigates down the chain: it authenticates at the site, then at the cluster, then on the computer, and checks file permissions at each directory all the way down to the file. That's manageable.

Failure Modes
Now look at how this can be broken. A client with forged authentication is able to work its way all the way down the chain. Typically each user is assigned rights to some top-level directory, so once the rogue client gets to that point it's possible to obtain a directory listing and see all files at that point and below. If encryption is employed, it's likely that only one encryption attack is necessary, as all the files share the same key.

Next, let's think about storage balancing. Disk drivers can only manage the storage of a single disk. In the old days, the user was assigned a quota. If exceeded, the user would probably need to be moved to a bigger volume. Modern Storage Virtualization systems take care of this by remapping storage on-the-fly. To do so, it's necessary to maintain a centralized storage map for the entire facility -- a very fragile arrangement.

Let's start from scratch.
OK. Let's challenge these assumptions. Let's think of all storage hardware as addressable fixed-size block storage devices. Where should we do the file-to-block mapping? If it's done on the storage device, we're no further ahead. If it's done by the Storage Virtualization software, then it's still a potential major point of failure. Keep going. Let's do the mapping on the client.

Wait, you think, the mapping has to be done on the server side. How else can you know what blocks are available? That's what we do differently. Let's give each low-level storage device a large addressable range of block identification numbers. A 1TByte device might have 64-bit block ids; a larger device might need 96 bits. The 1TByte device would be able to store approximately 100 million 8KByte blocks, which would be sparsely mapped to 2^64, or about 18 billion billion, block ids. We put some firmware on the device that allows it to accept requests to store a block against a 64-bit block id and return that block when queried with the same block id.
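To make the idea concrete, here is a minimal Python sketch of what such device firmware might look like. It is not our actual firmware: the `BlockDevice` class, its `store` and `fetch` methods, and the in-memory dictionary are all hypothetical stand-ins, chosen only to show how a sparse 64-bit id space can sit on top of a much smaller physical capacity.

```python
BLOCK_SIZE = 8 * 1024          # 8 KByte blocks, as in the text
ID_BITS = 64                   # width of the block id space


class BlockDevice:
    """Hypothetical stand-in for the device firmware: a sparse id -> block map."""

    def __init__(self, capacity_bytes):
        self.max_blocks = capacity_bytes // BLOCK_SIZE
        self.blocks = {}       # only ids actually in use take up space

    def store(self, block_id, data):
        """Store one block against a block id; return False when the device is full."""
        if not 0 <= block_id < 2 ** ID_BITS:
            raise ValueError("block id outside the addressable range")
        if len(data) > BLOCK_SIZE:
            raise ValueError("block larger than BLOCK_SIZE")
        if block_id not in self.blocks and len(self.blocks) >= self.max_blocks:
            return False
        self.blocks[block_id] = data
        return True

    def fetch(self, block_id):
        """Return the block stored under block_id, or None if there is none."""
        return self.blocks.get(block_id)


device = BlockDevice(capacity_bytes=10 ** 12)       # a 1 TByte device
device.store(0xDEADBEEFCAFEF00D, b"opaque encrypted bytes")
print(device.max_blocks)                             # physical blocks available
print(device.max_blocks / 2 ** ID_BITS)              # fraction of the id space that can ever be used
```

Note that the device knows nothing about files, users or directories. It only ever sees an id and an opaque block.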

Next we use a client that takes a unique set of encryption codes (we call it the Personal Encryption Code or PEC) and combines it with a pathname for a file to generate a schedule for the storage of blocks across a known set of storage devices (which are identified by their own unique server id). This schedule is used by the client to directly interrogate each storage device, asking it to store an encrypted block of that file mapped to the specific block id from that row of the schedule. The block of data is pure binary, and there are no pointers to associate the block with the file to which it belongs. After the storage is complete, the schedule is discarded. It is rebuilt from the same inputs some time in the future when the file is read. This time, each of the storage devices identified by the schedule is asked to send back the block described by the block id, and the blocks are decrypted and resequenced to re-form the original file.
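The write-then-rebuild cycle can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our production algorithm: HMAC-SHA256 is swapped in as the schedule generator, the function names (`build_schedule`, `write_file`, `read_file`) are hypothetical, the devices are plain dictionaries, the block count is assumed to be known to the reader, and the encryption step is left out to keep the sketch short.

```python
import hashlib
import hmac

BLOCK_SIZE = 8 * 1024


def build_schedule(pec, pathname, n_blocks, server_ids):
    """Derive one (server id, block id) row per file block from PEC + pathname."""
    schedule = []
    for index in range(n_blocks):
        message = pathname.encode("utf-8") + b"\x00" + index.to_bytes(8, "big")
        digest = hmac.new(pec, message, hashlib.sha256).digest()
        server = server_ids[int.from_bytes(digest[:8], "big") % len(server_ids)]
        block_id = int.from_bytes(digest[8:16], "big")    # 64-bit block id
        schedule.append((server, block_id))
    return schedule


def write_file(pec, pathname, data, devices):
    """Split data into blocks and store each one where the schedule says."""
    chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    schedule = build_schedule(pec, pathname, len(chunks), sorted(devices))
    for (server, block_id), chunk in zip(schedule, chunks):
        devices[server][block_id] = chunk     # assumption: encryption omitted here
    return len(chunks)                         # the schedule itself is not kept


def read_file(pec, pathname, n_blocks, devices):
    """Rebuild the schedule from the same inputs and reassemble the file."""
    schedule = build_schedule(pec, pathname, n_blocks, sorted(devices))
    return b"".join(devices[server][block_id] for server, block_id in schedule)


devices = {f"server-{n}": {} for n in range(5)}
pec = b"example personal encryption code"
payload = bytes(range(256)) * 100
count = write_file(pec, "/docs/report.txt", payload, devices)
print(read_file(pec, "/docs/report.txt", count, devices) == payload)
```

Nothing stored on a device links a block back to its file. Only a client holding the same PEC and pathname can regenerate the rows.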

We use hardened FIPS-140 encryption technology to ensure that the schedules are unique despite the permutation of the PECs and file names. Those of you who are familiar with the 'Birthday Paradox' know that there is a significant statistical probability that the same 64-bit id may be generated for two independent schedules, but we have a clever solution to that problem. We allow many different clients to share the collective storage device pool using independently generated PECs for each user.
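For readers who want to put a number on the Birthday Paradox, the standard approximation is easy to evaluate. The Python sketch below assumes ids are drawn uniformly at random from the id space. The function name `collision_probability` is ours for illustration, and the sketch says nothing about how collisions are actually resolved.

```python
import math


def collision_probability(n_ids, id_bits=64):
    """Approximate chance that at least two of n_ids random ids coincide."""
    space = 2 ** id_bits
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * space))
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


for n in (10 ** 6, 10 ** 8, 2 ** 32):
    print(f"{n:>13,} ids -> p = {collision_probability(n):.3g}")
```

The probability stays small while a device holds millions of blocks, but it climbs quickly as the count approaches 2^32, which is why the problem cannot simply be ignored.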

Why is this better?
Let's look at the advantages of this solution. Each file creates its own storage schedule that places blocks in at least a 64-bit number space. Each file has its own encryption key to encode the blocks, and each block is essentially anonymous. That means that an attacker is going to have to make, on average, 2^63 guesses to find each block, against hundreds of storage devices, to crack a file. And each file must be independently attacked, as the schedules are different. That's many orders of magnitude more security than anything out there. The server side does not require a high degree of physical security -- hacking a storage device is not enough to break a file.
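The 2^63 figure is simply half of a 64-bit id space, the expected number of tries when guessing a uniformly random id. A quick Python calculation shows what that means in time. The guess rate of one billion per second is purely an assumption for illustration, and the result is for a single block on a single device.

```python
ID_BITS = 64
GUESSES_PER_SECOND = 10 ** 9             # assumed attacker speed, for illustration only
SECONDS_PER_YEAR = 365.25 * 24 * 3600

average_guesses = 2 ** ID_BITS // 2      # expected tries to hit one random id
years_per_block = average_guesses / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"average guesses per block: 2^{average_guesses.bit_length() - 1}")
print(f"years per block at the assumed rate: {years_per_block:.0f}")
```

Multiply that by the number of blocks in a file and the number of devices to be searched to get a feel for the cost of attacking one file.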

Now consider load balancing. The storage schedules are calculated to ensure equal probabilities of block ids and storage server ids. Each storage device will continue to accept requests to store blocks until it is filled. This means that file storage is balanced evenly across all available resources -- without the need to redistribute storage.
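A small simulation shows why equal probabilities lead to even loading. The Python sketch below uses SHA-256 as a stand-in for the schedule generator and a hypothetical `server_for` helper. It places 100,000 blocks on 10 servers and reports how far apart the fullest and emptiest servers end up.

```python
import hashlib
from collections import Counter

N_SERVERS = 10
N_BLOCKS = 100_000


def server_for(pec, pathname, index, n_servers):
    """Pick a server for one block; every server is equally likely."""
    seed = pec + pathname.encode("utf-8") + index.to_bytes(8, "big")
    digest = hashlib.sha256(seed).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


counts = Counter(
    server_for(b"example-pec", "/video/holiday.mp4", i, N_SERVERS)
    for i in range(N_BLOCKS)
)
spread = (max(counts.values()) - min(counts.values())) / (N_BLOCKS / N_SERVERS)

print(sorted(counts.items()))
print(f"gap between fullest and emptiest server: {spread:.2%} of the mean")
```

The balance is statistical rather than exact: the gap shrinks relative to the mean as more blocks are stored.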

We store each block three times on three different devices. If a storage device fails, there are two other devices identified in the schedule that can serve up that block instead. We have implemented a rebalancing mechanism that efficiently allows the failed server to be replaced and populated with the blocks that were lost, so that three redundant copies of each block are maintained. From a hardware point of view, this costs no more than a conventional tiered solution, as we do not need a disk-to-disk or tape layer to maintain data. Removing the need for a backup process saves on HR costs.
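Here is a sketch of how a client might pick three distinct devices for a block and fall back when one of them is down. It uses rendezvous (highest-random-weight) hashing as a stand-in for our schedule generator, which is an assumption made for illustration. The `replica_servers` and `fetch_block` helpers are likewise hypothetical, and a failed device is modelled simply as a missing dictionary entry.

```python
import hashlib

REPLICAS = 3


def replica_servers(pec, pathname, index, server_ids):
    """Pick REPLICAS distinct servers for one block, in a repeatable order."""
    def rank(server):
        seed = (pec + pathname.encode("utf-8") + index.to_bytes(8, "big")
                + server.encode("utf-8"))
        return hashlib.sha256(seed).digest()
    return sorted(server_ids, key=rank)[:REPLICAS]


def fetch_block(block_id, servers, devices):
    """Try each replica in turn until one of them returns the block."""
    for server in servers:
        device = devices.get(server)
        if device is not None and block_id in device:
            return device[block_id]
    raise LookupError("all replicas unavailable")


devices = {f"server-{n}": {} for n in range(6)}
chosen = replica_servers(b"example-pec", "/docs/report.txt", 0, sorted(devices))
for server in chosen:
    devices[server][42] = b"block bytes"

del devices[chosen[0]]                      # simulate a failed device
print(fetch_block(42, chosen, devices))     # still served by a surviving replica
```

Because the same inputs always yield the same three servers, a replacement device can be repopulated by copying the block from either survivor.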

Each user can possess many PECs defining many independent storage drives. We store their collection of PECs in a Credential Registry using the same file storage technology. Key management is therefore just a matter of maintaining the credentials of this registry for each user. We do this through a series of Challenge Questions which the user must answer to gain access to their Credential Registry, but we do this without storing those answers. The user must hang on to those answers. We simply do not have them to release should we be approached by the courts.
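One way to unlock a registry from answers that are never stored is to derive a key from them each time. The Python sketch below assumes PBKDF2-HMAC-SHA256 with a per-user salt. The `registry_key` function, the normalization rule and the iteration count are all illustrative assumptions, not a description of our actual scheme.

```python
import hashlib
import os


def registry_key(answers, salt, iterations=200_000):
    """Derive the registry key from the user's answers; the answers are never saved."""
    # Assumption: answers are trimmed and lower-cased so small typing
    # differences do not lock the user out.
    normalized = "\x1f".join(a.strip().lower() for a in answers).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", normalized, salt, iterations)


salt = os.urandom(16)     # stored with the account; it is not a secret
key = registry_key(["Rex", "Springfield", "blue"], salt)
same = registry_key([" rex ", "SPRINGFIELD", "Blue"], salt)
wrong = registry_key(["Max", "Springfield", "blue"], salt)

print(key == same)        # matching answers rebuild the same key
print(key == wrong)       # a wrong answer yields a useless key
```

Only the salt needs to be kept on the service side. Without the answers there is nothing from which the key can be recovered.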

The storage devices do not need to be owned by one business entity. They can be operated collectively as long as Service Level Agreements are in place to protect the end user from prolonged outages of a large number of devices. An obvious source of these resources is the Cloud Computing providers. Esotera is essentially the interface between the raw and unsecured storage of the Cloud and the high-level storage service offered to the end user.