Use Cases and Benefits
A solution to secure large repositories in the cloud
Confidential Data Access is a set of APIs that provide a fast, secure, application-level encryption scheme for storing and indexing large data repositories in a Zero Trust environment (typically a public cloud).
Current solutions are too basic
Existing solutions usually provide a basic system-level symmetric encryption scheme under a single key. The security model of this scheme is very limited for large repositories: leaking the key exposes the whole repository, causing a large data loss and forcing re-encryption of the full repository.
Since the scheme is symmetric, the same key needs to be replicated on all systems needing either to encrypt or to decrypt data. This replication increases the risks of leaking the key, and demands strong security on both the encrypting and the decrypting systems, which are usually separate systems in large infrastructures, such as Big Data infrastructures.
Improvements to the single symmetric key scheme have been developed on top of end-to-end encryption systems. Each partition of data is encrypted under a different symmetric key, which improves the security model. However, these systems still rely on replicated symmetric keys, and they make key management extremely complex when applications and users must be allowed to encrypt or decrypt certain sets of partitions, in particular when these sets overlap.
None of these schemes provides a way to search data quickly and confidentially across partitions.
There is a better security model with added functionalities
Cosmian Confidential Data Access combines two encryption primitives in simple APIs that significantly reduce the security risks and the key management burden of encrypting and indexing large repositories. A summary of benefits is available right after the description of the two schemes and the use cases.
Encryption using attributes
The first scheme is a public key encryption scheme that embeds attributes inside the ciphertext, providing fine-grained partitioning of the encrypted data.
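Consider two policy axes according to which data is partitioned: Country, and a second axis whose values include Sales. Each axis holds 4 attributes.
Each pair made of one attribute from each axis (for example Sales and France) constitutes one of the 4x4=16 data partitions.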
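As a rough illustration of how two axes produce 16 partitions, the sketch below enumerates every pair of attributes. Only `Country`, `France`, `Germany` and `Sales` come from the text; the axis name `Department` and all other attribute values are placeholders assumed for the example.

```python
from itertools import product

# Hypothetical policy axes. Only "Country", "France", "Germany" and "Sales"
# appear in the text; every other name is an assumed placeholder.
POLICY_AXES = {
    "Department": ["Sales", "Marketing", "HR", "Finance"],
    "Country": ["France", "Germany", "Spain", "Italy"],
}

# One partition per combination of one attribute from each axis.
partitions = list(product(*POLICY_AXES.values()))

print(len(partitions))  # 16
```

Adding a fifth attribute to one axis would grow the number of partitions to 20 without changing anything else in this model.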
With Cosmian's attribute-based encryption scheme, the encryption key is public and can only encrypt: encrypting systems (Spark, data engineering applications, ETLs, etc.) do not have to be secured and can hold the key directly, which relaxes constraints on the infrastructure. The public key can encrypt with any set of attributes from the policy.
Each user has their own unique key, with its own access policy, even though the partitions they can read overlap:
- user 1's access policy lets them decrypt all the France data
- user 2's access policy lets them decrypt all the Sales data
- user 3's access policy lets them decrypt the Sales data from Germany
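To make the overlap between the three keys described above concrete, here is a minimal model of access policies. It performs no cryptography and is not the library's API: the dictionary representation of a policy, the function names, the axis name `Department` and every attribute other than `France`, `Germany` and `Sales` are assumptions made for illustration.

```python
from itertools import product

# Hypothetical axes: only "Country", "France", "Germany" and "Sales"
# come from the text.
AXES = {
    "Department": ["Sales", "Marketing", "HR", "Finance"],
    "Country": ["France", "Germany", "Spain", "Italy"],
}
PARTITIONS = [dict(zip(AXES, combo)) for combo in product(*AXES.values())]

# An access policy is modelled as: axis -> set of allowed attributes.
# An axis missing from the policy places no restriction on that axis.
USER_POLICIES = {
    "user 1": {"Country": {"France"}},
    "user 2": {"Department": {"Sales"}},
    "user 3": {"Department": {"Sales"}, "Country": {"Germany"}},
}


def can_decrypt(policy, partition):
    """True if every axis constrained by the policy matches the partition."""
    return all(partition[axis] in allowed for axis, allowed in policy.items())


def readable_partitions(policy):
    """All partitions a key carrying this policy could decrypt."""
    return [p for p in PARTITIONS if can_decrypt(policy, p)]


for user, policy in USER_POLICIES.items():
    print(user, len(readable_partitions(policy)))
```

In this model users 1 and 2 each reach four partitions and share exactly one of them (Sales in France), while user 3 reaches a single partition that user 2 can also read.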
As an additional security benefit, user keys are truly unique: even if two users have the same access policy, their key fingerprints will be different. This makes it much easier for cyber forensics teams to trace a key leak.
Policy axes can be hierarchical. Suppose three levels on a confidentiality axis, with Medium as the middle level and Top Secret as the highest. This hierarchical axis lets users with a Top Secret attribute in their key access policy decrypt data at every level up to and including Top Secret, whereas users with a Medium attribute can only decrypt data at the Medium level and below.
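The hierarchical rule can be sketched as an ordered list of levels. This is only a model of the access rule, not of the encryption itself: `Medium` and `Top Secret` come from the text, while the name `Low` for the third level and the function name are assumptions.

```python
# Levels ordered from least to most confidential. "Medium" and "Top Secret"
# come from the text; "Low" is an assumed placeholder for the third level.
CONFIDENTIALITY_LEVELS = ["Low", "Medium", "Top Secret"]


def decryptable_levels(key_level):
    """Levels a key can decrypt: its own level and every level below it."""
    rank = CONFIDENTIALITY_LEVELS.index(key_level)
    return CONFIDENTIALITY_LEVELS[: rank + 1]


print(decryptable_levels("Top Secret"))  # ['Low', 'Medium', 'Top Secret']
print(decryptable_levels("Medium"))      # ['Low', 'Medium']
```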
Finally, attributes can be rotated, providing forward secrecy on selected partitions only.
In addition to encrypting the data with attributes, the Cosmian Confidential Data Access libraries offer the ability to create encrypted indexes. These indexes match an encrypted word to an encrypted database UID or to an encrypted file name.
Encrypted indexes have the following characteristics:
- all data in the indexes are encrypted
- queries to the index are encrypted
- answers to queries are encrypted
Since the server never learns anything about the content, the queries or the answers, the index can be safely stored in a zero trust environment, next to the indexed data. Typically, these indexes are stored in fast, scalable key-value stores deployed in the cloud.
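The three characteristics listed above can be illustrated with a deliberately simplified toy index built from the Python standard library. It is not Cosmian's construction and must not be used in production: all function names are hypothetical, UIDs are limited to 32 bytes, and, unlike the scheme described above, this toy lets the server notice when the same word is queried twice and how many entries match.

```python
import hashlib
import hmac
import secrets


def prf(key: bytes, data: bytes) -> bytes:
    """Keyed pseudo-random function (HMAC-SHA256, 32-byte output)."""
    return hmac.new(key, data, hashlib.sha256).digest()


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def new_keys():
    """Client-side secrets; the server never sees them."""
    return {"token": secrets.token_bytes(32), "value": secrets.token_bytes(32)}


def word_token(keys, word: str) -> bytes:
    """Opaque lookup token sent to the server instead of the word."""
    return prf(keys["token"], word.encode())


def entry_pad(keys, word: str, position: int) -> bytes:
    """One-time pad for the entry stored at `position` under `word`."""
    return prf(keys["value"], word_token(keys, word) + position.to_bytes(4, "big"))


def build_index(keys, documents):
    """documents: uid -> list of words. Returns what the server would store."""
    index = {}
    for uid, words in documents.items():
        uid_bytes = uid.encode()
        if len(uid_bytes) > 32:
            raise ValueError("toy scheme handles UIDs of at most 32 bytes")
        for word in set(words):
            entries = index.setdefault(word_token(keys, word), [])
            entries.append(xor(uid_bytes, entry_pad(keys, word, len(entries))))
    return index


def search(keys, index, word: str):
    """Look up the token (server side), then decrypt the answers (client side)."""
    blobs = index.get(word_token(keys, word), [])
    return [xor(blob, entry_pad(keys, word, i)).decode() for i, blob in enumerate(blobs)]


keys = new_keys()
docs = {"uid-1": ["alice", "paris"], "uid-2": ["bob", "paris"]}
index = build_index(keys, docs)
print(search(keys, index, "paris"))  # ['uid-1', 'uid-2']
```

The `index` dictionary holds only HMAC outputs and masked UIDs, so it could live in any key-value store; the keys returned by `new_keys` stay with the client.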
The combination of the two primitives provides a complete solution for building a large data repository which:
- can be entirely stored in a zero trust environment (e.g. the public cloud; attribute-based encryption is agnostic to the storage technology)
- can be quickly and securely searched and extracted
- keeps user access to the data under the control of the access policies embedded in user decryption keys
Typical use cases are the secure storage and secure indexing in the cloud of:
- large transactional databases (e.g. banking transactions)
- large directories (e.g. employee directories)
Summary of benefits
- Better security through partitioning: leaking a decryption key only gives access to the partition(s) this key can decrypt.
- Encryption is performed using a public key, which cannot decrypt and can therefore be safely deployed to all encrypting systems: encrypting systems do not need to be secured.
- The crypto-system allows issuing user decryption keys for overlapping sets of partitions over multiple axes, enabling sophisticated, fine-grained user access policies.
- User decryption keys can be issued at any time after data is encrypted, for any given set of partitions. This facilitates user key management and does not require exhaustively listing all possible usages before partitioning (a typical data science use case: it is hard to anticipate what analysis will be conducted).
- User decryption keys are truly unique, even if they have the same access policy, which makes potential security breaches easier to trace.
- Policy attributes can be rotated, providing forward secrecy for designated partitions without re-encrypting the entire database.
- With secure indexing added, data can be confidentially and quickly searched and retrieved. The environment learns nothing about either the queries or the content.
- The secure indexes can be safely stored in a zero trust environment, next to the encrypted data.