Skip to content

Search encrypted data

Building an index is required to search an encrypted database without having to decrypt it entirely. However, this index can also contain sensitive data and thus need to be encrypted before it can be outsourced to a Zero-Trust such as a public cloud.

Cosmian Findex is a Searchable Encryption scheme that allows building and securely retrieving information from an encrypted index.

sse
Figure 1: sse

A detailed functional description of Findex is available in its functional documentation.

Implementation

The Findex library provides a Rust implementation of the Findex algorithm. This implementation is storage-technology agnostic. This means that in order for Findex to use a given storage (e.g. a database), all it needs is a Rust implementation of the Findex DB interface on top of this storage.

The Cloudproof Rust library provides:

  • implementations of the Findex DB interface for some popular databases;
  • a wrapper that allows easily instantiating Findex using a selected DB interface;
  • bindings to other languages (FFI/WASM/Python).

The other Cloudproof libraries build easy-to-use abstractions on top of these bindings to allow users to use the Findex API and to implement a custom Findex DB interface from languages such as JavaScript, Python or Java. See the section on DB interface implementation to know more about how to code, select and instantiate Findex with a given DB interface.

All those libraries are open-source and contain extensive tests. It is highly recommended to start by hacking some of those tests.

Using Findex using the REST DB interface

The REST DB interface relies on a DB server implementing the Findex REST API.

Setup

The library is published on NPM: simply install the library in your package.json.

npm install cloudproof_js

The library is based on WASM for speed.

TODO: update version after release

You need Java 8+ and Maven version 3+. The Cloudproof Java library is deployed on Maven central. Add the library to your POM:

<dependency>
  <groupId>com.cosmian</groupId>
  <artifactId>cloudproof_java</artifactId>
  <version>7.0.0</version>
  <type>jar</type>
</dependency>

Cloudproof Python is available on PyPI:

pip install cloudproof_py

The library is published on pub.dev as a Dart library: https://pub.dev/packages/cloudproof. To install it:

flutter pub add cloudproof

The library contains cryptographic native libraries called through FFI functions.

Instantiating Findex

Findex relies on the user to provide it with:

  • the private key;
  • the public label.

These are used to encrypt the index before it is sent to the DB for storage. See the security documentation to know more about the cryptographic algorithms used.

The REST DB interface requires an additional parameter: the server URL. A different server can be used for the two Findex tables. If only one URL is provided, it is used for both tables.

The complete example can be found in the JavaScript tests.

import { Findex, } from "cloudproof_js"

const key = randomBytes(16)
const label = 'hello, world!'
const url = 'http://localhost:8080'

const findex = new Findex(key, label)

// Instantiating an interface is performed in a dedicated step in JS since
// it is an asynchronous one.
await findex.instantiateRestInterface(serverUrl)

The complete example can be found in the Java tests.

// Dependencies:
//
// import java.security.SecureRandom;
// import com.cosmian.jna.findex.Findex;

byte[] key = new byte[16];
SecureRandom rng = new SecureRandom();
rng.nextBytes(key);

String label = "hello, world!";

String url = "http://localhost:8080";

Findex findex = new Findex(key, label, url);

The complete example can be found in the Python tests.

from cloudproof_findex import (Findex, Key)

key = Key.random()
label = 'hello, world!'
url = 'http://localhost:8080'

findex = Findex.new_with_rest_interface(key, label, url)

The REST DB interface is not (yet) available in Flutter. Therefore the SQLite is used here (see the complete test).

const dbPath = "./build/sqlite2.db";
await initDb(dbPath);
final key = base64Decode("6hb1TznoNQFvCWisGWajkA==");
SqliteFindex.init(dbPath, key, "Some Label");

Adding and deleting associations to and from the index

Adding associations to the index

Populating the index is done using the add API. It takes as argument a list of associations that map values to sets of keywords. Each value passed as input can then be retrieved by searching for any associated keyword.

This API returns the keywords that have been added to the index (meaning that no value where associated to these before).

The following snippet indexes the data 1337 under the keywords “John” and “Doe”.

await findex.add([
  {
    indexedValue: Location.fromNumber(1337),
    keywords: ["John", "Doe"],
  }
])
HashMap<IndexedValue, Set<Keyword>> indexedValuesAndWords =
    new HashMap<>();

indexedValuesAndWords.put(
    new Location(1337).toIndexedValue(),
    new HashSet<>(Arrays.asList(new Keyword("John"),
                                new Keyword("Doe")))
);

KeywordSet res = findex.add(indexedValuesAndWords);
res = findex.add({ Location.from_int(1337): ['John', 'Doe'] })

It is also possible to index keywords under other keywords. Indexing keywords under other keywords can be seen as building a graph where nodes represent keywords and contain the data associated to them and a directed vertex from a node “A” to a node “B” represents “B” being indexed under “A”.

In the following example the ID “0xA43C” represents John Doe. It is indexed under both “John” and “Doe” such that a search for either “John” or “Dow” would find the results indexed under this ID.

await findex.add([
  {
    indexedValue: Keyword.fromString("0xA43C"),
    keywords: ["John", "Doe"],
  }
])
HashMap<IndexedValue, Set<Keyword>> indexedValuesAndWords =
    new HashMap<>();

indexedValuesAndWords.put(
    new Keyword("0xA43C").toIndexedValue(),
    new HashSet<>(Arrays.asList(new Keyword("John"), new Keyword("Doe")))
);

KeywordSet res = findex.add(indexedValuesAndWords);
res = findex.add({ Keyword.from_string('0xA43C'): ['John', 'Doe'] })

Deleting associations from the index

Similarly to the add operation, the delete operation takes associations as argument. Instead of removing these associations from the index, it adds their negation. No association added before its negation was indexed can be found during a search.

The consequence is that the index size increases during this operation. The compact operation collapses negated associations, which allows deflating the index size.

Searching the index

The input of the search API is a set of keywords to search for, and an optional interrupt function.

It searches the index for all values indexed under the given keywords. If new keywords are found, the interrupt is called with all found values (both data and keywords). If it returns true, the search process is interrupted and the function returns with the data found so far, otherwise the index is searched for the keywords found in the previous search operation.

The result of the search is a map of the searched keywords to a set containing the data directly associated to them, the data associated to a keyword that is associated to them etc.

In the following example, the index is searched for “Mar”. When provided, the interrupt stops the search after more than 10 results have been found.

// Search without interruption.
const results = await findex.search(["Mar"])

// Search with interruption.
let searchCounter = 0
const filteredResults = await findex.search(
    ["Mar"],
    async (results: IntermediateSearchResults) => {
        searchCounter += results.total()
        return 10 <= searchCounter
    }
)
// Search without interruption.
SearchResults results = findex.search(new String[] {"Mar"});

// Search with interruption.
SearchResults filteredResults = findex.search(new String[] {"Mar"},
                                              new Interrupt() {
     int counter = 0;

    @Override
    public boolean interrupt(Map<Keyword, Set<IndexedValue>> results)
    throws CloudproofException {
        for (Set<IndexedValue> values: results.values()) {
            counter += values.size();
        }
        return (10 <= counter);
    }
});
# Search without interruption.
results = instance.search(['Mar'])

counter = 0
def interrupt(res: ProgressResults, counter: int) -> bool:
    for _, values in res.items():
        counter += len(values)
    return 10 <= counter

# Search with interruption.
filtered_results = instance.search(['Mar'],
                                   lambda res: interrupt(res, counter))

Compacting the Index

The compact operation allows to:

  • collapse negated associations;
  • increase the index density;
  • remove obsolete data;
  • remove keywords not associated to any value (which can happen if all of them where negated values or obsolete data);
  • reset any knowledge gained by an attacker watching index requests. This gives a lower bound on the frequency of the compact operation allowing to maintain the safety of index (see the Findex security documentation).

The compact operation takes as argument:

  • The new key and a new label used to encrypt the index. At least one of them needs to be different from the old key and old label.
  • The compacting rate, which is a float between 0 and 1 and determines how many compact operations (1/compacting_rate) would allow to compact the index entirely.
  • An optional filter that it is called on all the data found during the compact operation and allows to filter out data that does not need to be indexed back (only the data that is returned from the filter is present in the compacted version of the index).

In the following example, the index is entirely compacted in one operation using the old key and a new label, and the data Location(1337) is removed from the index.

The JavaScript binding for the compact operation is not available yet.

Set<Location> filteredLocations = new HashSet<Location>(Arrays.asList(new Location(1337)));

findex.compact(key, "new label", 1, new FilterLocations() {
    @Override
    public List<Location> filter(List<Location> locations)
    throws CloudproofException {
        return locations.stream()
                        .filter((Location l) -> filteredLocations.contains(l))
                        .collect(Collectors.toList());
        }
});
filtered_locations = { Location.from_int(1337) }

def filter(dataset: Set[Location]):
    res = set()
    for data in dataset:
        if data not in filtered_locations:
            res.add(data)
    return res

instance.compact(key, "new label", 1, filter)

The Flutter for the compact operation is not available yet.

Using Findex with other DB interfaces

The following figure describes the relationship between the user code, Cloudproof Rust, Findex and the DB interface.

+-----------------------------------------------------------------+
| +-----------------------------------------+                     |
| |  +-------+   +------+         +-----+   | //                  |
| |  | Redis |   | HTTP |   ...   | FFI |   | // DB interfaces    |
| |  +-------+   +------+         +-----+   | //                  |
| +-----------------------------------------+                     |
| |             Findex instance             |                     |
| +-----------------------------------------+                     |
|              ^                ^                                 |
|              |                |                                 |
|      +---------------+      Findex                              |
|      | Configuration |       API                                |
|      +---------------+        |                                 |
|              |                v                                 |
|           ------------------------                              |
|          /           |            \                             |
|     +-----+      +------+     +--------+     //                 |
|     | FFI |      | WASM |     | Python |     // User interface  |
|     +-----+      +------+     +--------+     //                 |
|                                                                 |
+-----------------------------------------------------------------+
|                         Cloudproof Rust                         |
+-----------------------------------------------------------------+
                                 ^
                                 |
                                 v
+-----------------------------------------------------------------+
|           Cloudproof Js/Python/Java/Flutter (optional)          |
+-----------------------------------------------------------------+
                                 ^
                                 |
                                 v
+-----------------------------------------------------------------+
|                            User Code                            |
+-----------------------------------------------------------------+

The user code calls Cloudproof Rust (directly or using a binding from one of the other Cloudproof libraries) to instantiate a Findex. It provides the configuration with the required information (which depends on the targeted DB interface). Cloudproof Rust instantiates the selected DB interface and a Findex object on top of it, which can then be used by the client code through the Findex API.

Available Rust implementations of the Findex DB interface

The following Findex DB interfaces are implemented in Cloudproof Rust:

  • REST: interface relying on a server implementing the Findex REST API. It is called by passing the URL of the server to the configuration. It can be identical for the Entry Table and the Chain Table.
  • SQLite: called passing a path for the Entry Table and Chain Table. The path can be identical since a dedicated SQL table (entry_table resp. chain_table) is created in the SQLite DB at the given path.
  • Redis: called passing a Redis URL for the Entry Table and the Chain table in the configuration. The URL can be identical since the tokens are prefixed with an identifier allowing to determine which DB they belong to.
  • Custom: each language that exports the Findex API offers the possibility to provide an custom DB interface implementation. It is called by passing the implementation of the Findex DB interface for the Entry Table and for the Chain Table. The following section describes in details how to implement this interface in different languages.

Implementing the Findex DB interface

The DB interface needs to be implemented for:

  • the Entry Table;
  • the Chain Table.

To understand why two tables are used, see the functional documentation. What is important to know here is that these two tables are simple key/value stores, and that the Entry Table keys and the Chain Table keys are collision free. This means that a single key/value store can be used for both of them, as long as the Entry Table keys can be retrieved for the compact operation.

The following section describes the requirement of each function from the Findex DB interface and the following table describes which function from the Findex DB interface (the columns) needs to be implemented in order to use a function from the Findex API (the rows).

          +-----------+-----------+-----------+-----------+---------------+
          | `fetch`   | `upsert`  | `insert`  | `delete`  | `dump_tokens` |
+---------+-----------+-----------+-----------+-----------+---------------+
| search  |  ET + CT  |           |           |           |               |
+---------+-----------+-----------+-----------+-----------+---------------+
| add     |  ET + CT  |     ET    |    CT     |           |               |
+---------+-----------+-----------+-----------+-----------+---------------+
| delete  |  ET + CT  |     ET    |    CT     |           |               |
+---------+-----------+-----------+-----------+-----------+---------------+
| compact |  ET + CT  |           |  ET + CT  |  ET + CT  |       ET      |
+---------+-----------+-----------+-----------+-----------+---------------+

Fetch

The fetch function takes as argument a list of keys. It returns the list of key/value pairs of each key for which a value was found. The keys that didn’t match any value should not be returned.

It is needs to be implemented for the both tables, for each Findex operation.

Implementation of the fetch function for an in-memory DB. The complete implementation can be found in the Cloudproof Js library.

const fetch= async (
  uids: Uint8Array[],
  table: Map<string, Uint8Array>,
): Promise<Index[]> => {
  const results = []
  for (const k of uids) {
    const v = table.get(k.toString())
    if (v !== undefined) {
      results.push(new Index(k, v))
    }
  }
  return results
}

See one of the Redis Entry Table and Redis Chain Table or the SQLite Entry Table SQLite Chain Table implementation in the Cloudproof Java library.

Implementation of the fetch function for an in-memory DB. The complete implementation can be found in the Cloudproof Rust library.

def fetch(uids, table):
    res = {}
    for k in uids:
        if k in table:
            res[k] = table[k]
    return res

Upsert

This is the most difficult function to implement since it is in charge of ensuring the concurrence safety of the index.

It takes as argument a list of old key/value pairs and a list of new key/value pairs. For each new key, it performs the following operation atomically:

If the current value stored under this key equals the old value given for this key, replace the current value with the new value. Otherwise, if the current value is not defined, return an error. Otherwise stack the current key/value pair for return.

The returned value is therefore the map of current key/value pairs for which the corresponding old key/value pair was different. If no conflict happened nothing should be returned.

It needs to be implemented for the add and delete Findex operations for the Entry Table only.

Implementation of the upsert function for an in-memory DB. The complete implementation can be found in the Cloudproof Js library.

const upsert = async (
  oldValues: UidsAndValues,
  newValues: UidsAndValues,
  table: Map<string, Uint8Array>,
): Promise<UidsAndValues> => {
  const rejected = [] as UidsAndValues

  // Add old values in a map to efficiently search for matching UIDs.
  const mapOfOldValues = new Map()
  for (const { uid, value } of oldValues) {
    mapOfOldValues.set(uid.toString(), value)
  }

  for (const { uid, value: newValue } of newValues) {
    const currentValue = table.get(uid.toString())

    if (currentValue?.toString() === mapOfOldValues.get(uid.toString())?.toString()) {
      table.set(uid.toString(), newValue)
    } else if (currentValue === undefined) {
      throw new Error(
        "Rust shouldn't send us an oldValue if the table never contained a value… (except if there is a compact between)",
      )
    } else {
      rejected.push({ uid, value: currentValue })
    }
  }
  return rejected
}

See one of the Redis Entry Table and Redis Chain Table or the SQLite Entry Table SQLite Chain Table implementation in the Cloudproof Java library.

Implementation of the upsert function for an in-memory DB. The complete implementation can be found in the Cloudproof Rust library.

def upsert(old_values: dict, new_values: dict, table: dict):
    res = {}
    for uid, new_value in new_values.items():
        current_value = entry_table.get(uid)
        old_value = old_values.get(uid)
        if old_value == current_value:
            entry_table[uid] = new_value
        elif not current_value:
            raise ValueError('current value for uid ' + uid
                + ' needs to be defined as long as the old value is defined')
        else:
            res[uid] = current_value
    return res

Insert

It takes as input a list of key/value pairs. It stores each pair and returns. An error should be returned if there already was a value for any given key.

It needs to be implemented for the add, delete and compact Findex operations for the Chain Table and for the compact operation only for the Entry Table.

Implementation of the insert function for an in-memory DB. The complete implementation can be found in the Cloudproof Js library.

const insert = async (
  links: UidsAndValues,
  table: Map<string, Uint8Array>,
): Promise<void> => {
  for (const { uid: newUid, value: newValue } of links) {
    table.set(newUid.toString(), newValue)
  }
}

See one of the Redis Entry Table and Redis Chain Table or the SQLite Entry Table SQLite Chain Table implementation in the Cloudproof Java library.

Implementation of the insert function for an in-memory DB. The complete implementation can be found in the Cloudproof Rust library.

def insert(items: dict, table):
    for uid, value in items.items():
        if uid in table:
            raise ValueError('collision on insert for UID: ' + uid)
        table[uid] = value

Delete

It takes as argument a list of keys. It removes all values stored for those keys from the DB.

It needs to be implemented for both tables for the compact Findex operation only.

The JavaScript binding for the compact operation is not available yet.

See one of the Redis Entry Table and Redis Chain Table or the SQLite Entry Table SQLite Chain Table implementation in the Cloudproof Java library.

def delete(uids, table):
    for uid in uids:
        table.pop(uid)

Dump tokens

It returns all keys stored in the Entry Table.

The JavaScript binding for the compact operation is not available yet.

See one of the Redis Entry Table and Redis Chain Table or the SQLite Entry Table SQLite Chain Table implementation in the Cloudproof Java library.

def dump_entry_tokens(table: dict):
    return table.keys()

Instantiating Findex with a custom DB interface

Given that the required functions have been implemented, Findex can then be instantiated with a custom DB interface.

```javascript await loadWasm()

const entry

const entryInterface = new DbInterface()
entryInterface.fetch = async (uids: Uint8Array[]) => {
    await fetch(uids, entryTable)
}
entryInterface.upsert = async (oldEntries: UidsAndValues,
                               newEntries: UidsAndValues) => {
    await upsert(oldEntries, newEntries, entryTable)
}

const chainInterface = new DbInterface()
chainInterface.fetch = async (uids: Uint8Array[]) => {
    await fetch(uids, chainTable)
}
chainInterface.insert = async (links: UidsAndValues) => {
    await insert(links, chainTable)
}


const findex = new Findex(key, label)
findex.instantiateCustomInterface(entryInterface, chainInterface)
```
Findex findex = new Findex(key,
                           label,
                           new SqliteEntryTable("/tmp/sqlite.db"),
                           new SqliteChainTable("/tmp/sqlite.db"));
entry_interface = PythonCallbacks.new()
entry_interface.set_fetch(lambda uids: fetch(uids, entry_table))
entry_interface.set_insert(lambda entries: insert(entries, entry_table))
entry_interface.set_delete(lambda entries: delete(entries, entry_table))
entry_interface.set_dump_tokens(lambda : dump_tokens(entry_table))
entry_interface.set_upsert(
    lambda old_entries, new_entries: upsert(old_entries, new_entries, entry_table))

chain_interface = PythonCallbacks.new()
chain_interface.set_fetch(lambda uids: fetch(uids, chain_table))
chain_interface.set_insert(lambda links: insert(links, chain_table))
chain_interface.set_delete(lambda links: delete(links, chain_table))

findex = Findex.new_with_custom_interface(key,
                                          label,
                                          entry_interface,
                                          chain_interface)

© Copyright 2018-2024 Cosmian. All rights reserved.