Deploy AI Runner

A Cosmian VM with a pre-installed API to serve language models for summarization and translation tasks.

This instance can be deployed on virtual machines that support AMD SEV-SNP or Intel TDX technologies.

Please first read the guide on how to set up a Cosmian VM.

The following steps describe how to deploy your own instance on each supported cloud provider.

Deploy Cosmian VM AI on a cloud provider 🚚

Go to the Cosmian marketplace webpage of the chosen cloud provider.

Select an OS and continue until the Cosmian VM AI instance is spawned.

Here’s the list of instance types by cloud provider:

Cloud provider | Azure             | GCP          | AWS
AMD            | SNP               | SNP          | SNP
               | Standard_DCas_v5  | n2d-standard | M6a
               | Standard_DCads_v5 |              | C6a
               |                   |              | R6a
Intel          | TDX               | TDX          | TDX
               | DCes_v5-series    | c3-standard  | Not available
               | ECesv5-series     |              |
               | (preview)         |              |

The Cosmian VM AI Runner contains:

  • a ready-to-go Nginx setup (listening on port 443 and locally on port 5001)
  • an AI Runner service that is installed but not started yet (it needs a valid configuration to start)
  • the Cosmian VM software stack. As a reminder, the Cosmian VM Agent listens on port 5555.

When the VM first starts, the Cosmian AI Runner is not configured. Because its configuration may contain secrets that need to be protected, the configuration MUST be sent to the VM remotely and securely using the Cosmian VM CLI (see app init).

Service

Systemd is used to initialize and run the AI runner and the Cosmian VM agent.

See the running services with the following command:

systemctl status cosmian_ai_runner
systemctl status cosmian_vm_agent

You can also read the full logs using:

journalctl -u cosmian_ai_runner
journalctl -u cosmian_vm_agent
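
The two status checks above can be combined into one small helper. This is a minimal sketch: `check_services` is a hypothetical function name, and it only relies on `systemctl is-active`. Note that the AI Runner is expected to be reported as not active until a configuration has been sent to the VM.

```shell
# Report whether each given systemd unit is active.
# Returns 0 only if every unit is active, 1 otherwise.
# check_services is a hypothetical helper, not part of Cosmian VM.
check_services() {
    local failed=0
    local unit
    for unit in "$@"; do
        if systemctl is-active --quiet "$unit"; then
            echo "$unit: active"
        else
            echo "$unit: NOT active (inspect with: journalctl -u $unit)"
            failed=1
        fi
    done
    return "$failed"
}

# Example: check_services cosmian_ai_runner cosmian_vm_agent
```

Run on the VM, `check_services cosmian_ai_runner cosmian_vm_agent` prints one line per service and exits non-zero if either is down.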

Configure the AI Runner 📜

As explained previously, it is safe to provide secrets (such as passwords) in the configuration file because the file is stored in the encrypted (LUKS) folder of the Cosmian AI Runner.

By default, Facebook models are used for summarization and translation, and no authentication is configured.

To authenticate users, configure the AI Runner with the appropriate Identity Provider application (client ID and JWKS URI).

config.json on local machine
{
    "summary": {
        "default":  {
            "model_name": "facebook/bart-large-cnn",
            "generation_config": {
                "max_length": 140,
                "min_length": 30
            }
        }
    },
    "translation": {
        "model_name": "facebook/nllb-200-distilled-600M",
        "generation_config": {
            "max_length": 200
        }
    }
}

An Identity Provider application can be configured as follows (the summary and translation sections keep the values shown above):

config.json on local machine
{
  "auth": {
    "openid_configs": [
      {
        "client_id": "XXXX",
        "jwks_uri": "XXXX"
      }
    ]
  },
  "summary": {
    // no changes required here
  },
  "translation": {
    // no changes required here
  }
}

More details about the config file can be found here.
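
Before sending the file, it can help to check locally that it parses as strict JSON. Strict JSON has no comment syntax, so the `// no changes required here` placeholders above would be rejected by a strict parser; whether the AI Runner itself tolerates them is not stated here, so the safest option is to replace them with the real sections. The sketch below assumes `python3` or `jq` is installed on the local machine, and `validate_json` is a hypothetical helper name.

```shell
# Return 0 if the file parses as strict JSON, non-zero otherwise.
# Assumes python3 or jq is available; validate_json is a hypothetical helper.
validate_json() {
    local file="${1:-}"
    if command -v python3 >/dev/null 2>&1; then
        # json.tool exits non-zero when the input is not valid JSON.
        python3 -m json.tool "$file" >/dev/null 2>&1
    elif command -v jq >/dev/null 2>&1; then
        jq . "$file" >/dev/null 2>&1
    else
        echo "neither python3 nor jq found; cannot validate $file" >&2
        return 2
    fi
}

# Example: validate_json config.json && echo "config.json is valid JSON"
```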

Use Cosmian VM CLI to securely send the new AI Runner configuration

Cosmian VM CLI must be installed on the client machine (Ubuntu, RHEL, or via Docker).

On Ubuntu 22.04, download the package and install it:

On the local machine
$ sudo apt update && sudo apt install -y wget
$ wget https://package.cosmian.com/cosmian_vm/1.2.5/ubuntu-22.04/cosmian-vm_1.2.5-1_amd64.deb
$ sudo apt install ./cosmian-vm_1.2.5-1_amd64.deb
$ cosmian_vm --version

On Ubuntu 24.04, download the package and install it:

On the local machine
$ sudo apt update && sudo apt install -y wget
$ wget https://package.cosmian.com/cosmian_vm/1.2.5/ubuntu-24.04/cosmian-vm_1.2.5-1_amd64.deb
$ sudo apt install ./cosmian-vm_1.2.5-1_amd64.deb
$ cosmian_vm --version

On RHEL 9, download the package and install it:

On the local machine
$ sudo dnf update && sudo dnf install -y wget
$ wget https://package.cosmian.com/cosmian_vm/1.2.5/rhel9/cosmian_vm-1.2.5-1.x86_64.rpm
$ sudo dnf install ./cosmian_vm-1.2.5-1.x86_64.rpm
$ cosmian_vm --version

With Docker, start an Ubuntu-based container and enter it:

On the local machine
$ docker run -it ubuntu:22.04 /bin/bash

Then download the package and install it inside the container:

In Docker container (local machine)
$ apt update && apt install -y wget
$ wget https://package.cosmian.com/cosmian_vm/1.2.5/ubuntu-22.04/cosmian-vm_1.2.5-1_amd64.deb
$ apt install ./cosmian-vm_1.2.5-1_amd64.deb
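
When the installation is scripted for several client machines, the package URL can be picked from a distribution label. The sketch below only covers the three version 1.2.5 packages listed above; the function name `cosmian_vm_package_url` and the labels it accepts are illustrative assumptions, and other versions or distributions are deliberately rejected rather than guessed.

```shell
# Print the Cosmian VM CLI package URL for a distribution label.
# Only the 1.2.5 packages shown in this guide are covered.
# cosmian_vm_package_url is a hypothetical helper name.
cosmian_vm_package_url() {
    local base="https://package.cosmian.com/cosmian_vm/1.2.5"
    case "${1:-}" in
        ubuntu-22.04|ubuntu-24.04)
            echo "$base/$1/cosmian-vm_1.2.5-1_amd64.deb" ;;
        rhel9)
            echo "$base/rhel9/cosmian_vm-1.2.5-1.x86_64.rpm" ;;
        *)
            echo "unsupported distribution: ${1:-}" >&2
            return 1 ;;
    esac
}

# Example: wget "$(cosmian_vm_package_url ubuntu-24.04)"
```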

Deploy the configuration and start the Cosmian AI Runner

On the local machine
cosmian_vm --url https://${COSMIAN_VM_IP_ADDR}:5555 \
           --allow-insecure-tls \
           app init -c config.json

This command sends the configuration through an encrypted tunnel and writes it to the remote path /var/lib/cosmian_vm/data/app.conf, which is located in an encrypted container (LUKS).
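
The same command can be wrapped in a function that refuses to run when the address or the configuration file is missing. This is a minimal sketch: `deploy_config` is a hypothetical name, and the `cosmian_vm` flags are exactly those used in the command above.

```shell
# Send a configuration file to the Cosmian VM agent and start the AI Runner.
# Arguments: <vm-ip-address> <config-file>
# deploy_config is a hypothetical wrapper around the command shown above.
deploy_config() {
    local ip="${1:-}"
    local config="${2:-}"
    if [ -z "$ip" ]; then
        echo "usage: deploy_config <vm-ip-address> <config-file>" >&2
        return 2
    fi
    if [ ! -r "$config" ]; then
        echo "cannot read configuration file: $config" >&2
        return 2
    fi
    # --allow-insecure-tls is only needed while the VM still serves
    # its self-signed certificate.
    cosmian_vm --url "https://$ip:5555" \
               --allow-insecure-tls \
               app init -c "$config"
}

# Example: deploy_config "$COSMIAN_VM_IP_ADDR" config.json
```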

Check the connection with the AI Runner

curl --insecure https://${COSMIAN_VM_IP_ADDR}/health
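
The service may need some time to start after the configuration is sent, so a single request can fail even when the deployment is fine. The sketch below retries the same health request a few times. It assumes that /health answers with a successful HTTP status once the AI Runner is up; `wait_for_health`, the default of 10 attempts, and the 3-second delay are illustrative choices.

```shell
# Poll the /health endpoint until it answers successfully.
# Arguments: <vm-ip-address> [attempts] [delay-seconds]
# wait_for_health and its defaults are hypothetical, chosen for illustration.
wait_for_health() {
    local ip="${1:-}"
    local attempts="${2:-10}"
    local delay="${3:-3}"
    local i=1
    while [ "$i" -le "$attempts" ]; do
        # --fail makes curl exit non-zero on HTTP error statuses.
        if curl --insecure --silent --fail --max-time 5 \
                --output /dev/null "https://$ip/health"; then
            echo "AI Runner is healthy (attempt $i)"
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "AI Runner did not answer after $attempts attempts" >&2
    return 1
}

# Example: wait_for_health "$COSMIAN_VM_IP_ADDR"
```

As with the single request above, `--insecure` should be dropped once trusted certificates are installed.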

Why --allow-insecure-tls and --insecure flags?

When the agent starts (see Snapshot the VM), a self-signed certificate is created to enable HTTPS out of the box.

Self-signed certificates must be replaced by trusted ones using tools like cosmian_certtool or standard Linux tools (certbot with Let’s Encrypt, for instance).

See how to set up trusted certificates.

Snapshot the VM 📸

Once the VM is configured as needed, the Cosmian VM Agent can take a snapshot of the VM containing fingerprints of the executables and metadata related to the TEE and TPM.

The agent creates an encrypted folder (LUKS container) to store sensitive information, creates a self-signed certificate for Nginx, and starts a snapshot.

Wait for the agent to initialize the LUKS container and generate the certificate. This happens automatically at boot.

Verify the Cosmian VM AI Runner integrity ✅

Verifying the trustworthiness of the Cosmian VM AI Runner follows exactly the same process as verifying the Cosmian VM itself.

© Copyright 2018-2024 Cosmian. All rights reserved.