Docs / Trust & Security / Privacy & Data

Privacy & Data

Your assistant's data lives wherever your assistant is hosted. For self-hosted installations, that means everything stays on your machine. For managed (cloud-hosted) deployments, your data stays within that environment.

Regardless of hosting option, your assistant thinks through an AI model in the cloud. Here's exactly what that means.

What lives in your assistant's workspace

All of the following data lives in your assistant's environment. Except where noted, it never leaves. For self-hosted installations, this means it stays on your machine.

| Data | Location | Sent to the AI model? |
| --- | --- | --- |
| Credentials (API keys, OAuth tokens) | ~/.vellum/protected/ | Never — isolated in a separate process (CES), never exposed to the assistant or AI model |
| Trust rules & permissions | ~/.vellum/protected/trust.json | Never |
| Custom skills | ~/.vellum/workspace/skills/ | Never |
| Configuration | ~/.vellum/workspace/config.json | Included in AI model calls when relevant |
| Workspace files (SOUL.md, USER.md, IDENTITY.md, NOW.md) | ~/.vellum/workspace/ | Loaded at the start of each conversation as context |
| Memories (facts, preferences, decisions) | ~/.vellum/workspace/data/ | Only when relevant to a conversation |
| Conversation history (messages, tool calls, results) | ~/.vellum/workspace/data/ | Current conversation only — compacted when long |

What leaves your assistant

Data leaves your assistant's environment in three ways:

  • AI model calls — conversation context (messages, workspace files, relevant memories) is sent to the AI model provider for inference. Currently, this is Anthropic (Claude) with a zero-retention data policy.
  • Channel messages — when your assistant sends a message through Telegram, Slack, email, or phone, that message content passes through the respective platform's servers.
  • Tool network calls — when a tool makes an external API request (e.g., web search, fetching a URL), that request leaves your environment. Credentials for these requests are injected by a proxy at the network layer — the assistant never sees the raw credential values.

If you are self-hosting and have opted out of usage analytics, your workspace files, memories, conversation history, credentials, and trust rules are never sent to Vellum — the only external party that receives conversation context is the AI model provider. The exception is the Share Feedback feature, which explicitly sends logs when you choose to use it.

If you use Vellum's managed platform, your data is stored in Vellum's infrastructure — isolated in a dedicated container, but accessible to Vellum for operational purposes.

How credentials are protected

Credentials (API keys, OAuth tokens, passwords) are handled by the Credential Execution Service (CES) — a separate process that runs alongside your assistant. The assistant communicates with CES over a local RPC channel but never sees the credential values themselves.

  • Storage — credentials are encrypted at rest in ~/.vellum/protected/
  • Network injection — when a tool needs to make an authenticated API call, the credential proxy injects the token into the outbound HTTP request at the network layer. The assistant only knows which credential to use (by alias), not what it contains.
  • Secure collection — the credential_store tool collects secrets through a dedicated UI prompt. Secret values never enter the conversation text.
  • Deployment modes — CES runs as a local RPC process (desktop/CLI) or as an HTTP sidecar (containerized deployments). The security model is the same in both cases.
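The alias-based injection described above can be sketched in a few lines. This is a minimal illustration, not the real CES implementation — the store, function names, and token value are all hypothetical, and real credentials are encrypted at rest rather than held in a plain dict:

```python
# Hypothetical sketch of network-layer credential injection.
# The assistant references credentials by alias only; the proxy
# resolves the alias to a real token just before the request leaves.

SECRET_STORE = {"github": "ghp_realtoken123"}  # encrypted at rest in practice

def assistant_request(alias: str, url: str) -> dict:
    """What the assistant produces: an alias, never a raw secret."""
    return {"url": url, "auth_alias": alias}

def proxy_send(request: dict) -> dict:
    """The proxy swaps the alias for the real token at the network layer."""
    token = SECRET_STORE[request["auth_alias"]]
    headers = {"Authorization": f"Bearer {token}"}
    # ...the outbound HTTP call would happen here...
    return {"url": request["url"], "headers": headers}

req = assistant_request("github", "https://api.github.com/user")
assert "ghp_" not in str(req)   # the assistant's side never contains the token
```

The key property is visible in the last assertion: everything the assistant handles is free of secret material, because substitution happens one layer below it.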

How secrets are caught

Even with secure collection, secrets can accidentally end up in conversation text — pasted from a clipboard, included in a file, or returned in a tool output. A multi-layer detection pipeline catches these:

  • Ingress scanning — inbound messages are checked for known secret patterns before they reach the assistant. Detected secrets are blocked or redacted.
  • Pattern matching — regex-based detectors for common secret formats: API keys, tokens, connection strings, private keys, and more.
  • Allowlist — known false positives can be allowlisted so they don't trigger the scanner.
  • Tool output scanning — secrets in tool execution results are detected and handled before they enter the conversation context.

The secret scanner is a safety net, not a guarantee. The primary defense is the credential architecture: secrets are collected securely and injected at the network layer, so they should never appear in conversation text in the first place.
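The pattern-matching and allowlist layers can be sketched as a small redaction pass. The patterns below are illustrative stand-ins — a real scanner ships many more detectors and handles overlapping matches more carefully:

```python
import re

# Illustrative secret patterns only (not the real detector set).
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),               # generic API-key shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]
# Known false positives that should not trigger the scanner.
ALLOWLIST = {"sk-EXAMPLEEXAMPLEEXAMPLE1234"}

def redact(text: str) -> str:
    """Replace any non-allowlisted secret match with a placeholder."""
    for pattern in SECRET_PATTERNS:
        for match in set(pattern.findall(text)):
            if match not in ALLOWLIST:
                text = text.replace(match, "[REDACTED]")
    return text
```

Running the same pass over inbound messages and tool outputs gives the ingress and tool-output scanning layers described above; only the text source differs.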

Who can access your data

Your assistant distinguishes between three levels of access:

  • Guardian (you) — full access to everything: memories, workspace files, credentials, tools, configuration. Your identity is tied to the desktop app and verified automatically.
  • Trusted contacts — verified people who message your assistant through external channels (Telegram, Slack, email). They can have conversations and use allowed tools, but they can't access your memories, read your workspace files, or use sensitive tools without guardian approval.
  • Unknown contacts — unverified users who reach your assistant on a channel. They go through a challenge-response flow (invite code) before being granted any access.

Private conversations

A private conversation has its own isolated memory scope. Memories created during a private conversation can't leak into other conversations. The private conversation can still read from your shared memory pool, but anything it saves stays scoped to that conversation. This is useful when discussing sensitive topics you don't want persisted broadly.
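The asymmetry — private conversations read from the shared pool but write only to their own scope — can be sketched as follows. The class and field names are hypothetical, not the actual memory implementation:

```python
# Sketch of scoped memory: private saves are keyed to one conversation,
# while reads fall back to the shared pool. Names are illustrative.

class MemoryStore:
    def __init__(self):
        self.shared = {}   # visible to every conversation
        self.scoped = {}   # conversation_id -> private memories

    def save(self, conversation_id, key, value, private=False):
        if private:
            self.scoped.setdefault(conversation_id, {})[key] = value
        else:
            self.shared[key] = value

    def recall(self, conversation_id, key):
        # Private scope is checked first, then the shared pool.
        private_hit = self.scoped.get(conversation_id, {}).get(key)
        return private_hit if private_hit is not None else self.shared.get(key)

store = MemoryStore()
store.save("chat-1", "birthday", "May 4", private=True)
assert store.recall("chat-1", "birthday") == "May 4"   # visible inside chat-1
assert store.recall("chat-2", "birthday") is None      # invisible elsewhere
```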

Computer use safety

When your assistant controls your Mac through computer use, it follows a structured perceive → verify → execute → observe loop:

  • Screenshots are captured and analyzed to understand the current state
  • Each action (click, type, key press, drag, AppleScript) is a separate tool call with its own risk assessment
  • Before interacting, the assistant verifies it's looking at the right element
  • After each action, it observes the result before continuing

Computer use tools are classified as medium or high risk, meaning they prompt for your approval. You can create persistent trust rules for common workflows (e.g., “always allow clicking in VS Code”) to reduce friction while maintaining control.
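The loop and its approval gate can be sketched together. Every function here is a hypothetical stand-in — `perceive` models the screenshot analysis, and the guardian approval is passed in as a callback:

```python
# Sketch of one perceive -> verify -> execute -> observe iteration
# with risk-based approval gating. All names are illustrative.

def perceive():
    """Stand-in for screenshot capture and analysis."""
    return {"active_app": "VS Code", "element": "Run button"}

def verify(state, expected_element):
    """Is the element we intend to interact with actually on screen?"""
    return state["element"] == expected_element

def execute(action):
    """Stand-in for the actual click/type/key-press tool call."""
    return f"performed {action}"

def loop_once(action, expected_element, risk, approve):
    state = perceive()                          # perceive
    if not verify(state, expected_element):     # verify
        return "aborted: unexpected state"
    if risk in ("medium", "high") and not approve(action):
        return "denied by guardian"             # approval gate
    return execute(action)                      # execute, then observe result
```

A persistent trust rule like "always allow clicking in VS Code" would simply make the `approve` callback return `True` for matching actions without prompting.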

The permission system

Every tool has a risk classification:

  • Low risk — runs automatically. Reading files, searching memory, generating text.
  • Medium risk — prompts for approval. Writing files, running shell commands, sending messages.
  • High risk — always prompts. Deleting data, modifying system settings, computer use actions.

When a tool prompts for approval, you can:

  • Allow once
  • Allow for 10 minutes
  • Allow for the conversation
  • Always allow (creates a persistent trust rule)
  • Deny

Trust rules are stored in ~/.vellum/protected/trust.json and accumulate over time as you use your assistant. Each rule has a tool, pattern, scope, decision, and priority. On external channels where there's no native prompt UI, approvals are handled through interactive buttons (Telegram, Slack) or the guardian system.
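Rule evaluation can be sketched as "highest-priority matching rule wins." The rule shapes below follow the fields named above (tool, pattern, scope, decision, priority), but the concrete values and matching logic are illustrative assumptions, not the actual trust.json semantics:

```python
# Hypothetical trust-rule evaluation: among rules whose tool and
# pattern match, the highest-priority rule's decision applies.
import fnmatch

rules = [
    {"tool": "shell", "pattern": "git status*", "scope": "global",
     "decision": "allow", "priority": 10},
    {"tool": "shell", "pattern": "*", "scope": "global",
     "decision": "prompt", "priority": 1},
]

def decide(tool, command):
    matches = [r for r in rules
               if r["tool"] == tool and fnmatch.fnmatch(command, r["pattern"])]
    if not matches:
        return "prompt"   # no rule: fall back to prompting the guardian
    return max(matches, key=lambda r: r["priority"])["decision"]
```

Under this sketch, `decide("shell", "git status")` is allowed by the specific high-priority rule, while anything else falls through to the catch-all prompt rule.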

The AI model provider

Your assistant currently uses Anthropic's Claude as its primary AI model. Anthropic operates under a zero-retention data policy for API usage — conversation data sent for inference is not stored, logged, or used for training. The architecture supports other model providers as well.

What gets sent to the model on each turn:

  • The system prompt (assembled from your workspace files)
  • Conversation history (current conversation, possibly compacted)
  • Relevant memories (retrieved via hybrid search)
  • Tool definitions (available tools and their schemas)
  • Tool results (output from recently executed tools)

Credentials, trust rules, and raw configuration files are never included in model calls.
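The per-turn assembly can be sketched as building a request from exactly the five inputs listed above and nothing else. The field names are illustrative, not Anthropic's actual API shape:

```python
# Sketch of per-turn context assembly. Field names are hypothetical;
# the point is what is included -- and that credentials, trust rules,
# and raw config are structurally absent.

def build_model_request(workspace_files, history, memories, tools, tool_results):
    return {
        "system": "\n\n".join(workspace_files.values()),  # SOUL.md, USER.md, ...
        "messages": history + tool_results,               # current conversation
        "context_memories": memories,                     # hybrid-search hits
        "tools": tools,                                   # tool schemas only
    }

request = build_model_request(
    {"SOUL.md": "You are a helpful assistant."},
    [{"role": "user", "content": "hi"}],
    ["User prefers short answers"],
    [{"name": "read_file", "input_schema": {}}],
    [],
)
assert "credentials" not in request and "trust_rules" not in request
```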

Your options for sensitive information

  • Use private conversations — memory from private conversations is scoped and won't leak into future contexts.
  • Never paste secrets in chat — use credential_store with the secure prompt action instead. The value never enters the conversation.
  • Review workspace files — check USER.md and SOUL.md periodically. If something sensitive got saved there, remove it. These files are sent to the model on every conversation.
  • Manage memories — ask your assistant to delete specific memories, or review them with “show me what you remember about X.” You have full control over what stays in memory.
  • Scope trust rules carefully — avoid broad “always allow” rules for high-risk tools. Use conversation-scoped or time-limited approvals instead.

Hosting options

Self-Hosted (Local)

Your assistant runs on your machine. All data stays local. Conversations, memories, workspace files, credentials — none of it leaves your device except for AI model inference and outbound channel messages. You manage updates, uptime, and backups.

Platform Managed

Your assistant runs in Vellum's cloud infrastructure. Same capabilities, same security model. Your data is isolated in a dedicated container — not shared with other users. The benefit is 24/7 uptime, automatic updates, and no local machine dependency. Schedules, watchers, and heartbeats run reliably even when your desktop is off.