Self-hosted vs Cloud Tier Comparison

Feature | Community | Starter | Pro | Scale
AI Agents
User seats | 1 | Per seat | up to 20 | up to 100
Agent count | Unlimited | Unlimited | Unlimited | Unlimited
LLM providers | BYOK (any) | Managed only | BYOK or managed | BYOK or managed
Knowledge Hub
Knowledge source types | 6 | 6 | 6 | 6
Tools & Extensions
Built-in tools | 166+ | 166+ | 166+ | 166+
Sandboxed extensions | Browsing · Scraping · Code Interpreter | Browsing · Scraping · Code Interpreter | Browsing · Scraping · Code Interpreter | Browsing · Scraping · Code Interpreter
Media Generation
Image generation
Video generation: basic
Voice (TTS)
Voice (STT)
Channel Distribution
Web chat
Embeddable widget
Slack/Discord bots: self-key, basic
CAS (Smart Signage): CAS add-on, CAS add-on
CAS (AI Concierge): CAS add-on, CAS add-on
Workspace
Workspace integrations: basic
Automated SNS replies: self-key, basic
Security & Compliance
PII encryption
Per-group dedicated encryption keys: single, single, single
Deployment model | Self-hosted | Cloud (shared) | Cloud (dedicated) | Cloud (dedicated HA)
SSO / SAML: add-on (Pro), included (Scale)

AICLUDE Token Fee (ATF) tiered usage pricing

All API usage (LLM, image, video, voice, lip-sync) is converted to tokens and billed at ATF rates. The first 10M tokens are free, and per-token rates drop as your volume grows. The same rates apply to BYOK and managed keys.

Monthly volume tier | Per 1M tokens | Notes
0 – 10M | Free | Free for new and test workloads
10M – 100M | $0.50 | Standard production tier
100M – 1B | $0.35 | Auto growth discount
1B+ | $0.25 | Enterprise volume

Add-ons, only what you need

Standalone options unbundled from tiers. Activate or deactivate anytime, pro-rated.

SSO / SAML / OIDC

$99/month

Okta, Azure AD, Google Workspace, or custom IdP. Group mapping and SCIM auto-provisioning included.

Add-on for Pro · included in Scale and Enterprise

Dedicated GPU Worker

$599/month

Dedicated NVIDIA L4 24GB GPU + 16 vCPU / 64 GB RAM. Bypasses the shared queue, has no cold start, and prevents contention during media-generation bursts.

Add-on for Pro and Scale

Extra storage

$0.10/GB/month

Beyond the 100 GB/seat default. For knowledge bases, media, and archival.

All Cloud tiers
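
As a rough illustration of the extra-storage add-on, the sketch below computes a monthly overage charge from the 100 GB/seat default and the $0.10/GB/month rate quoted above. The function name is hypothetical, and two things are assumptions rather than stated policy: that the per-seat allowance is pooled across the whole tenant, and that pro-rating for partial months is ignored.

```python
INCLUDED_GB_PER_SEAT = 100   # default allowance quoted in the add-on description
RATE_PER_GB_MONTH = 0.10     # USD per GB per month beyond the allowance


def extra_storage_fee(seats: int, used_gb: float) -> float:
    """Monthly extra-storage charge in USD (hypothetical helper).

    Assumption: the per-seat allowance is pooled across all seats.
    """
    included_gb = seats * INCLUDED_GB_PER_SEAT
    overage_gb = max(0.0, used_gb - included_gb)
    return round(overage_gb * RATE_PER_GB_MONTH, 2)


# 5 seats include 500 GB; 650 GB used leaves 150 GB billable.
print(extra_storage_fee(seats=5, used_gb=650))  # 15.0
```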

Pricing FAQ

How does the tiered AICLUDE Token Fee (ATF) work?

Monthly usage is automatically bucketed and charged at the matching tier rate. The first 10M tokens are free, 10M to 100M is $0.50 per 1M tokens, 100M to 1B is $0.35 per 1M tokens, and 1B and above is $0.25 per 1M tokens. With BYOK you pay your LLM vendor directly and AICLUDE charges only ATF. With managed keys you're billed transparently for the model cost plus ATF. Example: at 50M tokens per month, the first 10M are free and the remaining 40M are billed at $0.50 per 1M, so ATF = 40 × $0.50 = $20.
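
The 50M-token example can be reproduced with a short calculator. This is a minimal sketch, not AICLUDE's billing code: the function and constant names are hypothetical, and it assumes the tiers are marginal (each slice of usage is billed at its own tier's rate), which matches the worked example but is not spelled out for the higher tiers.

```python
# Tier upper bounds (in tokens) and USD rates per 1M tokens, from the ATF table.
# Assumption: tiers are marginal, so each slice is billed at its own rate.
ATF_TIERS = [
    (10_000_000, 0.00),       # first 10M tokens: free
    (100_000_000, 0.50),      # 10M to 100M
    (1_000_000_000, 0.35),    # 100M to 1B
    (None, 0.25),             # 1B and above (no upper bound)
]


def atf_fee(tokens: int) -> float:
    """Monthly ATF in USD for a given token count (hypothetical helper)."""
    fee = 0.0
    lower = 0
    for upper, rate_per_million in ATF_TIERS:
        if tokens <= lower:
            break
        slice_end = tokens if upper is None else min(tokens, upper)
        fee += (slice_end - lower) / 1_000_000 * rate_per_million
        if upper is None:
            break
        lower = upper
    return round(fee, 2)


print(atf_fee(50_000_000))   # 20.0, matching the example above
print(atf_fee(200_000_000))  # 80.0 = 90 x $0.50 + 100 x $0.35
```

Under BYOK this figure would be the whole AICLUDE usage charge. With managed keys, the pass-through model cost is billed in addition and is not modeled here.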

Do I have to bring my own LLM keys?

Starter uses AICLUDE-managed LLM keys only (curated vendors, pay-as-you-go pass-through, zero setup). Pro and above let you choose between BYOK and AICLUDE-managed keys. Either way the pipeline-level LLM router works the same way and lets you assign different LLMs to the input, understanding, and generation stages. Supported vendors: Google, OpenAI, Anthropic, xAI, AWS Bedrock, Azure OpenAI, self-hosted vLLM, and any OpenAI-compatible endpoint (BYOK).

Cloud vs Self-hosted, when do I choose which?

Choose Cloud when you want zero ops with physical tenant isolation and no infrastructure to manage. Choose Self-hosted when compliance, data residency, or air-gap requirements demand on-premise deployment. Both use the same Docker images and the same features. Only the deployment model differs.

How is the Personal edition's fair-use limit enforced?

Personal is a free commercial license, not open source. The 1-seat limit is enforced in code. A periodic license check-in (7-day offline grace) reports usage, and over-threshold installs receive an automatic Starter upgrade prompt. Fair-use terms: single user, non-commercial or annual revenue under $1M. SSO, audit logs, per-group dedicated encryption keys, and multi-seat are Enterprise-only.

Why don't Pro and Scale cap user counts?

The Pro and Scale base fees recover the cost of the dedicated tenant infrastructure (compute, storage, HA). As users grow, LLM usage grows in proportion and is settled automatically by tiered ATF. All media generation (image, video, voice, lip-sync) flows through the same ATF system. Charging again per user would be double-counting, so we kept it simple: base fee + ATF metering. Starter uses a shared tenant pool, so it stays per-seat ($29/seat) with no seat cap. As your team grows, switching to Pro (a flat fee for a dedicated tenant) becomes the natural choice. Usage is tallied automatically from API metadata. Real-time dashboards in the admin console break usage down by group, user, and agent, with configurable alerts and optional hard caps.

How do I upgrade or migrate between tiers?

Upgrades within Cloud (Starter → Pro → Scale) are instant and pro-rated. Migration between Cloud and Self-hosted, in either direction, is a managed process: for a move to Self-hosted, we export your tenant data and deliver encrypted Docker images to your infrastructure.

What is CLU Agent Station (CAS)?

A platform for building signage and AI concierge agents and registering their content on devices. It runs in the browser, so devices can be anything: mobile (9:16), tablets (16:9), PCs, or the web. Pricing is based on agent instances. Starter allows up to 5 signage + concierge agents combined at $19/agent. Pro and Scale allow unlimited agents and unlimited devices. Agent-generated content (image, video, lip-sync) is metered through ATF and media usage, so there is no clip cap. Optional hardware bundles (display, camera, microphone) are quoted separately.

Do you support GPU workers for self-hosted LLMs and media generation?

Yes. AICLUDE ships a modular GPU Worker stack covering self-hosted LLM daemons (multimodal chat), image, music, and video generation, and lip-sync. All media generation is settled through the same tiered ATF system, with no separate media billing. STT and TTS run on our self-hosted ONNX models and are included in every tier at no extra cost. If you need a dedicated GPU, the dedicated GPU Worker add-on ($599/month) gives you a dedicated NVIDIA L4 24GB GPU with 16 vCPU / 64 GB RAM. Self-hosted Enterprise ships the full GPU Worker stack, and Personal can run any of these on your own GPU hardware.

How does the annual discount work?

Annual billing applies a 17% discount (equivalent to 2 months free) to every Cloud tier and is charged once per year. CAP: Pro at $3,990/year ($332/mo equiv.), Scale at $8,990/year ($747/mo equiv.). CAS: Starter at $190/agent/year ($16/mo equiv.), Pro at $2,990/year ($249/mo equiv.), Scale at $5,990/year ($497/mo equiv.). Annual plans are non-refundable; service continues uninterrupted through the end of the paid term.

When can I activate or deactivate add-ons?

Anytime from the admin console, pro-rated. Available add-ons: SSO/SAML ($99/mo), dedicated GPU Worker ($599/mo), extra storage ($0.10/GB/mo). SSO/SAML is included by default in Scale and Enterprise. Audit logs are included on every Cloud tier and stored automatically in S3 (Pro: 90-day retention, Scale and Enterprise: unlimited). Multi-region failover is included by default in Scale and Enterprise.

What's the difference between a station and a device?

A **station** is the billing unit. One signage or one AI concierge running at a store equals 1 station. A store running both signage and concierge counts as 2 stations. A **device** is the screen that actually displays the station: mobile (9:16), tablets (16:9), PCs, or web browsers, unlimited. You can mirror one station to many devices (entrance signage + mobile sync, for example) at no extra cost. If you exceed the cap (Starter 5 / Pro 10 / Scale 15), the console offers an automatic upgrade. Large chains, airports, and public sites that need more than 15 stations should go to Enterprise (contact us). Usage (LLM, image, video, lip-sync) is settled by tiered ATF, so there is no per-station limit on content generation.
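
The station/device distinction can be sketched in a few lines. The class and function names are hypothetical, the caps are the ones quoted above, and the model assumes each store runs at most one signage and one concierge station.

```python
from dataclasses import dataclass

# Station caps per tier, as quoted in the answer above.
STATION_CAPS = {"Starter": 5, "Pro": 10, "Scale": 15}


@dataclass
class Store:
    name: str
    signage: bool = False    # one signage station if True
    concierge: bool = False  # one AI concierge station if True
    devices: int = 1         # mirrored screens; these are never billed


def count_stations(stores: list[Store]) -> int:
    """Billable stations: signage and concierge each count once per store."""
    return sum(int(s.signage) + int(s.concierge) for s in stores)


def within_cap(tier: str, stores: list[Store]) -> bool:
    """True if the station count fits the tier's cap."""
    return count_stations(stores) <= STATION_CAPS[tier]


stores = [
    Store("Entrance", signage=True, devices=3),
    Store("Lobby", signage=True, concierge=True, devices=2),
]
print(count_stations(stores))         # 3 stations, although 5 devices are in use
print(within_cap("Starter", stores))  # True
```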
