CAS

CLU Agent Station

An AI concierge and signage solution for your customers.

Detect passersby with a camera, respond by voice in real time, render a lip-synced avatar and run signage, all on one device.

STTS · real-time · Browser-native avatar · Pedestrian analytics · 5+ languages · Offline cache

Live demo

One web build. Mobile, tablet, and desktop.

The concierge auto-fits every form factor from a single web build. The same URL on mobile, tablet, or desktop returns the same session and the same answer.

Core capabilities

An AI concierge that works the moment you plug it in.

Set it up on site and it handles detection, conversation, signage and reporting automatically.

Pedestrian auto-detection

The camera detects approaching visitors in real time, and the concierge greets them with attribute-aware messages.

Real-time STTS voice

Mic input flows straight through to spoken response without breaks, keeping conversational turns natural.

Browser-native avatar

Runs in any web browser without a dedicated GPU server. Mouth shapes and expressions blend with the voice so the avatar responds like a person.

Signage video playback

Seamless video loops and a news ticker run alongside the concierge on the same screen.

Attention & frontal gaze

Per-content dwell time and frontal-gaze ratios are measured automatically and turned into KPIs.

Weekly PDF reports

Footfall, conversion and top content are published as a PDF every week, with no manual work.
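
As a rough illustration of how the dwell-time and frontal-gaze KPIs above could be derived, the sketch below aggregates camera observations into per-content figures. The `GazeSample` record, its field names and the fixed sampling interval are assumptions made for this example, not the CAS data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GazeSample:
    """One hypothetical camera observation of one viewer (illustrative schema)."""
    content_id: str   # content on screen when the sample was taken
    frontal: bool     # True if the viewer was facing the screen


def attention_kpis(samples, interval_s=0.5):
    """Return {content_id: (dwell_seconds, frontal_gaze_ratio)}.

    Assumes one sample every `interval_s` seconds, so dwell time is the
    sample count multiplied by the interval.
    """
    totals = defaultdict(lambda: [0, 0])  # content_id -> [all samples, frontal samples]
    for s in samples:
        totals[s.content_id][0] += 1
        if s.frontal:
            totals[s.content_id][1] += 1
    return {
        cid: (count * interval_s, frontal / count)
        for cid, (count, frontal) in totals.items()
    }


samples = [
    GazeSample("promo-a", True),
    GazeSample("promo-a", True),
    GazeSample("promo-a", False),
    GazeSample("promo-a", True),
    GazeSample("news-b", False),
    GazeSample("news-b", True),
]
kpis = attention_kpis(samples)
print(kpis)
```

With these six samples, `promo-a` gets 2.0 seconds of dwell at a 0.75 frontal-gaze ratio, and `news-b` gets 1.0 second at 0.5.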

Architecture

Two tracks, one device, running in parallel.

The concierge loop and the signage + analytics track operate independently on the same screen.

Track A · AI Concierge · real-time voice
  • Detect: camera pedestrian recognition
  • Listen: mic + streaming STT
  • Think: agent + RAG
  • Speak: TTS + browser-native avatar
  • loop ↻
Track B · Signage + Analytics
  • Signage content: signage video + ticker
  • Attention analytics: dwell · gaze → weekly PDF

When the network drops, the IndexedDB cache keeps the signage loop and greeting clips alive.
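
To make the "two tracks in parallel" idea concrete, here is a minimal sketch that models each track as its own coroutine, so neither waits for the other. The function names and the event log are hypothetical, and the stages are placeholders; the actual station runs in the browser and its scheduling is not described on this page.

```python
import asyncio


async def concierge_loop(log, turns):
    """Track A: one pass through the four stages per conversational turn."""
    for turn in range(turns):
        for stage in ("detect", "listen", "think", "speak"):
            await asyncio.sleep(0)  # yield so the other track can run
            log.append(("A", turn, stage))


async def signage_loop(log, clips):
    """Track B: play clips in order, independent of any conversation."""
    for clip in clips:
        await asyncio.sleep(0)
        log.append(("B", clip))


async def run_station(turns=2, clips=("loop-1", "loop-2", "ticker")):
    log = []
    # Both tracks are scheduled together; neither blocks the other.
    await asyncio.gather(concierge_loop(log, turns), signage_loop(log, clips))
    return log


log = asyncio.run(run_station())
for event in log:
    print(event)
```

The printed log interleaves Track A and Track B events, which is the property the architecture relies on: a long conversation does not stall the signage loop.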

Self-hosted Stack

Self-hosted Docker stack

One station = one container set. Drops onto an in-store PC or edge box without changes.

Docker Containers
Signage device
cas-fe · external
Frontend
Nginx static assets, the entry point that renders the signage.
cas-be
API server
Manages devices, licenses, sessions and content.
clue
CLUE · Knowledge Engine
Powered by CLUE, our own knowledge engine. Devices, content and event logs live in one engine.
seaweedfs
SeaweedFS
Signage video and avatar clip storage.
aiclude-te
Tool Executor
Gateway to STTS, avatar, embedding and vision workers.
TE internal workers · cas-chat · cas-avatar
  • embedding: 3.2 GB
  • detect: 250 MB
  • stt (avatar): 1.1 GB
  • tts (avatar): 1 GB
  • liveportrait (avatar): 800 MB
    800 MB
llm-router
Multi-LLM Router
Routes each agent step to the best LLM automatically.
External LLM endpoints
  • Commercial LLM API
  • Self-hosted LLM
  • Local Inference
External integrations (self-host or managed)
1 station → N displays (no extra fee)
  • Minimal (cas-chat): 2 vCPU · 4 GB memory · 30 GB SSD disk

    Text only. Excludes STT, TTS and avatar daemons.

  • Standard (cas-avatar) · Recommended: 4 vCPU · 16 GB memory · 50 GB SSD disk

    Voice + avatar full set. 5 to 10 devices.

  • Large (multi-device): 8 vCPU · 32 GB memory · 100 GB SSD disk

    20 to 50 devices, ~100 concurrent conversations.

  • One station can drive any number of displays (signage, tablet, web, mobile) with no extra fee.
  • The offline cache and a 72-hour license grace period keep the core experience alive when the network drops.
  • Real-time STTS runs without a GPU on the standard tier.
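
For planning purposes, the sizing tiers above can be captured as data and queried. The sketch below copies the published figures; the `Tier` type and the selection rule in `pick_tier` are assumptions (the page does not say how to treat fewer than 5 devices or the gap between 10 and 20, so this version simply rounds up to the next tier).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    vcpu: int
    memory_gb: int
    disk_gb: int
    max_devices: Optional[int]  # None: the page gives no device range
    voice: bool                 # False for the text-only tier


# Figures copied from the sizing list above.
TIERS = [
    Tier("Minimal (cas-chat)", 2, 4, 30, None, voice=False),
    Tier("Standard (cas-avatar)", 4, 16, 50, 10, voice=True),
    Tier("Large (multi-device)", 8, 32, 100, 50, voice=True),
]


def pick_tier(devices, needs_voice=True):
    """Pick the smallest tier that fits (the rule itself is an assumption)."""
    if not needs_voice:
        return TIERS[0]  # text only: no STT, TTS or avatar daemons
    for tier in TIERS[1:]:
        if devices <= tier.max_devices:
            return tier
    raise ValueError("more than 50 devices: beyond the published tiers")


for n in (8, 15):
    t = pick_tier(n)
    print(n, "devices ->", t.name, f"({t.vcpu} vCPU, {t.memory_gb} GB)")
```

Under these assumptions, 8 devices map to the Standard tier and 15 devices round up to Large.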
Use cases

Wherever your front line is.

Already proven at a Shikoku retail store: 6.4M passersby, 3,709 conversations, 66.4% engagement.

Retail concierge

Recognize visitors and present events, stock and discounts in their language.

66.4% engagement

Airports & hotels

Departure gates, check-in counters and shuttle times are voiced and shown on signage at the same time.

5+ languages

Museum docent

Per-exhibit docent modes, multilingual narration, automatic kid / tourist mode switching.

+30% visitor satisfaction

Hospitals & public reception

Reception, floor guidance and queue updates over voice and signage cut staff load.

Tier-1 self-service
Why CAS

One station = concierge + signage + analytics.

Real-time voice
Edge STT, streaming LLM and pre-cached TTS keep conversational turns natural.
Browser-native avatar
The avatar renders in the user's browser in five languages (KO, JA, EN, ZH, RU) without a GPU server, so it feels human in any environment.
Real-world traction
Pikara Shop, 4 months: 6.4M passersby · 3,709 conversations · 66.4% engagement.
Offline-capable
IndexedDB cache + 72h license grace keep the core experience alive when the network drops.
Unlimited displays per station
Run one station on signage, tablet, web or phone, and add as many displays as you like at no extra fee.
Field-ready provisioning
Issue a Station token once for 180-day auto-auth; revoke instantly from the console if a unit is swapped.
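
The two time windows mentioned above (180-day Station token, 72-hour license grace) can be expressed as simple checks. This is a sketch only: the function names, the boundary behaviour and the choice of "last successful license check" as the start of the grace window are assumptions, not the CAS implementation.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(days=180)  # "180-day auto-auth"
LICENSE_GRACE = timedelta(hours=72)   # "72h license grace"


def token_valid(issued_at, now, revoked=False):
    """A token works until it is revoked or 180 days have passed."""
    return not revoked and now - issued_at < TOKEN_LIFETIME


def can_run_offline(last_license_check, now):
    """Offline operation is allowed within 72 hours of the last check (assumed)."""
    return now - last_license_check <= LICENSE_GRACE


issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(token_valid(issued, issued + timedelta(days=179)))
print(token_valid(issued, issued + timedelta(days=30), revoked=True))
print(can_run_offline(issued, issued + timedelta(hours=73)))
```

Revocation is checked first so that a swapped unit loses access immediately, regardless of how much of the 180 days remains.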
Deploy in the field

Start with one AI concierge.

Pricing is per station: every additional display you bring up runs on the same station at no extra cost.