
Arqivon — System Architecture

Real-Time Multimodal Live Agent  ·  Gemini Live Agent Challenge
FLUTTER MOBILE APP
📷 Camera Capture
1fps JPEG frames · Base64 encoded
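As an illustration of the frame format above, the sketch below wraps one JPEG frame in a Base64 text message and unwraps it again. It is written in Python to show the encoding only; the JSON field names (type, mime_type, data) are assumptions for this example, not the app's confirmed wire schema.

```python
import base64
import json


def encode_frame(jpeg_bytes: bytes) -> str:
    """Wrap raw JPEG bytes in a JSON text message (hypothetical schema)."""
    return json.dumps({
        "type": "video_frame",          # assumed message type name
        "mime_type": "image/jpeg",
        "data": base64.b64encode(jpeg_bytes).decode("ascii"),
    })


def decode_frame(message: str) -> bytes:
    """Recover the raw JPEG bytes from a message built by encode_frame."""
    payload = json.loads(message)
    return base64.b64decode(payload["data"])
```

Base64 grows the payload by roughly one third, which is usually acceptable at one frame per second.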
🎙️ Microphone
16kHz PCM · mono · echo cancel
🔊 Audio Player
24kHz WAV playback · gapless queue
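A minimal sketch of how raw 24kHz PCM can be wrapped in a WAV container for playback, using Python's standard wave module. It assumes 16-bit mono samples; the diagram states only the sample rate, so the sample width and channel count are assumptions.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Prepend a WAV header to raw PCM samples (assumed 16-bit mono)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)   # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```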
🎨 Mode UI
Smart Cards · Overlays · PDF Export
Modes: Assistant · Translator · Tutor · Support
WebSocket Client
Auto-reconnect · 5 retries · exp. backoff
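The reconnect policy above can be sketched as follows. The sketch is in Python rather than the app's own code; the 1-second base delay, the doubling factor, and the 30-second cap are assumptions, since the diagram gives only the retry count and the backoff style. The connect callable and the injectable sleep function are hypothetical stand-ins.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int = 5, base: float = 1.0,
                   factor: float = 2.0, cap: float = 30.0) -> List[float]:
    """Delay before each retry: base * factor**attempt, capped at cap."""
    return [min(cap, base * factor ** attempt) for attempt in range(max_retries)]


def connect_with_retry(connect: Callable[[], T], max_retries: int = 5,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Try connect(); after each ConnectionError wait, then retry."""
    for delay in backoff_delays(max_retries):
        try:
            return connect()
        except ConnectionError:
            sleep(delay)
    # Last retry: any error now propagates to the caller.
    return connect()
```

Passing sleep as a parameter keeps the policy testable without real waiting.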
📦 Riverpod State
LiveSessionNotifier · reactive UI
GOOGLE CLOUD RUN
🚀 FastAPI WebSocket Relay
Bidirectional bridge · min 1 instance · 1hr timeout · CPU always-on
🔐 Auth Middleware
Firebase token verification
🧠 Mode Router
4 system prompts · per-mode tool sets
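One way to picture the mode router is a lookup table from mode name to a system prompt and a tool set. The sketch below is an assumption about the structure only: the prompt texts and tool names are hypothetical placeholders, not the project's actual prompts or its 17 tools.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ModeConfig:
    system_prompt: str
    tools: Tuple[str, ...]


# Hypothetical prompts and tool names, for illustration only.
MODES: Dict[str, ModeConfig] = {
    "assistant": ModeConfig("You are a general live assistant.", ("create_smart_card",)),
    "translator": ModeConfig("You translate speech in real time.", ("show_overlay",)),
    "tutor": ModeConfig("You teach step by step.", ("create_smart_card", "export_pdf")),
    "support": ModeConfig("You help troubleshoot problems.", ("show_overlay", "export_pdf")),
}


def route(mode: str) -> ModeConfig:
    """Return the prompt and tool set for a mode, ignoring case."""
    try:
        return MODES[mode.lower()]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
```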
📊 Latency Tracer
P50 / P95 / P99 every 30s
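A minimal sketch of the percentile summary such a tracer could compute over each 30-second window. It uses the nearest-rank method, which is an assumption; the diagram does not say which percentile definition the tracer uses, and the function names are hypothetical.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Return the P50 / P95 / P99 latencies for one reporting window."""
    return {f"p{p}": percentile(samples, p) for p in (50, 95, 99)}
```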
🔄 Input Queue
Priority audio · latest-wins video
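The queue policy above can be sketched as follows, under one plausible reading: audio chunks are kept in order and always drained first, while a new video frame overwrites any frame still waiting. The class and method names are hypothetical, and the real relay may differ in the details.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class InputQueue:
    """Audio is FIFO and served first; video keeps only the newest frame."""

    def __init__(self) -> None:
        self._audio: Deque[bytes] = deque()
        self._video: Optional[bytes] = None

    def put_audio(self, chunk: bytes) -> None:
        self._audio.append(chunk)

    def put_video(self, frame: bytes) -> None:
        # Latest wins: a stale, unsent frame is dropped.
        self._video = frame

    def get(self) -> Optional[Tuple[str, bytes]]:
        """Return the next item to forward, or None when idle."""
        if self._audio:
            return ("audio", self._audio.popleft())
        if self._video is not None:
            frame, self._video = self._video, None
            return ("video", frame)
        return None
```

Dropping stale frames keeps the video feed current when the upstream link is slower than the camera, while audio is never discarded.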
🛠️ Tool Registry
17 agentic tools dispatched server-side
GOOGLE GEMINI
Gemini 2.5 Flash
Live API · Native Audio
aio.live.connect()
🗣️ Native VAD
Barge-in detection
Real-time interruption
HIGH sensitivity
🔧 Function Calling
Tool declarations per mode
Structured responses
📝 Summary Gen
Session summaries
Daily briefings
FIREBASE & GOOGLE CLOUD
🔑 Firebase Auth
Google · Apple · Email
🗄️ Cloud Firestore
Sessions · Memories · Exports
🔒 Secret Manager
GEMINI_API_KEY injection
📦 Cloud Storage
Documents · Media
1. Camera + audio stream via WebSocket
2. Relay to Gemini via GenAI SDK
3. Tool calls dispatched + responses returned
4. Audio + UI actions streamed back to app
5. Persist to Firestore & Cloud Storage
Legend: Audio Stream · Video Frames · Tool Dispatch · Data Persistence
10,600+ Lines of Code
17 Agentic Tools
4 Agent Modes
122 Unit Tests
9 Cloud Services
<1s First Response