technical paper · v1.2.0 · on-device

How DailyVox actually works.

A deep technical breakdown of the on-device AI pipeline behind the Twin. Nine Apple frameworks. Zero third-party SDKs. No cloud calls. Every layer — capture, transcription, NLP, personality modeling, storage — runs on your phone.

0 network calls for AI · 9 Apple frameworks · AES-256-GCM backups · Neural Engine inference · Data Not Collected
§01

Architecture overview.

pipeline

Every piece of data in DailyVox flows through a pipeline that runs entirely on the device. There are zero network calls for AI processing. Here is the full system architecture.

INPUT
  Microphone ─▶ AVAudioEngine · AAC 44.1kHz

TRANSCRIPTION
  SFSpeechRecognizer (requiresOnDeviceRecognition = true)
  iOS 26: SpeechAnalyzer replaces this

NLP ANALYSIS
  NLTagger ─▶ sentiment · NER · POS · language ID

PERSONALITY MODEL
  TwinEngine
    ├▶ CommunicationStyle (TTR, formality, directness)
    ├▶ EmotionalSignature (valence, arousal, dominance)
    ├▶ PersonalKnowledgeGraph (entities + emotional weights)
    └▶ TwinPredictions (temporal patterns, forecasts)

STORAGE
  Core Data ─▶ NSPersistentCloudKitContainer
  Local SQLite · iCloud (optional, encrypted)

PRESENTATION
  SwiftUI ─▶ WidgetKit
  AppIntents · Siri

The key constraint: data never leaves the device for processing. Transcription runs on the Neural Engine. NLP runs locally. The Twin is computed and stored in Core Data. The only optional network path is Apple's encrypted iCloud sync — which the user can disable.

§02

The on-device stack.

apple.frameworks[]

DailyVox uses nine Apple frameworks to build a full AI pipeline without any third-party dependencies or server-side processing.

SHIPPED

SFSpeechRecognizer

Speech.framework

The primary transcription engine in v1.0 – 1.x. Converts spoken audio to text entirely on-device.

  • requiresOnDeviceRecognition = true ensures zero network transmission
  • Input: AAC audio at 44.1 kHz via AVAudioEngine
  • 60+ languages with on-device models
  • Real-time partial results for live feedback
  • Runs on Apple Neural Engine
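The on-device guarantee described above comes down to two properties on the recognition request. The sketch below shows one way to set them up; makeOnDeviceRequest is a hypothetical helper name, not a function from the DailyVox codebase, and the en-US locale is only an example.

```swift
import Speech

/// Builds a recognition request that is only allowed to run on-device.
/// Returns nil when the locale has no recognizer or no on-device model,
/// so the caller can refuse instead of falling back to server recognition.
/// Hypothetical helper for illustration only.
func makeOnDeviceRequest(locale: Locale = Locale(identifier: "en-US"))
    -> (recognizer: SFSpeechRecognizer, request: SFSpeechAudioBufferRecognitionRequest)? {
    guard let recognizer = SFSpeechRecognizer(locale: locale),
          recognizer.supportsOnDeviceRecognition else {
        return nil
    }
    let request = SFSpeechAudioBufferRecognitionRequest()
    request.requiresOnDeviceRecognition = true  // audio is never sent over the network
    request.shouldReportPartialResults = true   // partial transcripts for live feedback
    return (recognizer, request)
}
```

Buffers from an AVAudioEngine input tap would then be passed to request.append(_:), and the recognizer started with recognitionTask(with:resultHandler:).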
NEXT · v2.0

SpeechAnalyzer

Speech.framework · iOS 26

Apple's next-generation speech recognition framework, replacing SFSpeechRecognizer in v2.0.

  • Significantly faster recognition, lower latency
  • Native long-form audio without session timeouts
  • No user setup required (no permission prompts for on-device)
  • Volatile results for instant partial feedback
  • Built for sustained recording — ideal for journaling
SHIPPED

NLTagger

NaturalLanguage.framework

The core NLP engine that extracts meaning from transcribed text. Runs multiple analysis passes per entry.

  • Sentiment scoring — sentence-level valence −1.0 → +1.0
  • Named Entity Recognition: people, places, organisations (NLTagger's nameType scheme does not tag dates; those need a separate detector such as NSDataDetector)
  • Part-of-Speech tagging — verb density, adjective richness
  • Language ID — auto-detect entry language
  • All tag schemes use on-device CoreML models
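Two of the passes listed above, sentiment scoring and named entity recognition, can be sketched with NLTokenizer and NLTagger as follows. The function names are hypothetical, and scoring each sentence as its own paragraph is an assumption about how sentence-level valence could be obtained, not a description of the shipped code.

```swift
import NaturalLanguage

/// Scores each sentence from -1.0 (negative) to +1.0 (positive).
/// Each sentence is handed to the tagger as its own paragraph.
func sentenceValences(in text: String) -> [(sentence: String, valence: Double)] {
    let tokenizer = NLTokenizer(unit: .sentence)
    tokenizer.string = text
    let tagger = NLTagger(tagSchemes: [.sentimentScore])
    var results: [(sentence: String, valence: Double)] = []
    tokenizer.enumerateTokens(in: text.startIndex..<text.endIndex) { range, _ in
        let sentence = String(text[range])
        tagger.string = sentence
        let (tag, _) = tagger.tag(at: sentence.startIndex, unit: .paragraph, scheme: .sentimentScore)
        // The score arrives as a string in the tag's raw value; treat a missing score as neutral.
        let valence = Double(tag?.rawValue ?? "0") ?? 0
        results.append((sentence: sentence, valence: valence))
        return true
    }
    return results
}

/// Extracts people, places and organisations, joining multi-word names.
func namedEntities(in text: String) -> [(name: String, kind: NLTag)] {
    let tagger = NLTagger(tagSchemes: [.nameType])
    tagger.string = text
    let wanted: Set<NLTag> = [.personalName, .placeName, .organizationName]
    var found: [(name: String, kind: NLTag)] = []
    tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .nameType,
                         options: [.omitPunctuation, .omitWhitespace, .joinNames]) { tag, range in
        if let tag = tag, wanted.contains(tag) {
            found.append((name: String(text[range]), kind: tag))
        }
        return true
    }
    return found
}
```

The exact scores and detected entities depend on the language models installed on the device, so no sample output is given here.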
NEXT · v1.4

NLEmbedding

512-dim word / sentence vectors

Generates dense vector representations of journal entries for semantic search and clustering.

  • 512-dim sentence embeddings per entry
  • Cosine similarity for semantic search
  • K-means clustering for hidden thematic groupings
  • Foundation for v2.0's RAG retrieval layer
  • Vectors persisted in Core Data alongside entry text
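The semantic search step above reduces to cosine similarity over stored vectors. A minimal sketch in plain Swift, assuming each entry's embedding is already available as an array of Double; cosineSimilarity and topMatches are illustrative names, not DailyVox APIs.

```swift
import Foundation

/// Cosine similarity between two equal-length vectors; 0 if either is all zeros.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "vectors must have the same dimension")
    var dot = 0.0, normA = 0.0, normB = 0.0
    for i in a.indices {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA.squareRoot() * normB.squareRoot())
}

/// Returns the ids of the `k` stored entries most similar to the query vector.
func topMatches(query: [Double],
                entries: [(id: String, vector: [Double])],
                k: Int) -> [String] {
    let scored = entries.map { (id: $0.id, score: cosineSimilarity(query, $0.vector)) }
    return scored.sorted { $0.score > $1.score }.prefix(k).map { $0.id }
}
```

On Apple platforms the vectors themselves could come from NLEmbedding.sentenceEmbedding(for:) and its vector(for:) method; the linear scan shown here is the simplest option and is likely adequate for a personal journal's entry count.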
NEXT · v2.0

Foundation Models

iOS 26 · 3B on-device LLM

Apple's on-device large language model, enabling conversational Twin interactions in v2.0.

  • LanguageModelSession — multi-turn conversation with transcript memory
  • Tool calling — Twin autonomously queries Core Data via custom Tool protocol
  • @Generable — type-safe structured outputs (mood reports as Swift structs)
  • streamResponse() — real-time streaming chat UI
  • Dynamic instructions from TwinEngine for tone matching
  • Requires iPhone 15 Pro+. Entire pipeline on-device.
SHIPPED

Core Data + CloudKit

NSPersistentCloudKitContainer

Local-first persistence with optional encrypted cloud sync across devices.

  • SQLite wrapped by NSPersistentCloudKitContainer
  • Local-first: app works fully offline
  • AIState entity stores the Twin as Codable JSON
  • iCloud sync uses Apple's encrypted infrastructure
  • Sync is optional — user can disable entirely
SHIPPED

CryptoKit

AES-256-GCM

Authenticated AES-256-GCM encryption for backup exports and sensitive data at rest.

  • AES-256-GCM authenticated encryption for backups
  • User passphrase for key derivation
  • Encrypted JSON export for device migration
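A backup round trip with CryptoKit's AES.GCM can be sketched as below. The source does not say which key derivation function DailyVox uses, so the HKDF call here is an assumption chosen to keep the example short; for a real user passphrase a deliberately slow KDF such as PBKDF2 would be the safer choice. All function names are hypothetical.

```swift
import CryptoKit
import Foundation

/// Derives a 256-bit key from a passphrase and salt.
/// ASSUMPTION: HKDF is used only for brevity; it is fast by design and
/// therefore a weak choice for low-entropy passphrases.
func deriveKey(passphrase: String, salt: Data) -> SymmetricKey {
    let material = SymmetricKey(data: Data(passphrase.utf8))
    return HKDF<SHA256>.deriveKey(inputKeyMaterial: material,
                                  salt: salt,
                                  info: Data("backup".utf8),
                                  outputByteCount: 32)
}

/// Encrypts and authenticates the backup. The result packs nonce, ciphertext and tag.
func encryptBackup(_ plaintext: Data, key: SymmetricKey) throws -> Data {
    let box = try AES.GCM.seal(plaintext, using: key) // fresh random nonce per call
    guard let combined = box.combined else { throw CryptoKitError.incorrectParameterSize }
    return combined
}

/// Decrypts a backup; throws if the key is wrong or the data was modified.
func decryptBackup(_ blob: Data, key: SymmetricKey) throws -> Data {
    let box = try AES.GCM.SealedBox(combined: blob)
    return try AES.GCM.open(box, using: key)
}
```

Because GCM is authenticated, a wrong passphrase or a tampered file fails loudly at decryptBackup instead of producing garbage JSON.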
SHIPPED

LocalAuthentication

Face ID · Touch ID · LAContext

Biometric authentication gating access to journal entries.

  • Face ID and Touch ID via LAContext
  • Biometric data stays in the Secure Enclave; the app only receives a success or failure result
  • App lock with configurable auto-lock timeout
  • Falls back to device passcode
SHIPPED

WidgetKit + AppIntents

Home · Lock · Siri Shortcuts

Home & Lock-screen widgets. AppIntents-powered Siri Shortcuts for hands-free entries.

  • Mood and streak widgets
  • Hands-free voice entry via Siri
  • AppIntents make Siri aware of DailyVox operations
§03

The Twin Engine.

model card

The TwinEngine is a custom personality-modeling system that builds a multi-dimensional profile of the user from their voice journal entries. It uses no external models or APIs. The entire model is computed from NLTagger output and stored as serialized JSON in Core Data's AIState entity.

TwinEngine v1.2 · SHIPPED
storage: Core Data / AIState · size: ~12 KB · network: 0 rpc

  • Architecture: rule-based + NLP. NLTagger signals feed four sub-models. No external ML weights required.
  • Update cadence: per entry · online. Every voice entry updates the Twin incrementally. No batch jobs.
  • Inputs: transcript · sentiment · NER. Text, valence score, detected entities, timestamp, audio metadata.
  • Outputs: personality · predictions. Traits, mood forecast, trigger topics, entity–emotion graph.
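The per-entry, no-batch update cadence can be illustrated with a running mean that folds in one observation at a time and never revisits old entries. RunningBaseline is a hypothetical type; the source does not specify which update rule the TwinEngine uses.

```swift
/// Running mean updated one observation at a time (hypothetical type).
struct RunningBaseline {
    private(set) var count = 0
    private(set) var mean = 0.0

    /// Folds one new value into the mean without storing past values.
    mutating func update(with value: Double) {
        count += 1
        mean += (value - mean) / Double(count)
    }
}
```

Only two numbers need to be persisted per baseline, which fits the small serialized model size quoted in the model card.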

The engine consists of four interconnected sub-models.

CommunicationStyle

STYLE

How the user expresses themselves. Updated with each entry.

  • Type-Token Ratio (vocabulary richness)
  • Expressiveness score (0 – 1)
  • Directness score (0 – 1)
  • Formality score (0 – 1)
  • Signature words + frequency map
  • Average sentence length
  • Pronoun patterns (I vs we)
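Two of the style metrics above have simple definitions. The sketch below uses a plain Foundation tokenizer as an assumption; the app may well tokenize with NLTagger or NLTokenizer instead, and the function names are illustrative.

```swift
import Foundation

/// Type-Token Ratio: distinct words divided by total words (0 for empty text).
/// Splits on anything that is not a letter or digit, so "don't" counts as two tokens.
func typeTokenRatio(_ text: String) -> Double {
    let tokens = text.lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
    guard !tokens.isEmpty else { return 0 }
    return Double(Set(tokens).count) / Double(tokens.count)
}

/// Average number of words per sentence, splitting sentences on ".", "!" and "?".
func averageSentenceLength(_ text: String) -> Double {
    let wordCounts = text
        .components(separatedBy: CharacterSet(charactersIn: ".!?"))
        .map { sentence in
            sentence.split(whereSeparator: { !$0.isLetter && !$0.isNumber }).count
        }
        .filter { $0 > 0 }
    guard !wordCounts.isEmpty else { return 0 }
    return Double(wordCounts.reduce(0, +)) / Double(wordCounts.count)
}
```

For "the cat and the dog" there are four distinct words out of five, giving a ratio of 0.8. Raw TTR falls as text gets longer, so comparing entries of very different lengths would need a length-normalised variant.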

EmotionalSignature

EMOTION

The user's emotional baseline and patterns over time.

  • Valence baseline (positive / negative)
  • Arousal baseline (energy level)
  • Dominance baseline (control feeling)
  • Morning vs evening mood patterns
  • Weekday vs weekend patterns
  • Trigger topics with correlation scores
  • Emotional volatility index

PersonalKnowledgeGraph

GRAPH

A network of people, places, and topics with emotional weights.

  • NER-extracted entities (person, place, org)
  • Emotional weight per entity (−1 → +1)
  • Mention frequency over time
  • Co-occurrence relationships
  • Entity–mood correlation tracking
  • Topic clusters from entity groupings
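One way to hold the graph described above is a dictionary of nodes plus a dictionary of pair counts. This is a sketch under assumptions: KnowledgeGraph, Node and the "A|B" pair key are hypothetical, and the emotional weight is taken to be the mean valence of the entries that mention the entity.

```swift
/// Hypothetical in-memory entity graph; names and fields are illustrative only.
struct KnowledgeGraph {
    struct Node {
        var mentions = 0
        var valenceSum = 0.0
        /// Mean valence of the entries that mention this entity, in -1...1.
        var weight: Double { mentions == 0 ? 0 : valenceSum / Double(mentions) }
    }

    private(set) var nodes: [String: Node] = [:]
    /// Co-occurrence counts keyed by "A|B" with A < B alphabetically.
    private(set) var coOccurrence: [String: Int] = [:]

    /// Records one journal entry: its detected entities and its overall valence.
    mutating func record(entities: [String], valence: Double) {
        let unique = Array(Set(entities)).sorted()
        for name in unique {
            var node = nodes[name] ?? Node()
            node.mentions += 1
            node.valenceSum += valence
            nodes[name] = node
        }
        for i in unique.indices {
            for j in unique.indices where j > i {
                coOccurrence["\(unique[i])|\(unique[j])", default: 0] += 1
            }
        }
    }
}
```

Storing the sum and the count, not the mean alone, keeps each update exact and lets mention frequency be read from the same node.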

TwinPredictions

PREDICT

Forecasts based on temporal pattern analysis.

  • Day-of-week mood forecasting
  • Time-of-day emotional patterns
  • Trend direction (improving / declining)
  • Seasonal pattern detection
  • Trigger anticipation from schedule
  • Confidence scores per prediction
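Day-of-week forecasting with a confidence score can be sketched as a grouped average. The confidence rule below (sample count divided by a saturation point) is an assumption made for illustration, as are the names WeekdayForecast, forecast and fullConfidenceAt.

```swift
/// Forecast for one weekday: mean past valence plus a crude confidence in 0...1.
struct WeekdayForecast {
    let valence: Double
    let confidence: Double
}

/// `history` pairs a weekday (1 = Sunday ... 7 = Saturday, as in Calendar) with a valence.
/// Confidence grows linearly with the sample count and saturates at 1
/// once `fullConfidenceAt` entries exist for that weekday.
func forecast(for weekday: Int,
              history: [(weekday: Int, valence: Double)],
              fullConfidenceAt: Int = 8) -> WeekdayForecast? {
    let samples = history.filter { $0.weekday == weekday }.map { $0.valence }
    guard !samples.isEmpty else { return nil } // no data, no prediction
    let mean = samples.reduce(0, +) / Double(samples.count)
    let confidence = min(1.0, Double(samples.count) / Double(fullConfidenceAt))
    return WeekdayForecast(valence: mean, confidence: confidence)
}
```

Returning nil for a weekday with no history avoids presenting a default value as if it were a forecast.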
Storage model: all four sub-models are Swift Codable structs serialized to JSON and stored in a single Core Data entity (AIState). The entire personality model can be loaded in one fetch, updated incrementally, and synced across devices as a single atomic object. No external database. No vector store until v1.4. Just Core Data.
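The storage model can be sketched with Codable and JSONEncoder. The structs below are hypothetical, heavily trimmed stand-ins (only two of the four sub-models, with a couple of fields each), and the Core Data side is omitted: the resulting Data would be written to an attribute on the AIState entity.

```swift
import Foundation

// Hypothetical, trimmed stand-ins for the real sub-models.
struct CommunicationStyle: Codable, Equatable {
    var typeTokenRatio: Double
    var formality: Double
}

struct EmotionalSignature: Codable, Equatable {
    var valenceBaseline: Double
}

/// The whole Twin as one value, so it loads and saves in a single step.
struct TwinState: Codable, Equatable {
    var style: CommunicationStyle
    var emotion: EmotionalSignature
    var entryCount: Int
}

/// Serializes the Twin to JSON; sorted keys keep the output stable between saves.
func encodeTwin(_ state: TwinState) throws -> Data {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    return try encoder.encode(state)
}

/// Restores the Twin from the JSON blob read back from storage.
func decodeTwin(_ data: Data) throws -> TwinState {
    try JSONDecoder().decode(TwinState.self, from: data)
}
```

Keeping the model in one blob trades query flexibility for simplicity: nothing inside the Twin can be fetched selectively, but a save can never leave the sub-models out of step with each other.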
§04

Privacy architecture.

zero cloud

Privacy is not a feature of DailyVox. It is the architectural constraint every technical decision is built around. The system is designed so that private data physically cannot leave the device for processing.

Zero network processing

Every AI operation runs on the device. Transcription uses requiresOnDeviceRecognition = true. NLTagger runs locally. The Twin is computed and stored in Core Data. There are no API calls, no cloud functions, no telemetry on journal content.

No third-party SDKs

DailyVox contains zero third-party dependencies for core functionality. No analytics SDKs. No crash reporting that sends journal content. No ad networks. The only external code is Google Analytics on this website (not in the app) and Apple's own frameworks.

Apple's "Data Not Collected"

DailyVox carries Apple's "Data Not Collected" privacy label on the App Store. This is the strictest category — the app collects no data, linked or unlinked to the user's identity.

Cloud AI journal vs DailyVox

aspect | typical cloud AI journal | DailyVox
audio processing | sent to cloud servers | on-device Neural Engine
AI model location | remote API (OpenAI etc) | Apple on-device models
text analysis | cloud NLP service | NLTagger (local)
data storage | company servers | Core Data · SQLite on device
account required | yes (email, password) | no
third-party SDKs | analytics, crash, ads | none
privacy label | "Data Linked to You" | "Data Not Collected"
works offline | no | yes, fully
subscription | $5–15 / month | free
who can read your journal | company, employees, sub-processors | only you
§05

Technical roadmap.

build log

Where DailyVox has been, what's being built now, and where it's going. Each version adds a layer to the on-device AI stack.

Shipped · v1.0

Voice journaling + on-device AI

Core voice journaling with fully on-device transcription, NLP, encrypted storage, biometric lock, widgets, Siri Shortcuts.

SFSpeechRecognizer · NLTagger · Core Data · CryptoKit · WidgetKit · AppIntents
Shipped · v1.1

Twin + personality model

Custom TwinEngine with communication style, emotional signature, knowledge graph, and temporal mood forecasting.

TwinEngine · CommunicationStyle · EmotionalSignature · PersonalKnowledgeGraph · TwinPredictions
Shipped · v1.2

Ask Your Twin + social sharing

TwinChatView with pattern-matched query system. ShareablePersonalityCardView renders at 3× for Instagram Stories and X. Review prompts via SKStoreReviewController at milestone entries.

TwinChatView · ShareablePersonalityCardView · SKStoreReviewController · UIActivityViewController · ImageRenderer
Next · v1.4

Semantic search + proactive insights

NLEmbedding for 512-dim sentence embeddings. Cosine similarity vector search. Z-score anomaly detection. K-means clustering. Foundation for v2.0 RAG.

NLEmbedding · Cosine Similarity · Anomaly Detection · K-Means
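The z-score anomaly detection planned for this release can be sketched as follows: compare a new entry's valence with the mean and standard deviation of recent entries. The function names and the threshold of 2.0 are assumptions, not values from the roadmap.

```swift
import Foundation

/// Z-score of `value` against `history`, using the sample standard deviation.
/// Returns nil when there are fewer than two points or the history has no spread.
func zScore(of value: Double, against history: [Double]) -> Double? {
    guard history.count >= 2 else { return nil }
    let mean = history.reduce(0, +) / Double(history.count)
    let squaredDeviations = history.map { ($0 - mean) * ($0 - mean) }
    let variance = squaredDeviations.reduce(0, +) / Double(history.count - 1)
    let sd = variance.squareRoot()
    guard sd > 0 else { return nil }
    return (value - mean) / sd
}

/// Flags a value whose distance from the mean exceeds `threshold` standard deviations.
func isAnomalous(_ value: Double, against history: [Double], threshold: Double = 2.0) -> Bool {
    guard let z = zScore(of: value, against: history) else { return false }
    return abs(z) > threshold
}
```

Treating "not enough data" as "not anomalous" keeps a new user from seeing alerts before a baseline exists.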
v1.4 · macOS, Apple Watch & Multi-Language

Desktop Twin + wrist capture + localization

Native macOS target — same SwiftUI codebase, sidebar navigation, Twin accessible from the desktop. Apple Watch companion (WatchKit) for voice mood check-ins with Complications. String Catalogs for multi-language UI.

macOS · SwiftUI (shared) · WatchKit · WatchConnectivity · Complications · String Catalogs
v2.0 · Conversational Twin (iOS 26)

Foundation Models + tool calling + SpeechAnalyzer

On-device 3B Foundation Model. Tool calling lets the Twin autonomously query Core Data. @Generable for structured output. streamResponse() for streaming. SpeechAnalyzer replaces SFSpeechRecognizer. Requires iPhone 15 Pro+. Zero network calls.

Foundation Models · LanguageModelSession · Tool Calling · @Generable · streamResponse() · SpeechAnalyzer
v2.5 · Train Your Twin

LoRA fine-tuning — Twin learns to sound like you

Apple's Foundation Models Adapter Training toolkit for Low-Rank Adaptation. Export 100–1,000 entries as JSONL. Train a personal adapter on Mac. ~160 MB adapter delivered via Background Assets. Loaded via SystemLanguageModel(adapter:). Training data never leaves your Mac.

LoRA Adapters · Adapter Training · JSONL Export · Background Assets · ~160 MB adapter
v3.0 · True Digital Self

The most accurate mirror of yourself

After years of daily entries: a Twin that talks like you, sounds like you (Personal Voice), predicts your reactions, explains causality from past entries, and shows personality evolution over time. Full RAG, personal LoRA adapter, autonomous tool calling. Not a clone — it knows your narrated self, not your complete self. Thoughts you don't journal are invisible to it. Entirely on-device, exportable only by you.

Full RAG · Personal LoRA · Causal Reasoning · Personality Evolution · Context Condensation · Digital Self-Preservation
§06

Research context.

related work

DailyVox exists at the intersection of on-device LLMs, personal AI, and mental-health technology. Several recent research papers explore adjacent ideas.

  [1] Memory-Efficient Structured Backpropagation for On-Device LLM Fine-Tuning (2026)
      Efficient fine-tuning under mobile memory constraints; directly relevant to v2.5 LoRA adapter training.
  [2] MoPHES: On-Device LLMs for Mobile Psychological Health (2025)
      Using on-device LLMs for psychological health applications; parallel to the Twin's emotional modelling.
  [3] PocketLLM: On-Device Fine-Tuning for Personalized LLMs (2024)
      Personal model adaptation on mobile hardware; foundational for v2.5's personal adapter.
  [4] PLMM: Personal Large Language Models on Mobile Devices (2023)
      Early architecture proposal for personal LLMs running on phones; the direction DailyVox pursues.
What makes DailyVox different: no existing paper covers DailyVox's specific combination — a private Twin built from voice journal data using on-device NLP (NLTagger, NLEmbedding) plus Apple Foundation Models. The combination of voice-first input, personality modelling from NER/sentiment, and on-device LLM generation with tool calling for autonomous retrieval is a novel architecture. To our knowledge, DailyVox is the first app to attempt this full pipeline privately on-device.
§07

Open source.

auditable

DailyVox is open source. The full codebase — the TwinEngine, all NLP processing, the Core Data stack, the SwiftUI interface — is on GitHub.

Privacy-critical software should be auditable. If you claim data never leaves the device, people should be able to verify that claim by reading the code.

Build with us.

DailyVox is open source and contributions are welcome — improving the Twin engine, adding language support, building the Foundation Models integration.

install · $0 · no account

Try DailyVox.

Free. Private. No account needed. All AI runs on your device.
