Post-Crash AI Assistant: On-Device Voice Triage After a Crash
Android module that kicks in after a crash is detected by another app. Uses a local Gemma LLM to assess driver injury through guided voice conversation and sends emergency SMS automatically if the driver stops responding.
What this is
This is an add-on module, not a crash detection app.
Apps like Life360, OtoZen, Cambridge Mobile Telematics, and Google Pixel's built-in crash detection already handle detecting when an accident happens. The moment they raise a crash signal is exactly where this module starts. It receives the crash data (speed at impact, airbag status, location) and immediately starts a voice conversation with the driver to assess their condition.
The idea: crash detection is a solved problem for many apps. Post-crash assistance on the phone is not. This fills that gap.
What it does
Once a crash event is received, the module takes over:
- Speaks to the driver immediately via Text-to-Speech
- Listens for a response via the microphone (Android SpeechRecognizer)
- Passes the response to the local Gemma LLM, which decides what to ask next
- If the driver stops responding for 15-20 seconds, an emergency SMS is sent automatically
The LLM runs entirely on-device with no internet connection. This matters because crash victims are often in areas with no signal, or the crash itself may have damaged connectivity.
Architecture
Crash signal received (CrashInfo: speed, airbagsDeployed, location, timestamp)
↓
CrashCognitiveAidAgent (multi-turn conversation loop)
├── PromptBuilder → context-aware prompt with crash details
├── LocalGemmaLLM → MediaPipe LlmInference (Gemma 3 1B IT)
├── TextToSpeechManager → speaks response to driver
├── Android SpeechRecognizer → captures driver's voice
└── Timeout (15-20s) → EmergencyMessenger → SMS
CrashCognitiveAidAgent manages the full loop. Each iteration conditions on the previous one: speak, listen, process, generate the next prompt, repeat. The conversation history is passed forward so the LLM maintains context across turns.
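The loop above can be sketched as plain Kotlin, with each component abstracted as a function parameter. The names `speak`, `listenOrNull`, `generate`, and `sendEmergencySms` are illustrative stand-ins, not the module's actual API:

```kotlin
// Sketch of the multi-turn triage loop. Each callback stands in for a real
// component: speak = TTS, listenOrNull = SpeechRecognizer with a 15-20s window
// (returns null on timeout), generate = local Gemma inference over the history.
fun runTriageLoop(
    firstPrompt: String,
    maxTurns: Int = 5,
    speak: (String) -> Unit,
    listenOrNull: () -> String?,        // null means the driver did not respond in time
    generate: (history: List<String>) -> String,
    sendEmergencySms: () -> Unit,
): List<String> {
    val history = mutableListOf<String>()
    var utterance = firstPrompt
    repeat(maxTurns) {
        speak(utterance)
        history += "AGENT: $utterance"
        val reply = listenOrNull()
        if (reply == null) {            // timeout: escalate immediately
            sendEmergencySms()
            return history
        }
        history += "DRIVER: $reply"
        utterance = generate(history)   // next question conditioned on the full transcript
    }
    return history
}
```

Keeping the transcript as a plain list makes it trivial to both replay into the next prompt and render in the chat UI.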
Key Components
CrashInfo (data class):
```kotlin
data class CrashInfo(
    val speed: Int,
    val airbagsDeployed: Boolean,
    val location: String?,
    val timestamp: Long
)
```
PromptBuilder:
- First prompt includes crash context: speed, airbag status, driver's first response
- Follow-up prompts include the full conversation history for continuity
- Instructs the LLM to act as a calm, medical-grade first responder
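A minimal sketch of the two prompt shapes described above. The exact wording and structure are illustrative assumptions, not the real PromptBuilder:

```kotlin
// Illustrative prompt construction; the real PromptBuilder's wording differs.
object PromptBuilder {
    private const val ROLE =
        "You are a calm, medical-grade first responder. Ask one short question " +
        "at a time. Never use markdown. Keep every reply under two sentences."

    // First turn: fold the crash telemetry into the prompt.
    fun initial(speedKmh: Int, airbagsDeployed: Boolean, firstResponse: String): String =
        buildString {
            appendLine(ROLE)
            appendLine("Crash context: impact speed $speedKmh km/h, " +
                       "airbags ${if (airbagsDeployed) "deployed" else "not deployed"}.")
            appendLine("Driver said: \"$firstResponse\"")
            append("Ask the next triage question.")
        }

    // Follow-up turns: replay the whole transcript for continuity.
    fun followUp(history: List<String>): String =
        buildString {
            appendLine(ROLE)
            history.forEach { appendLine(it) }
            append("Ask the next triage question.")
        }
}
```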
TextToSpeechManager:
- Wraps Android TTS with a suspend function `speakAndWait()` that blocks until speech completes before the mic opens
- Language: US English
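One way to build such a suspend wrapper is to bridge the TTS callback into a coroutine. This is a hedged sketch of that pattern, not the module's actual implementation:

```kotlin
import android.speech.tts.TextToSpeech
import android.speech.tts.UtteranceProgressListener
import kotlin.coroutines.resume
import kotlinx.coroutines.suspendCancellableCoroutine

// Sketch: suspend until the utterance finishes, so the mic only opens
// after the agent has stopped speaking. Assumes an already-initialized
// TextToSpeech instance set to US English.
suspend fun TextToSpeech.speakAndWait(text: String, utteranceId: String = "triage") =
    suspendCancellableCoroutine { cont ->
        setOnUtteranceProgressListener(object : UtteranceProgressListener() {
            override fun onStart(id: String?) {}
            override fun onDone(id: String?) {
                if (id == utteranceId && cont.isActive) cont.resume(Unit)
            }
            @Deprecated("Deprecated in API 21")
            override fun onError(id: String?) {
                if (id == utteranceId && cont.isActive) cont.resume(Unit)
            }
        })
        speak(text, TextToSpeech.QUEUE_FLUSH, null, utteranceId)
    }
```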
EmergencyMessenger:
- Sends SMS to hardcoded emergency contacts on timeout or non-response
- Displays a toast confirmation
- Uses Android `SmsManager`
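A minimal sketch of what the send path might look like; contact numbers and message wording are placeholders:

```kotlin
import android.telephony.SmsManager

// Sketch: one SMS per hardcoded contact. SmsManager.getDefault() is deprecated
// on API 31+ (use context.getSystemService instead); long bodies should go
// through divideMessage() + sendMultipartTextMessage().
fun sendEmergencySms(contacts: List<String>, location: String?) {
    val body = "Automated crash alert: driver unresponsive." +
        (location?.let { " Last known location: $it" } ?: "")
    val sms = SmsManager.getDefault()
    contacts.forEach { number ->
        sms.sendTextMessage(number, null, body, null, null)
    }
}
```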
Timeout-Based Safety
Agent speaks
→ waits for voice input (15s window)
→ response received: continue conversation
→ no response after 15-20s: EmergencyMessenger.sendSms()
The timeout is the most critical design decision. Too short, and a groggy but conscious driver triggers a false alarm. Too long, and critical minutes are wasted. After testing, a 15-20 second window struck the right balance.
Tech Stack
| Layer | Technology |
|---|---|
| Language | Kotlin 2.0.0 |
| UI | Jetpack Compose + Material 3 |
| On-device LLM | MediaPipe tasks-genai 0.10.23 (Gemma 3 1B IT int4) |
| Voice Input | Android SpeechRecognizer |
| Voice Output | Android TextToSpeech API |
| Emergency | Android SmsManager |
| Async | Kotlin Coroutines |
UI
A simple chat history (LazyColumn) shows the conversation as it unfolds: agent messages and driver responses. There is a voice input button and a Start Assessment button. The interface is intentionally minimal because in a crash situation, the driver's cognitive load must be as close to zero as possible.
What I Learned
- Agent loops on a phone work well for focused tasks. A TTS + STT + LLM loop is surprisingly effective when the problem is narrow and the prompt is tight.
- Timeout design is the hardest part in emergency scenarios. How long is too long before declaring someone unresponsive? There is no perfect answer, just tradeoffs.
- On-device AI is non-negotiable for safety-critical use cases. A crash victim in a remote area has no signal. Cloud AI would fail exactly when it is needed most.
- LLM output must be cleaned before TTS. Models output markdown naturally. Bold text and code blocks sound terrible when spoken aloud. Strip everything before passing to the TTS engine.
- Small models need tight role constraints. A 1B model will ramble without a clear, firm system role. In an emergency, a rambling AI wastes seconds that matter.
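The markdown-stripping point above can be sketched as a simple cleanup pass. The exact rules are an assumption; the real module may clean differently:

```kotlin
// Illustrative cleanup before handing LLM output to the TTS engine:
// remove markdown that sounds terrible when spoken aloud.
fun stripMarkdownForTts(text: String): String =
    text
        .replace(Regex("```[\\s\\S]*?```"), "")               // fenced code blocks
        .replace(Regex("`([^`]*)`"), "$1")                    // inline code
        .replace(Regex("\\*\\*([^*]+)\\*\\*"), "$1")          // bold
        .replace(Regex("\\*([^*]+)\\*"), "$1")                // italics
        .replace(Regex("^#+\\s*", RegexOption.MULTILINE), "") // headings
        .replace(Regex("\\s+"), " ")                          // collapse whitespace
        .trim()
```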