**Author:** Generator/Founder (me)
**Status:** In Review
**Related Document:** [[Product Plan]]
**Mantra:** *Keep your vibe, no interruptions.*
~~*Stop 'catching up' on reading. Start capturing what matters.*~~
## 1. Overview
This spec covers the technical implementation for the **Vibe Reader v1 MVP**. The core challenge is creating a "non-interruptive" capture flow that can be triggered from a persistent notification and the device lock screen.
The solution is to run a **`ForegroundService`** that owns an Android **`MediaSession`**. This presents our app to the OS as a "media player", which grants us lock screen controls.
## 2. Key Components
- **Platform:** Android (Kotlin), Min SDK 26 (Oreo)
- **Database:** RoomDB (local, on-device SQLite)
- **UI:** Jetpack Compose
- **Core Tech:**
  - `MediaSession` & `MediaStyle` Notification
  - `ForegroundService`
  - `android.speech.SpeechRecognizer` (Speech-to-Text)
  - `android.speech.tts.TextToSpeech` (Text-to-Speech)
  - `Room` (Database)
  - `Retrofit` (for Dictionary API)
## 3. Data Model (RoomDB Schema)
```sql
/* A "Session" is a single reading period, tied to a book. */
CREATE TABLE Sessions (
session_id INTEGER PRIMARY KEY AUTOINCREMENT,
book_title TEXT NOT NULL,
start_time INTEGER NOT NULL, -- Unix Timestamp
end_time INTEGER,
status TEXT NOT NULL DEFAULT 'active' -- 'active', 'inactive'
);
/* A "Word" is a defined word, linked to a session. */
CREATE TABLE Words (
word_id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id INTEGER NOT NULL,
term TEXT NOT NULL,
definition TEXT NOT NULL,
timestamp INTEGER NOT NULL,
is_favorite INTEGER NOT NULL DEFAULT 0, -- 0 = false, 1 = true
FOREIGN KEY (session_id) REFERENCES Sessions(session_id)
);
/* A "Quote" is a saved quote, linked to a session. */
CREATE TABLE Quotes (
quote_id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id INTEGER NOT NULL,
content TEXT NOT NULL,
timestamp INTEGER NOT NULL,
is_favorite INTEGER NOT NULL DEFAULT 0,
FOREIGN KEY (session_id) REFERENCES Sessions(session_id)
);
```
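As a rough illustration of how these tables could look on the Kotlin side, here is a minimal sketch of matching data classes. The class and property names (`Session`, `SessionStatus`, `isFavorite`, and so on) are assumptions made for this example, and Room annotations are deliberately left out so the snippet compiles without Android dependencies.

```kotlin
// Plain-Kotlin mirror of the three tables above (hypothetical names).
// Room annotations are omitted on purpose; this is not the final entity code.

enum class SessionStatus { ACTIVE, INACTIVE }

data class Session(
    val sessionId: Long = 0,           // 0 = not inserted yet; SQLite assigns the real id
    val bookTitle: String,
    val startTime: Long,               // Unix timestamp, as in the schema
    val endTime: Long? = null,         // null while the session is still running
    val status: SessionStatus = SessionStatus.ACTIVE,
)

data class Word(
    val wordId: Long = 0,
    val sessionId: Long,               // references Session.sessionId
    val term: String,
    val definition: String,
    val timestamp: Long,
    val isFavorite: Boolean = false,   // stored as INTEGER 0/1 in SQLite
)

data class Quote(
    val quoteId: Long = 0,
    val sessionId: Long,               // references Session.sessionId
    val content: String,
    val timestamp: Long,
    val isFavorite: Boolean = false,   // stored as INTEGER 0/1 in SQLite
)
```

The nullable `endTime` and the `Boolean` flag are the two places where the Kotlin types differ from the raw column types.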
## 4. User Flow & Logic Diagrams
### Flow 1: App Launch & Session Management
This flow outlines how a user starts, ends, or resumes a session. The `MediaSession` is the key technical component that enables lock screen functionality.
```mermaid
graph TD
A[User Launches App] --> B{Active Session in local DB?};
B -- Yes --> C["Show 'Session' Tab (Active State)"];
C --> D[Display Session Title & Timer];
C --> E["Start MediaSession (creates MediaStyle Notification & lock screen controls)"];
B -- No --> F["Show 'Session' Tab (No Session State)"];
F --> G[Show 'New Session' Button & 'Recent Sessions' List];
G -- Taps 'New Session' --> H[Show 'Start Session' Dialog];
H -- Enters Book Title & Taps 'Start' --> I["1. Create new 'Sessions' row (status='active')"];
I --> J[2. Set this session_id as 'active' in DB];
J --> C;
G -- Taps a 'Recent Session' --> K[1. Get session_id from tap];
K --> L[2. Set this session_id as 'active' in DB];
L --> C;
D -- Taps 'End Session' --> M["1. Update 'Sessions' row: set status 'inactive', save end_time"];
M --> N[2. Calculate & save total_time];
N --> O["3. End MediaSession (removes notification & lock screen controls)"];
O --> F;
```
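The start, resume, and end transitions in the diagram can be sketched as a small in-memory store. This is only an illustration under stated assumptions: `SessionStore`, `ReadingSession`, and the method names are hypothetical, the map stands in for the Room tables, and starting or stopping the `MediaSession` is left out.

```kotlin
// In-memory stand-in for the Sessions table (hypothetical names throughout).
data class ReadingSession(
    val id: Long,
    val bookTitle: String,
    val startTime: Long,
    val endTime: Long? = null,
    val active: Boolean = true,
)

class SessionStore {
    private val sessions = linkedMapOf<Long, ReadingSession>()
    private var nextId = 1L

    /** The "Active Session in local DB?" check from the diagram. */
    fun activeSession(): ReadingSession? = sessions.values.firstOrNull { it.active }

    /** 'New Session': ends any running session, then creates an active one. */
    fun start(bookTitle: String, now: Long): ReadingSession {
        endActive(now)
        val session = ReadingSession(id = nextId++, bookTitle = bookTitle, startTime = now)
        sessions[session.id] = session
        return session
    }

    /** 'Recent Session' tap: marks an existing session active again. */
    fun resume(id: Long, now: Long): ReadingSession? {
        val target = sessions[id] ?: return null
        if (target.active) return target
        endActive(now)
        val resumed = target.copy(active = true, endTime = null)
        sessions[id] = resumed
        return resumed
    }

    /** 'End Session': returns the elapsed time, or null when nothing is active. */
    fun endActive(now: Long): Long? {
        val current = activeSession() ?: return null
        sessions[current.id] = current.copy(active = false, endTime = now)
        return now - current.startTime
    }
}
```

Two points the sketch leaves open: the schema in section 3 has no `total_time` column, so the value is derived here rather than stored, and a resumed session keeps its original start time, which means the elapsed time includes the gap between reading periods.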
### Flow 2: 'Define Word' Quick-Capture Flow
This is the core value loop, triggered from the lock screen.
```mermaid
graph TD
A["User taps 'Define Word' (from Notification OR Lock Screen)"] --> B[Show 'Listening' Overlay];
B --> C(Call Android SpeechRecognizer);
C --> D{Word Received?};
D -- Yes --> E[Call Dictionary API w/ word];
E --> F{Definition Found?};
F -- Yes --> G["1. Save to 'Words' table (w/ session_id, timestamp)"];
G --> H[2. Show 'Success' checkmark];
H --> I[Dismiss Overlay];
D -- No / Timeout --> J[Show 'Try Again' message];
J --> I;
F -- No --> K[Show 'Not Found' message];
K --> I;
```
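One way to keep this flow testable is to model it as a plain function with the Android pieces injected. The sketch below assumes hypothetical names (`defineWord`, `DefineResult`) and uses lambdas as stand-ins for `SpeechRecognizer`, the dictionary API, and the `Words` insert; it does not call any real Android or network API.

```kotlin
/** One case per terminal branch of the diagram (hypothetical type). */
sealed interface DefineResult {
    data class Saved(val term: String, val definition: String) : DefineResult
    object NothingHeard : DefineResult                      // 'Try Again' branch
    data class NotFound(val term: String) : DefineResult    // 'Not Found' branch
}

fun defineWord(
    listen: () -> String?,           // stand-in for SpeechRecognizer; null = no result or timeout
    lookup: (String) -> String?,     // stand-in for the dictionary API; null = no definition
    save: (term: String, definition: String) -> Unit,  // stand-in for the Words insert
): DefineResult {
    // Treat blank recognizer output the same as a timeout.
    val term = listen()?.trim().orEmpty()
    if (term.isEmpty()) return DefineResult.NothingHeard

    val definition = lookup(term) ?: return DefineResult.NotFound(term)

    // Only reached when both the word and its definition are available.
    save(term, definition)
    return DefineResult.Saved(term, definition)
}
```

The overlay can then map each result to its message ('Success', 'Try Again', 'Not Found') before dismissing itself.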
### Flow 3: 'Save Quote' Quick-Capture Flow
This flow includes the crucial verification step.
```mermaid
graph TD
A["User taps 'Save Quote' (from Notification OR Lock Screen)"] --> B[Show 'Listening' Overlay];
B --> C(Call Android SpeechRecognizer);
C --> D{Quote Received?};
D -- Yes --> E[1. Show Quote Text];
E --> F[2. Show 'Confirm' & 'Retry' buttons];
E --> G(Call TextToSpeech to read quote back);
F -- Taps 'Confirm' --> H["1. Save to 'Quotes' table (w/ session_id, timestamp)"];
H --> I[2. Show 'Success' checkmark];
I --> J[Dismiss Overlay];
F -- Taps 'Retry' --> B;
D -- No / Timeout --> K[Show 'Try Again' message];
K --> J;
```
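The confirm/retry loop above can be sketched in the same injected style. All names here (`captureQuote`, `QuoteDecision`, `QuoteResult`) are hypothetical, and the `maxAttempts` cap is an assumption added for the example: the diagram itself allows unlimited retries. The sketch also runs the read-back before asking for a decision, whereas the diagram shows both happening side by side.

```kotlin
enum class QuoteDecision { CONFIRM, RETRY }

/** Terminal outcomes of one 'Save Quote' interaction (hypothetical type). */
sealed interface QuoteResult {
    data class Saved(val content: String, val attempts: Int) : QuoteResult
    object NothingHeard : QuoteResult     // 'Try Again' branch
    object TooManyRetries : QuoteResult   // not in the diagram; added by the cap below
}

fun captureQuote(
    listen: () -> String?,                 // stand-in for SpeechRecognizer; null = timeout
    readBack: (String) -> Unit,            // stand-in for TextToSpeech
    decide: (String) -> QuoteDecision,     // stand-in for the Confirm / Retry buttons
    save: (String) -> Unit,                // stand-in for the Quotes insert
    maxAttempts: Int = 3,                  // assumption: the spec sets no retry limit
): QuoteResult {
    repeat(maxAttempts) { attempt ->
        val heard = listen()?.trim().orEmpty()
        if (heard.isEmpty()) return QuoteResult.NothingHeard

        // Verification step: show and speak the text, then wait for the user.
        readBack(heard)
        if (decide(heard) == QuoteDecision.CONFIRM) {
            save(heard)
            return QuoteResult.Saved(heard, attempts = attempt + 1)
        }
        // RETRY falls through to the next listening round.
    }
    return QuoteResult.TooManyRetries
}
```

Nothing is written to storage until the user confirms, which is the point of the verification step.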
## 5. UI Wireframes (Low-Fidelity)
- **Screen 1: Session Tab (No Active Session)**
  - Large "Start New Session" Button
  - List of "Recent Sessions" (book titles)
  - Navigation: [Session] | [Review]
- **Screen 2: Session Tab (Active Session)**
  - Large Text: "Now Reading: [Book Title]"
  - Large Timer: "00:24:15"
  - Large "End Session" Button
  - Navigation: [Session] | [Review]
- **Screen 3: Review Tab**
  - Tabs: [All] | [Words] | [Quotes] | [Favorites]
  - Filter: "Filter by Session/Book..."
  - List of Cards (each card is a word or quote with its session title and timestamp)
- **Overlay 1: Lock Screen / Notification**
  - `MediaStyle` Notification
  - Title: "Vibe Reader: [Book Title]"
  - Text: "Session in progress..."
  - Actions: [Define Word] | [Save Quote]
- **Overlay 2: "Listening..."**
  - Full-screen modal overlay
  - Animated microphone icon
  - Text: "Listening..." or "I heard: [Quote text]. Correct?"

This section provides a low-fidelity text-based mockup of the key app screens.
```
//===========================================//
// Screen 1: Session (No Active Session) //
//===========================================//
// //
// [ Start New Session ] //
// //
// Recent Sessions: //
// -------------------- //
// [ Book Title A > ] //
// [ Book Title B > ] //
// //
// //
// //
// ========================= //
// | [Session] | Review | //
// ========================= //
//===========================================//
//===========================================//
// Screen 2: Session (Active Session) //
//===========================================//
// //
// Now Reading: //
// Book Title A //
// //
// 00:24:15 //
// //
// [ End Session ] //
// //
// //
// ========================= //
// | [Session] | Review | //
// ========================= //
//===========================================//
//===========================================//
// Screen 3: Review Tab //
//===========================================//
// //
// [ All ] [ Words ] [ Quotes ] [ Favs ] //
// //
// [ Filter by Session... v ] //
// ---------------------------------- //
// | "This is a saved quote..." | //
// | - Book Title A (Oct 26) * | //
// ---------------------------------- //
// | Vexillology (n.) | //
// | - The study of flags... | //
// | - Book Title A (Oct 26) * | //
// ---------------------------------- //
// ========================= //
// | Session | [Review] | //
// ========================= //
//===========================================//
//===========================================//
// Overlay 1: Lock Screen / Notification //
//===========================================//
// --------------------------------------- //
// | Vibe Reader: Book Title A | //
// | Session in progress... | //
// | | //
// | [ Define Word ] [ Save Quote ] | //
// --------------------------------------- //
//===========================================//
//===========================================//
// Overlay 2: Listening (Quote Capture) //
//===========================================//
// /---------------------------\ //
// | | //
// | (( (O) )) | //
// | | //
// | "I heard: 'To be or not' | //
// | ...is that correct?" | //
// | | //
// | [ Confirm ] [ Retry ] | //
// \---------------------------/ //
//===========================================//
```
## 6. External Service Dependencies
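- **Persistent Notification:** Android **`MediaSession`** and **`MediaStyle` Notification**. This puts our controls on the lock screen (like a music player) and gives the `ForegroundService` the ongoing notification it must display while it runs.
- **Speech-to-Text:** `android.speech.SpeechRecognizer`, preferring on-device recognition for speed and offline capability. On-device recognition is not guaranteed on every device: with Min SDK 26 we can request it via `RecognizerIntent.EXTRA_PREFER_OFFLINE`, while `SpeechRecognizer.createOnDeviceSpeechRecognizer` only exists from API 31.
- **Dictionary API:** TBD. We need a free/freemium REST API (e.g., Free Dictionary API, Merriam-Webster). The main drawback is reliability, judging from my own experiments with them so far.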
## 7. Open Questions & Risks
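1. **Notification Permission:** Android 13+ (API 33) requires the runtime `POST_NOTIFICATIONS` permission. Requesting it is a critical onboarding step so that the `MediaStyle` notification and its actions are shown. Android's Android 13 notes list media-session notifications as exempt from this change, so whether the prompt is strictly required for our notification still needs checking on a device.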
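2. **Service Stability:** How do we ensure the `ForegroundService` is stable and battery-efficient? Pairing it with a `MediaSession` is the standard pattern for long-running media services, but we still need to verify that the OS treats a session that plays no audio the same way.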
3. **Offline Support:** `SpeechRecognizer` should work on-device. The dictionary API will not work offline, so we need a queue that defers lookups ("define later") until the device is back online.
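The "define later" idea from the offline item could look roughly like the sketch below. It is an in-memory illustration only: `DefineLaterQueue`, `LookupOutcome`, and the injected `isOnline` and `lookup` lambdas are hypothetical names, and a real version would need to persist the queue across process restarts.

```kotlin
/** Result of asking for a definition right now (hypothetical type). */
sealed interface LookupOutcome {
    data class Defined(val definition: String) : LookupOutcome
    object NotFound : LookupOutcome
    object Queued : LookupOutcome   // offline: the term is kept for a later retry
}

class DefineLaterQueue(
    private val isOnline: () -> Boolean,        // stand-in for a connectivity check
    private val lookup: (String) -> String?,    // stand-in for the dictionary API
) {
    private val pending = mutableListOf<String>()

    /** Snapshot of the terms still waiting for a definition. */
    val pendingTerms: List<String> get() = pending.toList()

    fun defineOrQueue(term: String): LookupOutcome {
        if (!isOnline()) {
            if (term !in pending) pending.add(term)   // avoid duplicate entries
            return LookupOutcome.Queued
        }
        val definition = lookup(term) ?: return LookupOutcome.NotFound
        return LookupOutcome.Defined(definition)
    }

    /**
     * Retries every queued term once the device is online and returns the
     * resolved term-to-definition pairs. Terms with no definition are dropped,
     * since retrying them would not help.
     */
    fun flush(): Map<String, String> {
        if (!isOnline()) return emptyMap()
        val resolved = linkedMapOf<String, String>()
        for (term in pending) {
            lookup(term)?.let { resolved[term] = it }
        }
        pending.clear()
        return resolved
    }
}
```

Because `definition` is `NOT NULL` in the `Words` table, queued terms cannot be stored there as-is; they would need their own table or a schema change, which this sketch does not decide.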