**Author:** Generator/Founder (me)

**Status:** In Review

**Related Document:** [[Product Plan]]

**Mantra:** *Keep your vibe, no interruptions.* ~~*Stop 'catching up' on reading. Start capturing what matters.*~~

## 1. Overview

This spec covers the technical implementation for the **Vibe Reader v1 MVP**.

The core challenge is creating a "non-interruptive" capture flow that can be triggered from a persistent notification and the device lock screen.

The solution is to run a **foreground service** that hosts an Android **`MediaSession`**. This presents our app to the OS as a "media player", which grants us lock screen controls.

## 2. Key Components

- **Platform:** Android (Kotlin), Min SDK 26 (Oreo)
- **Database:** RoomDB (local, on-device SQLite)
- **UI:** Jetpack Compose
- **Core Tech:**
  - `MediaSession` & `MediaStyle` Notification
  - Foreground `Service`
  - `android.speech.SpeechRecognizer` (Speech-to-Text)
  - `android.speech.tts.TextToSpeech` (Text-to-Speech)
  - `Room` (Database)
  - `Retrofit` (for Dictionary API)

## 3. Data Model (RoomDB Schema)

```
/* A "Session" is a single reading period, tied to a book. */
CREATE TABLE Sessions (
    session_id INTEGER PRIMARY KEY AUTOINCREMENT,
    book_title TEXT NOT NULL,
    start_time INTEGER NOT NULL, -- Unix Timestamp
    end_time INTEGER,
    total_time INTEGER, -- total reading time, saved when the session ends (Flow 1)
    status TEXT NOT NULL DEFAULT 'active' -- 'active', 'inactive'
);

/* A "Word" is a defined word, linked to a session. */
CREATE TABLE Words (
    word_id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id INTEGER NOT NULL,
    term TEXT NOT NULL,
    definition TEXT NOT NULL,
    timestamp INTEGER NOT NULL,
    is_favorite INTEGER NOT NULL DEFAULT 0, -- 0 = false, 1 = true
    FOREIGN KEY (session_id) REFERENCES Sessions(session_id)
);

/* A "Quote" is a saved quote, linked to a session.
*/
CREATE TABLE Quotes (
    quote_id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id INTEGER NOT NULL,
    content TEXT NOT NULL,
    timestamp INTEGER NOT NULL,
    is_favorite INTEGER NOT NULL DEFAULT 0, -- 0 = false, 1 = true
    FOREIGN KEY (session_id) REFERENCES Sessions(session_id)
);
```

## 4. User Flow & Logic Diagrams

### Flow 1: App Launch & Session Management

This flow outlines how a user starts, ends, or resumes a session. The `MediaSession` is the key technical component that enables lock screen functionality.

```
graph TD
    A[User Launches App] --> B{Active Session in local DB?};
    B -- Yes --> C["Show 'Session' Tab (Active State)"];
    C --> D[Display Session Title & Timer];
    C --> E["Start MediaSession (creates MediaStyle Notification & lock screen controls)"];
    B -- No --> F["Show 'Session' Tab (No Session State)"];
    F --> G[Show 'New Session' Button & 'Recent Sessions' List];
    G -- Taps 'New Session' --> H[Show 'Start Session' Dialog];
    H -- Enters Book Title & Taps 'Start' --> I["1. Create new 'Sessions' row (status='active')"];
    I --> J[2. Set this session_id as 'active' in DB];
    J --> C;
    G -- Taps a 'Recent Session' --> K[1. Get session_id from tap];
    K --> L[2. Set this session_id as 'active' in DB];
    L --> C;
    D -- Taps 'End Session' --> M[1. Update 'Sessions' row: set status 'inactive', save end_time];
    M --> N[2. Calculate & save total_time];
    N --> O["3. End MediaSession (removes notification & lock screen controls)"];
    O --> F;
```

### Flow 2: 'Define Word' Quick-Capture Flow

This is the core value loop, triggered from the notification or the lock screen.

```
graph TD
    A["User taps 'Define Word' (from Notification OR Lock Screen)"] --> B[Show 'Listening' Overlay];
    B --> C(Call Android SpeechRecognizer);
    C --> D{Word Received?};
    D -- Yes --> E[Call Dictionary API w/ word];
    E --> F{Definition Found?};
    F -- Yes --> G["1. Save to 'Words' table (w/ session_id, timestamp)"];
    G --> H[2. Show 'Success' checkmark];
    H --> I[Dismiss Overlay];
    D -- No / Timeout --> J[Show 'Try Again' message];
    J --> I;
    F -- No --> K[Show 'Not Found' message];
    K --> I;
```

### Flow 3: 'Save Quote' Quick-Capture Flow

This flow includes the crucial verification step: the recognized quote is shown and read back before it is saved.

```
graph TD
    A["User taps 'Save Quote' (from Notification OR Lock Screen)"] --> B[Show 'Listening' Overlay];
    B --> C(Call Android SpeechRecognizer);
    C --> D{Quote Received?};
    D -- Yes --> E[1. Show Quote Text];
    E --> F[2. Show 'Confirm' & 'Retry' buttons];
    E --> G(Call TextToSpeech to read quote back);
    F -- Taps 'Confirm' --> H["1. Save to 'Quotes' table (w/ session_id, timestamp)"];
    H --> I[2. Show 'Success' checkmark];
    I --> J[Dismiss Overlay];
    F -- Taps 'Retry' --> B;
    D -- No / Timeout --> K[Show 'Try Again' message];
    K --> J;
```

## 5. UI Wireframes (Low-Fidelity)

- **Screen 1: Session Tab (No Active Session)**
  - Large "Start New Session" Button
  - List of "Recent Sessions" (book titles)
  - Navigation: [Session] | [Review]
- **Screen 2: Session Tab (Active Session)**
  - Large Text: "Now Reading: [Book Title]"
  - Large Timer: "00:24:15"
  - Large "End Session" Button
  - Navigation: [Session] | [Review]
- **Screen 3: Review Tab**
  - Tabs: [All] | [Words] | [Quotes] | [Favorites]
  - Filter: "Filter by Session/Book..."
  - List of Cards (each card is a word or quote with its session title and timestamp)
- **Overlay 1: Lock Screen / Notification**
  - `MediaStyle` Notification
  - Title: "Vibe Reader: [Book Title]"
  - Text: "Session in progress..."
  - Actions: [Define Word] | [Save Quote]
- **Overlay 2: "Listening..."**
  - Full-screen modal overlay
  - Animated microphone icon
  - Text: "Listening..." or "I heard: [Quote text]. Correct?"

This section provides a low-fidelity, text-based mockup of the key app screens.

```
//===========================================//
//   Screen 1: Session (No Active Session)   //
//===========================================//
//                                           //
//           [ Start New Session ]           //
//                                           //
//  Recent Sessions:                         //
//  --------------------                     //
//  [ Book Title A  > ]                      //
//  [ Book Title B  > ]                      //
//                                           //
//                                           //
//                                           //
//         =========================         //
//         | [Session] |  Review   |         //
//         =========================         //
//===========================================//
//===========================================//
//    Screen 2: Session (Active Session)     //
//===========================================//
//                                           //
//               Now Reading:                //
//               Book Title A                //
//                                           //
//                 00:24:15                  //
//                                           //
//              [ End Session ]              //
//                                           //
//                                           //
//         =========================         //
//         | [Session] |  Review   |         //
//         =========================         //
//===========================================//
//===========================================//
//           Screen 3: Review Tab            //
//===========================================//
//                                           //
//   [ All ] [ Words ] [ Quotes ] [ Favs ]   //
//                                           //
//   [ Filter by Session... v ]              //
//   -------------------------------------   //
//   | "This is a saved quote..."        |   //
//   | - Book Title A (Oct 26)         * |   //
//   -------------------------------------   //
//   | Vexillology (n.)                  |   //
//   | - The study of flags...           |   //
//   | - Book Title A (Oct 26)         * |   //
//   -------------------------------------   //
//         =========================         //
//         |  Session  | [Review]  |         //
//         =========================         //
//===========================================//
//===========================================//
//   Overlay 1: Lock Screen / Notification   //
//===========================================//
//  ---------------------------------------  //
//  | Vibe Reader: Book Title A           |  //
//  | Session in progress...              |  //
//  |                                     |  //
//  | [ Define Word ]  [ Save Quote ]     |  //
//  ---------------------------------------  //
//===========================================//
//===========================================//
//   Overlay 2: Listening (Quote Capture)    //
//===========================================//
//       /---------------------------\       //
//       |                           |       //
//       |         (( (O) ))         |       //
//       |                           |       //
//       | "I heard: 'To be or not'  |       //
//       |  ...is that correct?"     |       //
//       |                           |       //
//       |   [ Confirm ] [ Retry ]   |       //
//       \---------------------------/       //
//===========================================//
```

## 6. External Service Dependencies

- **Persistent Notification:** Android **`MediaSession`** and **`MediaStyle` Notification**. This allows controls to appear on the lock screen (like a music player) and ensures the foreground service is handled correctly.
- **Speech-to-Text:** On-device `android.speech.SpeechRecognizer` for speed and offline capability.
- **Dictionary API:** TBD. We need a free/freemium REST API (e.g., Free Dictionary API, Merriam-Webster). Having already experimented with these, the main drawback is reliability.

## 7. Open Questions & Risks

1. **Notification Permission:** Android 13+ requires the runtime notification permission (`POST_NOTIFICATIONS`). This is a critical onboarding step, required for the `MediaSession` to work.
2. **Service Stability:** How do we ensure the foreground service is stable and battery-efficient? Using `MediaSession` is the standard, approved way to manage this.
3. **Offline Support:** `SpeechRecognizer` should work on-device. The Dictionary API will not. We need a queueing system to "define later" if offline.
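
As one way to picture the "define later" queue raised under Offline Support, here is a minimal Kotlin sketch. It is an illustration under assumptions, not a committed design: `PendingWord`, `DefinedWord`, `DefineLaterQueue`, and the `lookup` function are hypothetical names, and `lookup` stands in for whatever Dictionary API client we end up choosing. The queue here lives in memory only; a real version would presumably persist pending words in Room so they survive process death.

```kotlin
// Hypothetical sketch: none of these names come from the spec.

/** A spoken word captured while offline, still waiting for a definition. */
data class PendingWord(val sessionId: Long, val term: String, val timestamp: Long)

/** A word with its definition, ready to be written to the Words table. */
data class DefinedWord(
    val sessionId: Long,
    val term: String,
    val definition: String,
    val timestamp: Long
)

class DefineLaterQueue(
    // Stand-in for the Dictionary API call (assumption).
    // Returns null when the lookup cannot complete, e.g. while offline.
    private val lookup: (String) -> String?
) {
    private val pending = ArrayDeque<PendingWord>()

    /** Number of words still waiting for a definition. */
    val size: Int get() = pending.size

    /** Queues a captured word for a later lookup. */
    fun enqueue(word: PendingWord) {
        pending.addLast(word)
    }

    /**
     * Tries each queued word exactly once, in capture order.
     * Words that get a definition are returned; the rest stay queued.
     */
    fun flush(): List<DefinedWord> {
        val defined = mutableListOf<DefinedWord>()
        repeat(pending.size) {
            val word = pending.removeFirst()
            val definition = lookup(word.term)
            if (definition != null) {
                defined += DefinedWord(word.sessionId, word.term, definition, word.timestamp)
            } else {
                pending.addLast(word)
            }
        }
        return defined
    }
}
```

The intended usage would be to call `enqueue` from Flow 2 when the Dictionary API call fails for connectivity reasons, and to call `flush` when connectivity returns, saving each returned `DefinedWord` with its original `session_id` and `timestamp`. How to tell "offline" apart from "word not found" is left open here, since a word the API never defines would otherwise stay queued indefinitely.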