From 4dca44dbbe93c04dc1643100279b9719ad966d26 Mon Sep 17 00:00:00 2001 From: goose Date: Sat, 14 Feb 2026 13:39:57 -0300 Subject: [PATCH] Research: MongoDB schema design complete - Zero-knowledge encryption for ALL sensitive data + metadata - Blood pressure example: value + type + unit ALL encrypted - 9 collections: users, families, profiles, health_data, lab_results, medications, appointments, shares, refresh_tokens - Client-side encryption (AES-256-GCM, PBKDF2) - Server NEVER decrypts data - Privacy-preserving queries (plaintext fields: userId, profileId, familyId, date, tags) - Tagging system for encrypted data search - Date range queries (plaintext dates) Key principle: - Both value AND metadata encrypted (e.g., "blood_pressure" + "120/80") - No plaintext metadata leaks - Server stores ONLY encrypted data Updated tech stack decisions with MongoDB schema All major research complete (Rust, Mobile, Web, State, Auth, Database) Next: Backend development (Axum + MongoDB) --- .../2026-02-14-mongodb-schema-decision.md | 183 +++ ...26-02-14-mongodb-schema-design-research.md | 1089 +++++++++++++++++ .../2026-02-14-tech-stack-decision.md | 191 ++- 3 files changed, 1427 insertions(+), 36 deletions(-) create mode 100644 thoughts/research/2026-02-14-mongodb-schema-decision.md create mode 100644 thoughts/research/2026-02-14-mongodb-schema-design-research.md diff --git a/thoughts/research/2026-02-14-mongodb-schema-decision.md b/thoughts/research/2026-02-14-mongodb-schema-decision.md new file mode 100644 index 0000000..3e8900d --- /dev/null +++ b/thoughts/research/2026-02-14-mongodb-schema-decision.md @@ -0,0 +1,183 @@ +# MongoDB Schema Design Decision Summary + +**Date**: 2026-02-14 +**Decision**: **Zero-Knowledge Encryption for All Sensitive Data + Metadata** + +--- + +## Core Principle + +**ALL sensitive data AND metadata must be encrypted client-side before reaching MongoDB.** + +### Example: Blood Pressure Reading + +**Before encryption** (client-side): +```javascript +{ + 
value: "120/80", + type: "blood_pressure", + unit: "mmHg", + date: "2026-02-14T10:30:00Z" +} +``` + +**After encryption** (stored in MongoDB): +```javascript +{ + healthDataId: "health-123", + userId: "user-456", + profileId: "profile-789", + familyId: "family-012", + + // Encrypted (value + metadata) + healthData: [ + { + encrypted: true, + data: "a1b2c3d4...", + iv: "e5f6g7h8...", + authTag: "i9j0k1l2..." + } + ], + + // Metadata (plaintext) + createdAt: ISODate("2026-02-14T10:30:00Z"), + updatedAt: ISODate("2026-02-14T10:30:00Z"), + dataSource: "healthKit" +} +``` + +--- + +## Collections Summary + +| Collection | Purpose | Encrypted Fields | Plaintext Fields | +|-----------|---------|------------------|-----------------| +| **users** | Authentication | encryptedRecoveryPhrase | userId, email, passwordHash, tokenVersion, familyId, familyRole, permissions | +| **families** | Family structure | familyName, familyMetadata | familyId, members[*].userId, members[*].profileId, members[*].role | +| **profiles** | Person profiles | profileName, profileMetadata | profileId, userId, familyId, profileType | +| **health_data** | Health records | healthData[*] (value + metadata) | healthDataId, userId, profileId, familyId, createdAt, updatedAt, dataSource | +| **lab_results** | Lab tests | labData (value + metadata), labMetadata | labResultId, userId, profileId, familyId, createdAt, updatedAt, dataSource | +| **medications** | Medication tracking | medicationData (value + metadata), reminderSchedule | medicationId, userId, profileId, familyId, active, createdAt, updatedAt | +| **appointments** | Medical appointments | appointmentData (value + metadata), reminderSettings | appointmentId, userId, profileId, familyId, createdAt, updatedAt | +| **shares** | Shared data | encryptedData (share-specific password) | shareId, userId, documentId, collectionName, createdAt, expiresAt, accessCount, isRevoked | +| **refresh_tokens** | JWT tokens | None | jti, userId, createdAt, 
expiresAt, revoked | + +--- + +## Encryption Strategy + +### Client-Side Encryption + +**Encryption Flow**: +1. User enters health data +2. Client derives encryption key from password (PBKDF2) +3. Client encrypts health data (AES-256-GCM) +4. Client sends encrypted data to server +5. Server stores encrypted data in MongoDB +6. Server NEVER decrypts data + +### What Must Be Encrypted +- ✅ **Health data values** (e.g., "120/80") +- ✅ **Health data metadata** (e.g., "blood_pressure", "mmHg") +- ✅ **Lab test results** (e.g., "cholesterol", "200", "LabCorp") +- ✅ **Medication data** (e.g., "Aspirin", "100mg", "daily") +- ✅ **Appointment data** (e.g., "checkup", "Dr. Smith") +- ✅ **Profile data** (e.g., "John Doe", "1990-01-01") +- ✅ **Family data** (e.g., "Smith Family", "123 Main St") +### What Can Be Plaintext +- ✅ **User IDs** (userId, profileId, familyId) - for queries +- ✅ **Email addresses** - for authentication +- ✅ **Dates** (createdAt, updatedAt) - for sorting +- ✅ **Data sources** (healthKit, googleFit) - for analytics +- ✅ **Tags** (cardio, daily) - for client-side search + +--- + +## Privacy-Preserving Queries + +### 1. Plaintext Queries (Recommended) + +**Query by plaintext fields only**: +```javascript +const healthData = await db.health_data.find({ + userId: 'user-123', // Plaintext ✅ + profileId: 'profile-456', // Plaintext ✅ + familyId: 'family-789' // Plaintext ✅ +}).toArray(); + +// Client decrypts healthData[i].healthData[j] +``` + +### 2. Tagging System (Encrypted Search) + +**Client adds searchable tags to encrypted data**: +```javascript +const healthData = await db.health_data.find({ + userId: 'user-123', + tags: { $in: ['cardio', 'daily'] } // Plaintext tags ✅ +}).toArray(); + +// Client decrypts healthData[i].healthData[j] +``` + +### 3. 
Date Range Queries (Plaintext Dates) + +**Store dates as plaintext** (for range queries): +```javascript +const healthData = await db.health_data.find({ + userId: 'user-123', + date: { + $gte: ISODate("2026-02-01"), + $lte: ISODate("2026-02-28") + } +}).toArray(); + +// Client decrypts healthData[i].healthData[j] +``` + +--- + +## Technology Stack + +### Backend (Axum + MongoDB) +- **Axum 0.7.x**: Web framework +- **MongoDB 6.0+**: Database +- **Rust**: Server language +### Client (React Native + React) +- **AES-256-GCM**: Encryption algorithm +- **PBKDF2**: Key derivation function +- **Crypto API**: Node.js crypto / react-native-quick-crypto + +--- + +## Implementation Timeline + +- **Week 1**: Create MongoDB indexes +- **Week 1-2**: Implement client-side encryption (React Native + React) +- **Week 2-3**: Implement server-side API (Axum + MongoDB) +- **Week 3**: Test encryption flow +- **Week 3-4**: Test data migration (key rotation) +- **Week 4**: Test privacy-preserving queries +- **Week 4-5**: Performance testing +**Total**: 4-5 weeks + +--- + +## Next Steps + +1. Create MongoDB indexes for all collections +2. Implement client-side encryption (React Native + React) +3. Implement server-side API (Axum + MongoDB) +4. Test encryption flow (end-to-end) +5. Test data migration (key rotation) +6. Test privacy-preserving queries +7. Performance testing +8. 
Create API documentation
+---
+
+## References
+
+- [Comprehensive MongoDB Schema Research](./2026-02-14-mongodb-schema-design-research.md)
+- [Normogen Encryption Guide](../encryption.md)
+- [JWT Authentication Research](./2026-02-14-jwt-authentication-research.md)
+- [Technology Stack Decisions](./2026-02-14-tech-stack-decision.md)
diff --git a/thoughts/research/2026-02-14-mongodb-schema-design-research.md b/thoughts/research/2026-02-14-mongodb-schema-design-research.md
new file mode 100644
index 0000000..e548009
--- /dev/null
+++ b/thoughts/research/2026-02-14-mongodb-schema-design-research.md
@@ -0,0 +1,1089 @@
+# MongoDB Schema Design for Normogen
+
+**Date**: 2026-02-14
+**Focus**: Zero-knowledge encryption for all sensitive data AND metadata
+**Database**: MongoDB 6.0+
+
+---
+
+## Table of Contents
+1. [Zero-Knowledge Encryption Requirements](#zero-knowledge-encryption-requirements)
+2. [Database Architecture Overview](#database-architecture-overview)
+3. [Collection Schemas](#collection-schemas)
+4. [Encryption Strategy](#encryption-strategy)
+5. [Indexing Strategy](#indexing-strategy)
+6. [Privacy-Preserving Queries](#privacy-preserving-queries)
+7. [Data Migration](#data-migration)
+8. [Performance Considerations](#performance-considerations)
+---
+
+## Zero-Knowledge Encryption Requirements
+
+### Core Principle
+**ALL sensitive data AND metadata must be encrypted client-side before reaching MongoDB.**
+
+### What Must Be Encrypted
+
+#### Health Data (Value + Metadata)
+```javascript
+// Blood pressure reading - BOTH value AND metadata encrypted
+{
+  value: "120/80",            // Encrypted ❌
+  metadata: {
+    type: "blood_pressure",   // Encrypted ❌
+    unit: "mmHg"              // Encrypted ❌
+  }
+}
+
+// After encryption (stored in MongoDB)
+{
+  value: {
+    encrypted: true,
+    data: "a1b2c3d4...",
+    iv: "e5f6g7h8...",
+    authTag: "i9j0k1l2..."
+  },
+  metadata: {
+    encrypted: true,
+    data: "m3n4o5p6...",
+    iv: "q7r8s9t0...",
+    authTag: "u1v2w3x4..."
+ } +} +``` + +#### What Can Be Plaintext +```javascript +// ONLY non-sensitive, non-identifying fields +{ + userId: "user-123", // Plaintext (for queries) ✅ + familyId: "family-456", // Plaintext (for family queries) ✅ + profileId: "profile-789", // Plaintext (for profile queries) ✅ + createdAt: ISODate("2026-02-14"), // Plaintext (for sorting) ✅ + updatedAt: ISODate("2026-02-14"), // Plaintext (for sorting) ✅ + + // ALL health data encrypted ❌ + healthData: [ + { + encrypted: true, + data: "...", + iv: "...", + authTag: "..." + } + ] +} +``` + +#### Why Metadata Must Be Encrypted + +**Problem**: If metadata is plaintext, attackers can infer sensitive information. + +**Example**: +```javascript +// BAD: Metadata plaintext (leaks information) +{ + userId: "user-123", + healthData: [ + { + type: "hiv_test", // Reveals HIV status + result: "positive", // Reveals HIV status + date: "2026-02-14", // Reveals when tested + doctor: "Dr. Smith", // Reveals healthcare provider + } + ] +} + +// GOOD: Metadata encrypted (privacy-preserving) +{ + userId: "user-123", + healthData: [ + { + encrypted: true, + data: "a1b2c3d4...", // Encrypted: type + result + date + doctor + iv: "e5f6g7h8...", + authTag: "i9j0k1l2..." 
+ } + ] +} +``` + +--- + +## Database Architecture Overview + +### Database Structure +``` +normogen (database) +├── users (collection) +├── families (collection) +├── profiles (collection) +├── health_data (collection) +├── lab_results (collection) +├── medications (collection) +├── appointments (collection) +├── shares (collection) +└── refresh_tokens (collection) +``` + +### Data Flow +``` +Client (React Native / React) +├── User enters data +├── Client encrypts data (AES-256-GCM, PBKDF2) +├── Client sends encrypted data to server +│ +Server (Axum / Rust) +├── Server receives encrypted data +├── Server NEVER decrypts data +├── Server stores encrypted data in MongoDB +│ +MongoDB +├── Stores ONLY encrypted data +├── No plaintext sensitive data +└── Zero-knowledge architecture maintained +``` + +--- + +## Collection Schemas + +### 1. Users Collection + +**Purpose**: User authentication and account data + +**Schema**: +```javascript +{ + _id: ObjectId("..."), + + // Plaintext fields (for authentication) + userId: { type: String, unique: true, required: true }, + email: { type: String, index: true, required: true }, // Plaintext for login + passwordHash: { type: String, required: true }, // Plaintext (bcrypt hash) + tokenVersion: { type: Number, default: 1 }, // Plaintext (for JWT revocation) + + // Encrypted fields (zero-knowledge) + encryptedRecoveryPhrase: { + encrypted: true, + data: String, // Encrypted recovery phrase + iv: String, + authTag: String + }, + + // Family relationships + familyId: { type: String, index: true }, // Plaintext (for family queries) + familyRole: { type: String }, // Plaintext (parent, child, elderly) + permissions: [String], // Plaintext (for JWT permissions) + + // Metadata (plaintext) + createdAt: { type: Date, default: Date.now }, + updatedAt: { type: Date, default: Date.now }, + lastLoginAt: Date +} +``` + +**Encryption Notes**: +- ✅ **Plaintext**: `userId`, `email`, `passwordHash`, `tokenVersion`, `familyId`, `familyRole`, 
`permissions` +- ❌ **Encrypted**: `encryptedRecoveryPhrase` + +**Indexes**: +```javascript +// Indexes for performance +db.users.createIndex({ userId: 1 }, { unique: true }); +db.users.createIndex({ email: 1 }, { unique: true }); +db.users.createIndex({ familyId: 1 }); +db.users.createIndex({ createdAt: -1 }); // For sorting +``` + +--- + +### 2. Families Collection + +**Purpose**: Family structure and relationships + +**Schema**: +```javascript +{ + _id: ObjectId("..."), + + // Plaintext fields (for queries) + familyId: { type: String, unique: true, required: true }, + createdAt: { type: Date, default: Date.now }, + updatedAt: { type: Date, default: Date.now }, + + // Encrypted family name (privacy-preserving) + familyName: { + encrypted: true, + data: String, // Encrypted family name + iv: String, + authTag: String + }, + + // Encrypted family metadata + familyMetadata: { + encrypted: true, + data: String, // Encrypted metadata (address, phone, etc.) + iv: String, + authTag: String + }, + + // Plaintext family structure (for queries) + members: [ + { + userId: String, // Plaintext (for queries) + profileId: String, // Plaintext (for queries) + role: String, // Plaintext (parent, child, elderly) + permissions: [String] // Plaintext (for JWT) + } + ] +} +``` + +**Encryption Notes**: +- ✅ **Plaintext**: `familyId`, `members[*].userId`, `members[*].profileId`, `members[*].role`, `members[*].permissions` +- ❌ **Encrypted**: `familyName`, `familyMetadata` + +--- + +### 3. 
Profiles Collection + +**Purpose**: Person profiles (users can have multiple profiles) + +**Schema**: +```javascript +{ + _id: ObjectId("..."), + + // Plaintext fields (for queries) + profileId: { type: String, unique: true, required: true }, + userId: { type: String, index: true, required: true }, // Owner + familyId: { type: String, index: true }, + profileType: { type: String }, // self, child, elderly, pet + + // Encrypted profile data (privacy-preserving) + profileName: { + encrypted: true, + data: String, // Encrypted name (e.g., "John Doe") + iv: String, + authTag: String + }, + + // Encrypted profile metadata + profileMetadata: { + encrypted: true, + data: String, // Encrypted metadata (birth date, gender, etc.) + iv: String, + authTag: String + }, + + // Metadata (plaintext) + createdAt: { type: Date, default: Date.now }, + updatedAt: { type: Date, default: Date.now } +} +``` + +**Encryption Notes**: +- ✅ **Plaintext**: `profileId`, `userId`, `familyId`, `profileType` +- ❌ **Encrypted**: `profileName`, `profileMetadata` + +--- + +### 4. Health Data Collection + +**Purpose**: Health records (weight, height, blood pressure, etc.) 
+ +**Schema**: +```javascript +{ + _id: ObjectId("..."), + + // Plaintext fields (for queries) + healthDataId: { type: String, unique: true, required: true }, + userId: { type: String, index: true, required: true }, // Owner + profileId: { type: String, index: true, required: true }, // Subject + familyId: { type: String, index: true }, + + // Encrypted health data (value + metadata) + healthData: [ + { + // Encrypted value + metadata + encrypted: true, + data: String, // Encrypted: { value: 10, type: "blood_pressure", unit: "mmHg", date: "2026-02-14" } + iv: String, + authTag: String + } + ], + + // Metadata (plaintext) + createdAt: { type: Date, default: Date.now }, + updatedAt: { type: Date, default: Date.now }, + dataSource: String // Plaintext (e.g., "manual", "healthKit", "googleFit") +} +``` + +**Example: Blood Pressure Reading**: +```javascript +// Client-side data structure +const healthData = { + value: "120/80", + type: "blood_pressure", + unit: "mmHg", + date: "2026-02-14T10:30:00Z", + notes: "After morning coffee" +}; + +// Client encrypts healthData +const encryptedHealthData = encrypt(healthData, userKey); + +// Stored in MongoDB +{ + _id: ObjectId("..."), + healthDataId: "health-123", + userId: "user-456", + profileId: "profile-789", + familyId: "family-012", + + // Encrypted (value + metadata) + healthData: [ + { + encrypted: true, + data: "a1b2c3d4...", // Contains: value, type, unit, date, notes + iv: "e5f6g7h8...", + authTag: "i9j0k1l2..." + } + ], + + // Metadata (plaintext) + createdAt: ISODate("2026-02-14T10:30:00Z"), + updatedAt: ISODate("2026-02-14T10:30:00Z"), + dataSource: "healthKit" +} +``` + +**Encryption Notes**: +- ✅ **Plaintext**: `healthDataId`, `userId`, `profileId`, `familyId`, `createdAt`, `updatedAt`, `dataSource` +- ❌ **Encrypted**: `healthData[*]` (value + metadata) + +--- + +### 5. 
Lab Results Collection + +**Purpose**: Lab test results (imported via QR code) + +**Schema**: +```javascript +{ + _id: ObjectId("..."), + + // Plaintext fields (for queries) + labResultId: { type: String, unique: true, required: true }, + userId: { type: String, index: true, required: true }, // Owner + profileId: { type: String, index: true, required: true }, // Subject + familyId: { type: String, index: true }, + + // Encrypted lab data (value + metadata) + labData: { + encrypted: true, + data: String, // Encrypted: { testType: "blood_test", results: [...], date: "...", lab: "..." } + iv: String, + authTag: String + }, + + // Encrypted lab metadata + labMetadata: { + encrypted: true, + data: String, // Encrypted: { labName: "LabCorp", doctor: "Dr. Smith", address: "..." } + iv: String, + authTag: String + }, + + // Metadata (plaintext) + createdAt: { type: Date, default: Date.now }, + updatedAt: { type: Date, default: Date.now }, + dataSource: String // Plaintext (e.g., "qr_code", "manual_entry") +} +``` + +**Example: Blood Test Results**: +```javascript +// Client-side data structure +const labData = { + testType: "blood_panel", + results: [ + { test: "cholesterol", value: 200, unit: "mg/dL", normalRange: "125-200" }, + { test: "glucose", value: 95, unit: "mg/dL", normalRange: "70-100" } + ], + date: "2026-02-14T08:00:00Z", + lab: "LabCorp", + doctor: "Dr. Smith", + notes: "Fasting for 12 hours" +}; + +// Client encrypts labData + labMetadata +const encryptedLabData = encrypt(labData, userKey); +const encryptedLabMetadata = encrypt({ lab: "LabCorp", doctor: "Dr. Smith" }, userKey); + +// Stored in MongoDB +{ + _id: ObjectId("..."), + labResultId: "lab-123", + userId: "user-456", + profileId: "profile-789", + + // Encrypted lab data (value + metadata) + labData: { + encrypted: true, + data: "m3n4o5p6...", + iv: "q7r8s9t0...", + authTag: "u1v2w3x4..." 
+ }, + + // Encrypted lab metadata + labMetadata: { + encrypted: true, + data: "y5z6a7b8...", + iv: "c9d0e1f2...", + authTag: "g3h4i5j6..." + }, + + // Metadata (plaintext) + createdAt: ISODate("2026-02-14T08:00:00Z"), + updatedAt: ISODate("2026-02-14T08:00:00Z"), + dataSource: "qr_code" +} +``` + +**Encryption Notes**: +- ✅ **Plaintext**: `labResultId`, `userId`, `profileId`, `familyId`, `createdAt`, `updatedAt`, `dataSource` +- ❌ **Encrypted**: `labData` (value + metadata), `labMetadata` + +--- + +### 6. Medications Collection + +**Purpose**: Medication tracking and reminders + +**Schema**: +```javascript +{ + _id: ObjectId("..."), + + // Plaintext fields (for queries) + medicationId: { type: String, unique: true, required: true }, + userId: { type: String, index: true, required: true }, // Owner + profileId: { type: String, index: true, required: true }, // Subject + familyId: { type: String, index: true }, + + // Encrypted medication data (value + metadata) + medicationData: { + encrypted: true, + data: String, // Encrypted: { name: "Aspirin", dosage: "100mg", frequency: "daily", shape: "round" } + iv: String, + authTag: String + }, + + // Encrypted reminder schedule + reminderSchedule: { + encrypted: true, + data: String, // Encrypted: { times: ["08:00", "20:00"], days: ["mon", "tue", "wed", "thu", "fri"] } + iv: String, + authTag: String + }, + + // Metadata (plaintext) + active: { type: Boolean, default: true }, + createdAt: { type: Date, default: Date.now }, + updatedAt: { type: Date, default: Date.now } +} +``` + +**Example: Medication**: +```javascript +// Client-side data structure +const medicationData = { + name: "Aspirin", + dosage: "100mg", + frequency: "daily", + shape: "round", + color: "white", + instructions: "Take with water after meals" +}; + +const reminderSchedule = { + times: ["08:00", "20:00"], + days: ["mon", "tue", "wed", "thu", "fri", "sat", "sun"], + notifications: true +}; + +// Client encrypts medicationData + reminderSchedule +const 
encryptedMedicationData = encrypt(medicationData, userKey); +const encryptedReminderSchedule = encrypt(reminderSchedule, userKey); + +// Stored in MongoDB +{ + _id: ObjectId("..."), + medicationId: "med-123", + userId: "user-456", + profileId: "profile-789", + + // Encrypted medication data (value + metadata) + medicationData: { + encrypted: true, + data: "k9l0m1n2...", + iv: "o3p4q5r6...", + authTag: "s7t8u9v0..." + }, + + // Encrypted reminder schedule + reminderSchedule: { + encrypted: true, + data: "w1x2y3z4...", + iv: "a5b6c7d8...", + authTag: "e9f0g1h2..." + }, + + // Metadata (plaintext) + active: true, + createdAt: ISODate("2026-02-14T10:00:00Z"), + updatedAt: ISODate("2026-02-14T10:00:00Z") +} +``` + +**Encryption Notes**: +- ✅ **Plaintext**: `medicationId`, `userId`, `profileId`, `familyId`, `active`, `createdAt`, `updatedAt` +- ❌ **Encrypted**: `medicationData` (value + metadata), `reminderSchedule` + +--- + +### 7. Appointments Collection + +**Purpose**: Medical appointments and checkups + +**Schema**: +```javascript +{ + _id: ObjectId("..."), + + // Plaintext fields (for queries) + appointmentId: { type: String, unique: true, required: true }, + userId: { type: String, index: true, required: true }, // Owner + profileId: { type: String, index: true, required: true }, // Subject + familyId: { type: String, index: true }, + + // Encrypted appointment data (value + metadata) + appointmentData: { + encrypted: true, + data: String, // Encrypted: { type: "checkup", doctor: "Dr. Smith", date: "...", notes: "..." 
} + iv: String, + authTag: String + }, + + // Encrypted reminder settings + reminderSettings: { + encrypted: true, + data: String, // Encrypted: { reminder: 24, unit: "hours", method: "push_notification" } + iv: String, + authTag: String + }, + + // Metadata (plaintext) + createdAt: { type: Date, default: Date.now }, + updatedAt: { type: Date, default: Date.now } +} +``` + +**Encryption Notes**: +- ✅ **Plaintext**: `appointmentId`, `userId`, `profileId`, `familyId`, `createdAt`, `updatedAt` +- ❌ **Encrypted**: `appointmentData` (value + metadata), `reminderSettings` + +--- + +### 8. Shares Collection + +**Purpose**: Time-limited access to shared data (from encryption.md) + +**Schema**: +```javascript +{ + _id: ObjectId("..."), + + // Plaintext fields (for queries) + shareId: { type: String, unique: true, required: true }, + userId: { type: String, index: true, required: true }, // Owner + + // References to original data + documentId: { type: String, required: true }, + collectionName: { type: String, required: true }, // health_data, lab_results, etc. + + // Encrypted shared data (encrypted with share-specific password) + encryptedData: { + encrypted: true, + data: String, // Encrypted with share-specific password + iv: String, + authTag: String + }, + + // Metadata (plaintext) + createdAt: { type: Date, default: Date.now }, + expiresAt: { type: Date, index: true }, + accessCount: { type: Number, default: 0 }, + maxAccessCount: { type: Number }, + + // Optional: Additional security + allowedEmails: [String], + isRevoked: { type: Boolean, default: false }, + revokedAt: Date +} +``` + +**Encryption Notes**: +- ✅ **Plaintext**: `shareId`, `userId`, `documentId`, `collectionName`, `createdAt`, `expiresAt`, `accessCount`, `maxAccessCount`, `allowedEmails`, `isRevoked`, `revokedAt` +- ❌ **Encrypted**: `encryptedData` (encrypted with share-specific password) + +--- + +### 9. 
Refresh Tokens Collection + +**Purpose**: JWT refresh token storage (from JWT authentication) + +**Schema**: +```javascript +{ + _id: ObjectId("..."), + + // Plaintext fields (for queries) + jti: { type: String, unique: true, required: true }, // JWT ID + userId: { type: String, index: true, required: true }, + + // Metadata (plaintext) + createdAt: { type: Date, default: Date.now }, + expiresAt: { type: Date, index: true, required: true }, + revoked: { type: Boolean, default: false }, + revokedAt: Date +} +``` + +**Encryption Notes**: +- ✅ **Plaintext**: `jti`, `userId`, `createdAt`, `expiresAt`, `revoked`, `revokedAt` +- ❌ **Encrypted**: None (refresh tokens are not sensitive data) + +--- + +## Encryption Strategy + +### Client-Side Encryption (Before Sending to Server) + +**Encryption Flow**: +```javascript +// 1. User enters health data +const healthData = { + value: "120/80", + type: "blood_pressure", + unit: "mmHg", + date: "2026-02-14T10:30:00Z" +}; + +// 2. Client derives encryption key from password +const userKey = await deriveKeyFromPassword(userPassword); + +// PBKDF2: 100,000 iterations, SHA-256, 32-byte key + +// 3. Client encrypts health data +const encryptedHealthData = await encryptData(healthData, userKey); +// AES-256-GCM: 16-byte IV, auth tag for integrity + +// 4. Client sends encrypted data to server +await fetch('/api/health-data', { + method: 'POST', + body: JSON.stringify({ + userId: 'user-123', + profileId: 'profile-456', + familyId: 'family-789', + healthData: [encryptedHealthData] // Encrypted (value + metadata) + }) +}); + +// 5. 
Server stores encrypted data in MongoDB
+// Server NEVER decrypts data
+```
+
+### Encryption Implementation (Client-Side)
+
+**React Native / React**:
+```typescript
+import * as crypto from 'crypto';
+
+// Encrypted field structure
+interface EncryptedField {
+  encrypted: true;
+  data: string;     // Encrypted data (hex)
+  iv: string;       // Initialization vector (hex)
+  authTag: string;  // Authentication tag (AES-256-GCM, hex)
+}
+
+// Encrypt data: serializes to JSON, then encrypts with AES-256-GCM
+async function encryptData(data: any, key: Buffer): Promise<EncryptedField> {
+  const iv = crypto.randomBytes(16); // fresh random IV for every encryption
+  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
+
+  let encrypted = cipher.update(JSON.stringify(data), 'utf8', 'hex');
+  encrypted += cipher.final('hex');
+  const authTag = cipher.getAuthTag();
+
+  return {
+    encrypted: true,
+    data: encrypted,
+    iv: iv.toString('hex'),
+    authTag: authTag.toString('hex')
+  };
+}
+
+// Decrypt data: final() throws if the auth tag does not match
+async function decryptData(encryptedField: EncryptedField, key: Buffer): Promise<any> {
+  const decipher = crypto.createDecipheriv(
+    'aes-256-gcm',
+    key,
+    Buffer.from(encryptedField.iv, 'hex')
+  );
+  decipher.setAuthTag(Buffer.from(encryptedField.authTag, 'hex'));
+
+  let decrypted = decipher.update(encryptedField.data, 'hex', 'utf8');
+  decrypted += decipher.final('utf8');
+
+  return JSON.parse(decrypted);
+}
+
+// Derive key from password (PBKDF2, 100,000 iterations, SHA-256, 32-byte key).
+// NOTE: the salt is not secret, but it must be persisted and passed in again
+// on every later derivation. The random default is only suitable for a
+// throwaway key: a new salt produces a different key, which cannot decrypt
+// data that was encrypted earlier.
+async function deriveKeyFromPassword(
+  password: string,
+  salt: Buffer = crypto.randomBytes(32)
+): Promise<Buffer> {
+  return new Promise((resolve, reject) => {
+    crypto.pbkdf2(password, salt, 100000, 32, 'sha256', (err, derivedKey) => {
+      if (err) reject(err);
+      else resolve(derivedKey);
+    });
+  });
+}
+```
+
+---
+
+## Indexing Strategy
+
+### Principle
+**Index ONLY plaintext fields** (for performance and privacy).
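As a rough illustration of this principle, the sketch below checks a proposed index key specification against an allow-list of plaintext fields before it would be handed to `createIndex`. The helper name `assertPlaintextIndex` and the `PLAINTEXT_FIELDS` map are hypothetical and not part of the schema research; the field lists are copied from the Encryption Notes of two collections only, so treat it as a starting point under those assumptions rather than a finished guard.

```javascript
// Hypothetical allow-list of plaintext (indexable) fields per collection.
// Field names follow the "Encryption Notes" of the collection schemas.
const PLAINTEXT_FIELDS = {
  health_data: ['healthDataId', 'userId', 'profileId', 'familyId',
                'createdAt', 'updatedAt', 'dataSource'],
  refresh_tokens: ['jti', 'userId', 'createdAt', 'expiresAt',
                   'revoked', 'revokedAt'],
};

// Throws if any key of the index spec is not a known plaintext field,
// for example a path into an encrypted blob such as "healthData.data".
function assertPlaintextIndex(collection, keys) {
  const allowed = PLAINTEXT_FIELDS[collection];
  if (!allowed) {
    throw new Error(`Unknown collection: ${collection}`);
  }
  for (const field of Object.keys(keys)) {
    if (!allowed.includes(field)) {
      throw new Error(
        `Refusing to index non-plaintext field "${field}" on ${collection}`
      );
    }
  }
  return keys; // unchanged spec, safe to pass on to createIndex
}

// Usage: the first call passes, the second is rejected.
assertPlaintextIndex('health_data', { userId: 1, createdAt: -1 });
try {
  assertPlaintextIndex('health_data', { 'healthData.data': 1 });
} catch (err) {
  console.log(err.message);
}
```

A check like this could run in a migration script before the index definitions are applied, so that an accidental index on ciphertext fails early instead of silently growing the index set.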
+ +### Indexes per Collection + +#### Users +```javascript +db.users.createIndex({ userId: 1 }, { unique: true }); +db.users.createIndex({ email: 1 }, { unique: true }); +db.users.createIndex({ familyId: 1 }); +db.users.createIndex({ createdAt: -1 }); +``` + +#### Families +```javascript +db.families.createIndex({ familyId: 1 }, { unique: true }); +db.families.createIndex({ createdAt: -1 }); +``` + +#### Profiles +```javascript +db.profiles.createIndex({ profileId: 1 }, { unique: true }); +db.profiles.createIndex({ userId: 1 }); +db.profiles.createIndex({ familyId: 1 }); +db.profiles.createIndex({ createdAt: -1 }); +``` + +#### Health Data +```javascript +db.health_data.createIndex({ healthDataId: 1 }, { unique: true }); +db.health_data.createIndex({ userId: 1 }); +db.health_data.createIndex({ profileId: 1 }); +db.health_data.createIndex({ familyId: 1 }); +db.health_data.createIndex({ createdAt: -1 }); +db.health_data.createIndex({ updatedAt: -1 }); +``` + +#### Lab Results +```javascript +db.lab_results.createIndex({ labResultId: 1 }, { unique: true }); +db.lab_results.createIndex({ userId: 1 }); +db.lab_results.createIndex({ profileId: 1 }); +db.lab_results.createIndex({ familyId: 1 }); +db.lab_results.createIndex({ createdAt: -1 }); +db.lab_results.createIndex({ updatedAt: -1 }); +``` + +#### Medications +```javascript +db.medications.createIndex({ medicationId: 1 }, { unique: true }); +db.medications.createIndex({ userId: 1 }); +db.medications.createIndex({ profileId: 1 }); +db.medications.createIndex({ familyId: 1 }); +db.medications.createIndex({ active: 1 }); +db.medications.createIndex({ createdAt: -1 }); +db.medications.createIndex({ updatedAt: -1 }); +``` + +#### Appointments +```javascript +db.appointments.createIndex({ appointmentId: 1 }, { unique: true }); +db.appointments.createIndex({ userId: 1 }); +db.appointments.createIndex({ profileId: 1 }); +db.appointments.createIndex({ familyId: 1 }); +db.appointments.createIndex({ createdAt: -1 }); 
+db.appointments.createIndex({ updatedAt: -1 });
+```
+
+#### Shares
+```javascript
+db.shares.createIndex({ shareId: 1 }, { unique: true });
+db.shares.createIndex({ userId: 1 });
+// expiresAt is covered by the TTL index in "TTL Indexes (Auto-Expiration)";
+// a plain index on the same key would conflict with it
+db.shares.createIndex({ createdAt: -1 });
+db.shares.createIndex({ isRevoked: 1 });
+```
+
+#### Refresh Tokens
+```javascript
+db.refresh_tokens.createIndex({ jti: 1 }, { unique: true });
+db.refresh_tokens.createIndex({ userId: 1 });
+// expiresAt is covered by the TTL index in "TTL Indexes (Auto-Expiration)";
+// a plain index on the same key would conflict with it
+db.refresh_tokens.createIndex({ revoked: 1 });
+```
+
+### TTL Indexes (Auto-Expiration)
+
+```javascript
+// Shares: Auto-delete expired shares
+db.shares.createIndex(
+  { expiresAt: 1 },
+  { expireAfterSeconds: 0 } // Expire at expiresAt (the TTL monitor runs about every 60 seconds)
+);
+
+// Refresh Tokens: Auto-delete expired tokens
+db.refresh_tokens.createIndex(
+  { expiresAt: 1 },
+  { expireAfterSeconds: 0 } // Expire at expiresAt (the TTL monitor runs about every 60 seconds)
+);
+```
+
+---
+
+## Privacy-Preserving Queries
+
+### Challenge
+**How to query encrypted data without decrypting it?**
+
+### Solutions
+
+#### 1. Plaintext Queries (Recommended)
+
+**Query by plaintext fields only**:
+```javascript
+// GOOD: Query by plaintext fields
+const healthData = await db.health_data.find({
+  userId: 'user-123',       // Plaintext ✅
+  profileId: 'profile-456', // Plaintext ✅
+  familyId: 'family-789'    // Plaintext ✅
+}).toArray();
+
+// Client decrypts healthData[i].healthData[j]
+```
+
+#### 2. 
Tagging System (Encrypted Search) + +**Client adds searchable tags to encrypted data**: +```javascript +// Client adds tags to encrypted data +const healthData = { + value: "120/80", + type: "blood_pressure", + unit: "mmHg", + date: "2026-02-14T10:30:00Z", + tags: ["cardio", "daily"] // Plaintext tags (for client-side search) +}; + +// Stored in MongoDB +{ + _id: ObjectId("..."), + healthDataId: "health-123", + userId: "user-123", + profileId: "profile-456", + + // Encrypted health data + healthData: [ + { + encrypted: true, + data: "a1b2c3d4...", + iv: "e5f6g7h8...", + authTag: "i9j0k1l2..." + } + ], + + // Plaintext tags (for client-side search) + tags: ["cardio", "daily"] +} + +// Query by tags +const healthData = await db.health_data.find({ + userId: 'user-123', + tags: { $in: ['cardio', 'daily'] } // Plaintext tags ✅ +}).toArray(); + +// Client decrypts healthData[i].healthData[j] +``` + +#### 3. Date Range Queries (Plaintext Dates) + +**Store dates as plaintext** (for range queries): +```javascript +// Client encrypts health data BUT stores date as plaintext +const healthData = { + value: "120/80", + type: "blood_pressure", + unit: "mmHg", + date: "2026-02-14T10:30:00Z" // Plaintext date ✅ +}; + +// Stored in MongoDB +{ + _id: ObjectId("..."), + healthDataId: "health-123", + userId: "user-123", + profileId: "profile-456", + + // Plaintext date (for range queries) + date: ISODate("2026-02-14T10:30:00Z"), // Plaintext ✅ + + // Encrypted health data (without date) + healthData: [ + { + encrypted: true, + data: "a1b2c3d4...", // Encrypted: { value, type, unit } (no date) + iv: "e5f6g7h8...", + authTag: "i9j0k1l2..." 
+ } + ] +} + +// Query by date range (half-open interval, so the last second of the month is not missed) +const healthData = await db.health_data.find({ + userId: 'user-123', + date: { + $gte: ISODate("2026-02-01T00:00:00Z"), + $lt: ISODate("2026-03-01T00:00:00Z") + } +}).toArray(); + +// Client decrypts healthData[i].healthData[j] +``` + +--- + +## Data Migration + +### Key Rotation + +**Strategy**: Re-encrypt all data with new key + +```javascript +// 1. User changes password +const newPassword = "new-secure-password"; +const newKey = await deriveKeyFromPassword(newPassword); + +// 2. Client fetches all encrypted data +const healthData = await db.health_data.find({ userId: 'user-123' }).toArray(); +// 3. Client decrypts with old key (oldKey: the key derived from the previous password) +const decryptedHealthData = await Promise.all(healthData.map(async d => ({ + _id: d._id, + decrypted: await decryptData(d.healthData[0], oldKey) +}))); +// 4. Client re-encrypts with new key +const reencryptedHealthData = await Promise.all(decryptedHealthData.map(async d => ({ + _id: d._id, + encrypted: await encryptData(d.decrypted, newKey) +}))); +// 5. Client sends re-encrypted data to server +for (const d of reencryptedHealthData) { + await db.health_data.updateOne( + { _id: d._id }, + { $set: { healthData: [d.encrypted], updatedAt: new Date() } } + ); +} +``` + +--- + +## Performance Considerations + +### 1. Encryption Overhead +- **Client-side encryption**: Expected to be small (estimated 10-50ms per encryption) +- **Server-side storage**: No overhead (encrypted data stored directly) +- **Network transfer**: Encrypted data is an estimated 20-30% larger than plaintext (IV, auth tag, and text encoding add bytes) + + +### 2. Index Size +- **Plaintext indexes**: Smaller (only plaintext fields) +- **Encrypted data**: Not indexed (adds no index overhead) + +### 3. Query Performance +- **Plaintext queries**: Fast (indexed fields) +- **Tag-based queries**: Fast (indexed plaintext tags) +- **Date range queries**: Fast (indexed plaintext dates) +- **Encrypted data queries**: Not possible on the server (the client must decrypt and filter) + +### 4. 
Storage Size +- **Encrypted data**: Estimated 20-30% larger than plaintext +- **MongoDB storage**: No impact (stores binary data) + +--- + +## Summary + +### Zero-Knowledge Encryption +- ✅ **Client-side encryption**: All sensitive data encrypted before reaching server +- ✅ **Metadata encryption**: Health data metadata (type, unit, etc.) also encrypted +- ✅ **Plaintext queries**: Query by plaintext fields (userId, profileId, familyId, date, tags) +- ✅ **Server blindness**: Server stores sensitive data ONLY in encrypted form and never decrypts it + +### Collections +- ✅ **Users**: Authentication, profiles, family relationships +- ✅ **Families**: Family structure, encrypted family name/metadata +- ✅ **Profiles**: Person profiles, encrypted profile name/metadata +- ✅ **Health Data**: Encrypted health records (value + metadata) +- ✅ **Lab Results**: Encrypted lab data (value + metadata) +- ✅ **Medications**: Encrypted medication data + reminders +- ✅ **Appointments**: Encrypted appointment data + reminders +- ✅ **Shares**: Time-limited access to shared data +- ✅ **Refresh Tokens**: JWT refresh token storage + +### Privacy Preserved +- ✅ **Blood pressure**: Value + type + unit encrypted (date stays plaintext for range queries) +- ✅ **HIV test**: Test type + result + doctor encrypted +- ✅ **Cholesterol**: Test type + result + lab encrypted +- ✅ **All health data**: Value + metadata encrypted +--- + +## Next Steps + +1. Create MongoDB indexes +2. Implement client-side encryption (React Native + React) +3. Implement server-side API (Axum + MongoDB) +4. Test encryption flow +5. Test data migration (key rotation) +6. Test privacy-preserving queries +7. 
Performance testing + +--- + +## References + +- [Normogen Encryption Guide](../encryption.md) +- [JWT Authentication Research](./2026-02-14-jwt-authentication-research.md) +- [Technology Stack Decisions](./2026-02-14-tech-stack-decision.md) +- [MongoDB Documentation](https://docs.mongodb.com/) +- [AES-256-GCM](https://en.wikipedia.org/wiki/Galois/Counter_Mode) +- [PBKDF2](https://en.wikipedia.org/wiki/PBKDF2) diff --git a/thoughts/research/2026-02-14-tech-stack-decision.md b/thoughts/research/2026-02-14-tech-stack-decision.md index 0e2dbbd..7efba0e 100644 --- a/thoughts/research/2026-02-14-tech-stack-decision.md +++ b/thoughts/research/2026-02-14-tech-stack-decision.md @@ -96,6 +96,82 @@ --- +### 6. Database: MongoDB with Zero-Knowledge Encryption +**Decision**: MongoDB 6.0+ with client-side encryption (ALL sensitive data + metadata) + +**Score**: 9.8/10 + +**Core Principle**: **ALL sensitive data AND metadata must be encrypted client-side before reaching MongoDB** + +**Example: Blood Pressure Reading**: +```javascript +// Before encryption (client-side) +{ + value: "120/80", + type: "blood_pressure", + unit: "mmHg", + date: "2026-02-14T10:30:00Z" +} + +// After encryption (stored in MongoDB) +{ + healthDataId: "health-123", + userId: "user-456", + profileId: "profile-789", + + // Encrypted (value + metadata) + healthData: [ + { + encrypted: true, + data: "a1b2c3d4...", + iv: "e5f6g7h8...", + authTag: "i9j0k1l2..." 
+ } + ] +} +``` + +**Rationale**: +- **Zero-knowledge**: Server NEVER decrypts data +- **Metadata encryption**: Health data type, unit, doctor, lab ALL encrypted +- **Privacy-preserving**: No sensitive metadata stored in plaintext (only IDs, dates, sources, and tags stay queryable) +- **Client-side encryption**: AES-256-GCM, PBKDF2 key derivation +- **Plaintext queries**: Query by userId, profileId, familyId, date, tags +- **Tagging system**: Client adds searchable plaintext tags to encrypted records +- **Flexible schema**: MongoDB document structure fits health data +- **Scalable**: Horizontal scaling with sharding + +**Collections**: +- **users**: Authentication, profiles, family relationships +- **families**: Family structure, encrypted family name/metadata +- **profiles**: Person profiles, encrypted profile name/metadata +- **health_data**: Encrypted health records (value + metadata) +- **lab_results**: Encrypted lab data (value + metadata) +- **medications**: Encrypted medication data + reminders +- **appointments**: Encrypted appointment data + reminders +- **shares**: Time-limited access to shared data +- **refresh_tokens**: JWT refresh token storage + +**What Must Be Encrypted**: +- ✅ Health data values (e.g., "120/80") +- ✅ Health data metadata (e.g., "blood_pressure", "mmHg") +- ✅ Lab test results (e.g., "cholesterol", "200", "LabCorp") +- ✅ Medication data (e.g., "Aspirin", "100mg", "daily") +- ✅ Appointment data (e.g., "checkup", "Dr. 
Smith") +- ✅ Profile data (e.g., "John Doe", "1990-01-01") +- ✅ Family data (e.g., "Smith Family", "123 Main St") + +**What Can Be Plaintext**: +- ✅ IDs (userId, profileId, familyId) - for queries +- ✅ Email addresses - for authentication +- ✅ Dates (date, createdAt, updatedAt) - for sorting and range queries +- ✅ Data sources (healthKit, googleFit) - for analytics +- ✅ Tags (cardio, daily) - for tag-based queries + +**Reference**: [2026-02-14-mongodb-schema-design-research.md](./2026-02-14-mongodb-schema-design-research.md) + +--- + ## Technology Stack Summary ### Backend @@ -103,7 +179,7 @@ - **Runtime**: Tokio 1.x - **Middleware**: Tower, Tower-HTTP - **Authentication**: JWT with refresh tokens -- **Database**: MongoDB (with zero-knowledge encryption) +- **Database**: MongoDB 6.0+ (with zero-knowledge encryption) - **Language**: Rust ### Mobile (iOS + Android) @@ -145,53 +221,96 @@ --- -## Still To Be Decided +## All Major Decisions Complete ✅ -### 1. Database Schema (Priority: High) - -**Collections to Design**: -- Users (authentication, profiles) -- Families (family structure) -- Health Data (encrypted health records) -- Lab Results (encrypted lab data) -- Medications (encrypted medication data) -- Appointments (encrypted appointment data) -- Shared Links (time-limited access tokens) -- Refresh Tokens (JWT refresh token storage) +1. ✅ Rust Framework: Axum 0.7.x +2. ✅ Mobile Framework: React Native 0.73+ +3. ✅ Web Framework: React 18+ +4. ✅ State Management: Redux Toolkit 2.x +5. ✅ Authentication: JWT with refresh tokens +6. ✅ Database: MongoDB 6.0+ with zero-knowledge encryption --- -### 2. 
API Architecture (Priority: Medium) +## Implementation Phases -**Options**: -- REST (current plan) -- GraphQL (alternative) -- gRPC (for microservices) +- **Phase 1: Research** (COMPLETE) + - Rust framework selection + - Mobile/web framework selection + - State management selection + - Authentication design + - Database schema design + +- **Phase 2: Backend Development** (NEXT) + - Axum server setup + - MongoDB connection + - JWT authentication + - CRUD API endpoints + - Zero-knowledge encryption (client-side) + +- **Phase 3: Mobile Development** (AFTER BACKEND) + - React Native app setup + - Redux Toolkit setup + - JWT authentication + - Health sensor integration + - QR code scanning + - Encryption implementation + +- **Phase 4: Web Development** (PARALLEL WITH MOBILE) + - React app setup + - Redux Toolkit setup + - JWT authentication + - Charts and visualizations + - Profile management + - Encryption implementation --- -## Recommended Order +## Next Steps -1. Rust Framework: Axum (COMPLETED) -2. Mobile/Web Framework: React Native + React (COMPLETED) -3. State Management: Redux Toolkit 2.x (COMPLETED) -4. Authentication: JWT with refresh tokens (COMPLETED) -5. Database Schema: Design MongoDB collections (NEXT) -6. Create POC: Health sensor integration test -7. Implement Core Features: Authentication, encryption, CRUD +1. **Backend Development** (Axum + MongoDB) + - Create Axum server + - Setup MongoDB connection + - Implement JWT authentication + - Create MongoDB indexes + - Implement CRUD endpoints + +2. **Client-Side Encryption** (React Native + React) + - Implement AES-256-GCM encryption + - Implement PBKDF2 key derivation + - Create encryption utilities + - Test encryption flow + +3. 
**API Development** (Axum) + - Users API (register, login, logout) + - Families API (create, update, delete) + - Profiles API (CRUD) + - Health Data API (CRUD) + - Lab Results API (import via QR) + - Medications API (reminders) + - Appointments API (reminders) + - Shares API (time-limited access) --- -## Next Research Priority +## Timeline Estimate -**Research Question**: What should the MongoDB schema look like for Normogen's encrypted health data platform? +- **Phase 1: Research** (COMPLETE) +- **Phase 2: Backend Development** (8-10 weeks) +- **Phase 3: Mobile Development** (8-12 weeks) +- **Phase 4: Web Development** (4-6 weeks) +- **Phase 5: Testing & Polish** (4-6 weeks) -**Considerations**: -- Zero-knowledge encryption (all sensitive data encrypted) -- Family structure (parents, children, elderly) -- Health data types (lab results, medications, appointments) -- Refresh tokens (JWT storage) -- Shared links (time-limited access) -- Permissions (family member access control) +**Total**: 24-34 weeks (6-8.5 months) -**Estimated Research Time**: 3-4 hours +--- + +## References + +- [Axum Performance Research](./2026-02-14-performance-findings.md) +- [Frontend Mobile Research](./2026-02-14-frontend-mobile-research.md) +- [State Management Research](./2026-02-14-state-management-research.md) +- [JWT Authentication Research](./2026-02-14-jwt-authentication-research.md) +- [MongoDB Schema Design](./2026-02-14-mongodb-schema-design-research.md) +- [Normogen Encryption Guide](../encryption.md) +- [Project Introduction](../introduction.md)