Research: MongoDB schema design complete

- Zero-knowledge encryption for ALL sensitive data + metadata
- Blood pressure example: value + type + unit ALL encrypted
- 9 collections: users, families, profiles, health_data, lab_results, medications, appointments, shares, refresh_tokens
- Client-side encryption (AES-256-GCM, PBKDF2)
- Server NEVER decrypts data
- Privacy-preserving queries (plaintext fields: userId, profileId, familyId, date, tags)
- Tagging system for encrypted data search
- Date range queries (plaintext dates)

Key principle:
- Both value AND metadata encrypted (e.g., "blood_pressure" + "120/80")
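- No health metadata stored in plaintext (only IDs, dates, data source, and client-chosen tags stay queryable)
- Server stores sensitive fields ONLY in encrypted form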

Updated tech stack decisions with MongoDB schema

All major research complete (Rust, Mobile, Web, State, Auth, Database)

Next: Backend development (Axum + MongoDB)
goose 2026-02-14 13:39:57 -03:00
parent 203c0b4331
commit 4dca44dbbe
3 changed files with 1427 additions and 36 deletions

@ -0,0 +1,183 @@
# MongoDB Schema Design Decision Summary
**Date**: 2026-02-14
**Decision**: **Zero-Knowledge Encryption for All Sensitive Data + Metadata**
---
## Core Principle
**ALL sensitive data AND metadata must be encrypted client-side before reaching MongoDB.**
### Example: Blood Pressure Reading
**Before encryption** (client-side):
```javascript
{
  value: "120/80",
  type: "blood_pressure",
  unit: "mmHg",
  date: "2026-02-14T10:30:00Z"
}
```
**After encryption** (stored in MongoDB):
```javascript
{
  healthDataId: "health-123",
  userId: "user-456",
  profileId: "profile-789",
  familyId: "family-012",
  // Encrypted (value + metadata)
  healthData: [
    {
      encrypted: true,
      data: "a1b2c3d4...",
      iv: "e5f6g7h8...",
      authTag: "i9j0k1l2..."
    }
  ],
  // Non-sensitive fields (plaintext, used for queries and sorting)
  createdAt: ISODate("2026-02-14T10:30:00Z"),
  updatedAt: ISODate("2026-02-14T10:30:00Z"),
  dataSource: "healthKit"
}
```
---
## Collections Summary
| Collection | Purpose | Encrypted Fields | Plaintext Fields |
|-----------|---------|------------------|-----------------|
| **users** | Authentication | encryptedRecoveryPhrase | userId, email, passwordHash, tokenVersion, familyId, familyRole, permissions |
| **families** | Family structure | familyName, familyMetadata | familyId, members[*].userId, members[*].profileId, members[*].role |
| **profiles** | Person profiles | profileName, profileMetadata | profileId, userId, familyId, profileType |
| **health_data** | Health records | healthData[*] (value + metadata) | healthDataId, userId, profileId, familyId, createdAt, updatedAt, dataSource |
| **lab_results** | Lab tests | labData (value + metadata), labMetadata | labResultId, userId, profileId, familyId, createdAt, updatedAt, dataSource |
| **medications** | Medication tracking | medicationData (value + metadata), reminderSchedule | medicationId, userId, profileId, familyId, active, createdAt, updatedAt |
| **appointments** | Medical appointments | appointmentData (value + metadata), reminderSettings | appointmentId, userId, profileId, familyId, createdAt, updatedAt |
| **shares** | Shared data | encryptedData (share-specific password) | shareId, userId, documentId, collectionName, createdAt, expiresAt, accessCount, isRevoked |
| **refresh_tokens** | JWT refresh tokens | None | jti, userId, createdAt, expiresAt, revoked |
---
## Encryption Strategy
### Client-Side Encryption
**Encryption Flow**:
1. User enters health data
2. Client derives encryption key from password (PBKDF2)
3. Client encrypts health data (AES-256-GCM)
4. Client sends encrypted data to server
5. Server stores encrypted data in MongoDB
6. Server NEVER decrypts data
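Steps 2 and 3 of the flow above could be sketched as follows with the Node.js `crypto` API (react-native-quick-crypto exposes a compatible interface). This is a minimal sketch, not the project's implementation: the iteration count, salt length, hex encoding, and the function names `deriveKey` and `encryptRecord` are assumptions that this document does not specify.

```javascript
const crypto = require('node:crypto');

// Assumed parameters (not specified in this document).
const PBKDF2_ITERATIONS = 600000;
const KEY_LENGTH = 32; // 256-bit key for AES-256-GCM
const IV_LENGTH = 12;  // 96-bit IV, the recommended size for GCM

// Step 2: derive the encryption key from the user's password.
function deriveKey(password, salt) {
  return crypto.pbkdf2Sync(password, salt, PBKDF2_ITERATIONS, KEY_LENGTH, 'sha256');
}

// Step 3: encrypt value and metadata together as one JSON payload.
function encryptRecord(record, key) {
  const iv = crypto.randomBytes(IV_LENGTH); // must be unique per encryption
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(record), 'utf8'),
    cipher.final(),
  ]);
  return {
    encrypted: true,
    data: ciphertext.toString('hex'),
    iv: iv.toString('hex'),
    authTag: cipher.getAuthTag().toString('hex'),
  };
}

// Usage: the salt is assumed to be stored alongside the user's account.
const salt = crypto.randomBytes(16);
const key = deriveKey('correct horse battery staple', salt);
const entry = encryptRecord(
  { value: '120/80', type: 'blood_pressure', unit: 'mmHg' },
  key
);
console.log(Object.keys(entry)); // [ 'encrypted', 'data', 'iv', 'authTag' ]
```

Because the type and unit travel inside the same ciphertext as the value, the server only ever sees the opaque `data`, `iv`, and `authTag` strings.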
### What Must Be Encrypted
- ✅ **Health data values** (e.g., "120/80")
- ✅ **Health data metadata** (e.g., "blood_pressure", "mmHg")
- ✅ **Lab test results** (e.g., "cholesterol", "200", "LabCorp")
- ✅ **Medication data** (e.g., "Aspirin", "100mg", "daily")
- ✅ **Appointment data** (e.g., "checkup", "Dr. Smith")
- ✅ **Profile data** (e.g., "John Doe", "1990-01-01")
- ✅ **Family data** (e.g., "Smith Family", "123 Main St")
### What Can Be Plaintext
- ✅ **Identifiers** (userId, profileId, familyId) - for queries
- ✅ **Email addresses** - for authentication
- ✅ **Dates** (createdAt, updatedAt) - for sorting and range queries
- ✅ **Data sources** (healthKit, googleFit) - for analytics
- ✅ **Tags** (cardio, daily) - for server-side filtering of encrypted records (tag values are visible to the server)
---
## Privacy-Preserving Queries
### 1. Plaintext Queries (Recommended)
**Query by plaintext fields only**:
```javascript
const records = await db.health_data.find({
  userId: 'user-123',       // Plaintext ✅
  profileId: 'profile-456', // Plaintext ✅
  familyId: 'family-789'    // Plaintext ✅
}).toArray();
// The client then decrypts each records[i].healthData[j] entry
```
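The client-side decryption step mentioned in the comment could look roughly like this. It is a sketch under assumptions: entries are taken to be hex-encoded `{ data, iv, authTag }` objects holding a JSON payload, and `decryptEntry` and `decryptDocuments` are hypothetical helper names. The usage part builds a document locally to stand in for a query result.

```javascript
const crypto = require('node:crypto');

// Decrypt one stored entry. Throws if the ciphertext or tag was altered.
function decryptEntry(entry, key) {
  const decipher = crypto.createDecipheriv(
    'aes-256-gcm',
    key,
    Buffer.from(entry.iv, 'hex')
  );
  decipher.setAuthTag(Buffer.from(entry.authTag, 'hex'));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(entry.data, 'hex')),
    decipher.final(), // verifies the GCM authentication tag
  ]);
  return JSON.parse(plaintext.toString('utf8'));
}

// Decrypt every entry of every document returned by the query.
function decryptDocuments(documents, key) {
  return documents.map((doc) => ({
    ...doc,
    healthData: doc.healthData.map((entry) => decryptEntry(entry, key)),
  }));
}

// Usage with a locally encrypted document standing in for a query result.
const key = crypto.randomBytes(32); // stand-in for the PBKDF2-derived key
const iv = crypto.randomBytes(12);
const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
const data = Buffer.concat([
  cipher.update(
    JSON.stringify({ value: '120/80', type: 'blood_pressure', unit: 'mmHg' }),
    'utf8'
  ),
  cipher.final(),
]);
const fetched = [{
  healthDataId: 'health-123',
  healthData: [{
    encrypted: true,
    data: data.toString('hex'),
    iv: iv.toString('hex'),
    authTag: cipher.getAuthTag().toString('hex'),
  }],
}];
const decrypted = decryptDocuments(fetched, key);
console.log(decrypted[0].healthData[0].value); // 120/80
```

Plaintext fields such as `healthDataId` pass through unchanged, so the client can still correlate decrypted entries with server-side identifiers.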
### 2. Tagging System (Search over Encrypted Records)
**Client adds searchable tags to encrypted data**:
```javascript
const records = await db.health_data.find({
  userId: 'user-123',
  tags: { $in: ['cardio', 'daily'] } // Plaintext tags ✅
}).toArray();
// The client then decrypts each records[i].healthData[j] entry
```
### 3. Date Range Queries (Plaintext Dates)
**Store dates as plaintext** (for range queries):
```javascript
const records = await db.health_data.find({
  userId: 'user-123',
  date: {
    $gte: ISODate("2026-02-01"),
    $lt: ISODate("2026-03-01") // exclusive bound, so all of Feb 28 is included
  }
}).toArray();
// The client then decrypts each records[i].healthData[j] entry
```
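When building such a filter in application code, a half-open range (start inclusive, first instant of the next month exclusive) covers the whole last day of the month without hard-coding its length. The helper below is an illustrative sketch; `monthRange` is a hypothetical name, and the `date` field is assumed to be stored as a plaintext UTC date.

```javascript
// Build a half-open UTC range [start, end) covering one calendar month.
// `month` is 1-based (2 = February).
function monthRange(year, month) {
  const start = new Date(Date.UTC(year, month - 1, 1));
  const end = new Date(Date.UTC(year, month, 1)); // first instant of next month
  return { $gte: start, $lt: end };
}

const filter = { userId: 'user-123', date: monthRange(2026, 2) };
console.log(filter.date.$gte.toISOString()); // 2026-02-01T00:00:00.000Z
console.log(filter.date.$lt.toISOString());  // 2026-03-01T00:00:00.000Z
```

`Date.UTC` normalizes an out-of-range month index, so December rolls over to January of the following year.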
---
## Technology Stack
### Backend (Axum + MongoDB)
- **Axum 0.7.x**: Web framework
- **MongoDB 6.0+**: Database
- **Rust**: Server language
### Client (React Native + React)
- **AES-256-GCM**: Encryption algorithm
- **PBKDF2**: Key derivation function
- **Crypto API**: Node.js crypto / react-native-quick-crypto
---
## Implementation Timeline
- **Week 1**: Create MongoDB indexes
- **Week 1-2**: Implement client-side encryption (React Native + React)
- **Week 2-3**: Implement server-side API (Axum + MongoDB)
- **Week 3**: Test encryption flow
- **Week 3-4**: Test data migration (key rotation)
- **Week 4**: Test privacy-preserving queries
- **Week 4-5**: Performance testing
**Total**: 4-5 weeks
---
## Next Steps
1. Create MongoDB indexes for all collections
2. Implement client-side encryption (React Native + React)
3. Implement server-side API (Axum + MongoDB)
4. Test encryption flow (end-to-end)
5. Test data migration (key rotation)
6. Test privacy-preserving queries
7. Performance testing
8. Create API documentation
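Step 1 could start from an index plan like the one sketched here. The specific indexes are assumptions inferred from the plaintext fields listed in the collections table, not decisions recorded in this document. The sketch uses the `collection(name).createIndex(keys, options)` call of the MongoDB Node.js driver for illustration; the Rust backend would use its own driver's equivalent.

```javascript
// Hypothetical index plan; field choices are assumptions.
const indexPlan = [
  { collection: 'users', keys: { email: 1 }, options: { unique: true } },
  {
    collection: 'health_data',
    keys: { userId: 1, profileId: 1, createdAt: -1 },
    options: {},
  },
  { collection: 'refresh_tokens', keys: { jti: 1 }, options: { unique: true } },
  // TTL index: MongoDB removes each document after its expiresAt date passes.
  {
    collection: 'refresh_tokens',
    keys: { expiresAt: 1 },
    options: { expireAfterSeconds: 0 },
  },
];

// `db` is expected to behave like a MongoDB Node.js driver Db handle.
async function createIndexes(db, plan = indexPlan) {
  const names = [];
  for (const { collection, keys, options } of plan) {
    names.push(await db.collection(collection).createIndex(keys, options));
  }
  return names;
}
```

Only plaintext fields appear in the plan, since indexes on encrypted payloads would not support meaningful queries.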
---
## References
- [Comprehensive MongoDB Schema Research](./2026-02-14-mongodb-schema-design-research.md)
- [Normogen Encryption Guide](../encryption.md)
- [JWT Authentication Research](./2026-02-14-jwt-authentication-research.md)
- [Technology Stack Decisions](./2026-02-14-tech-stack-decision.md)

File diff suppressed because it is too large

@ -96,6 +96,82 @@
---
### 6. Database: MongoDB with Zero-Knowledge Encryption
**Decision**: MongoDB 6.0+ with client-side encryption (ALL sensitive data + metadata)
**Score**: 9.8/10
**Core Principle**: **ALL sensitive data AND metadata must be encrypted client-side before reaching MongoDB**
**Example: Blood Pressure Reading**:
```javascript
// Before encryption (client-side)
{
  value: "120/80",
  type: "blood_pressure",
  unit: "mmHg",
  date: "2026-02-14T10:30:00Z"
}
// After encryption (stored in MongoDB)
{
  healthDataId: "health-123",
  userId: "user-456",
  profileId: "profile-789",
  // Encrypted (value + metadata)
  healthData: [
    {
      encrypted: true,
      data: "a1b2c3d4...",
      iv: "e5f6g7h8...",
      authTag: "i9j0k1l2..."
    }
  ]
}
```
**Rationale**:
- **Zero-knowledge**: Server NEVER decrypts data
- **Metadata encryption**: Health data type, unit, doctor, lab ALL encrypted
- **Privacy-preserving**: No plaintext metadata leaks
- **Client-side encryption**: AES-256-GCM, PBKDF2 key derivation
- **Plaintext queries**: Query by userId, profileId, familyId, date, tags
- **Tagging system**: Client adds searchable tags for encrypted data
- **Flexible schema**: MongoDB document structure fits health data
- **Scalable**: Horizontal scaling with sharding
**Collections**:
- **users**: Authentication, profiles, family relationships
- **families**: Family structure, encrypted family name/metadata
- **profiles**: Person profiles, encrypted profile name/metadata
- **health_data**: Encrypted health records (value + metadata)
- **lab_results**: Encrypted lab data (value + metadata)
- **medications**: Encrypted medication data + reminders
- **appointments**: Encrypted appointment data + reminders
- **shares**: Time-limited access to shared data
- **refresh_tokens**: JWT refresh token storage
**What Must Be Encrypted**:
- ✅ Health data values (e.g., "120/80")
- ✅ Health data metadata (e.g., "blood_pressure", "mmHg")
- ✅ Lab test results (e.g., "cholesterol", "200", "LabCorp")
- ✅ Medication data (e.g., "Aspirin", "100mg", "daily")
- ✅ Appointment data (e.g., "checkup", "Dr. Smith")
- ✅ Profile data (e.g., "John Doe", "1990-01-01")
- ✅ Family data (e.g., "Smith Family", "123 Main St")
**What Can Be Plaintext**:
- ✅ Identifiers (userId, profileId, familyId) - for queries
- ✅ Email addresses - for authentication
- ✅ Dates (createdAt, updatedAt) - for sorting and range queries
- ✅ Data sources (healthKit, googleFit) - for analytics
- ✅ Tags (cardio, daily) - for server-side filtering of encrypted records (tag values are visible to the server)
**Reference**: [2026-02-14-mongodb-schema-design-research.md](./2026-02-14-mongodb-schema-design-research.md)
---
## Technology Stack Summary
### Backend
@ -103,7 +179,7 @@
 - **Runtime**: Tokio 1.x
 - **Middleware**: Tower, Tower-HTTP
 - **Authentication**: JWT with refresh tokens
-- **Database**: MongoDB (with zero-knowledge encryption)
+- **Database**: MongoDB 6.0+ (with zero-knowledge encryption)
 - **Language**: Rust
 ### Mobile (iOS + Android)
@ -145,53 +221,96 @@
 ---
-## Still To Be Decided
-### 1. Database Schema (Priority: High)
-**Collections to Design**:
-- Users (authentication, profiles)
-- Families (family structure)
-- Health Data (encrypted health records)
-- Lab Results (encrypted lab data)
-- Medications (encrypted medication data)
-- Appointments (encrypted appointment data)
-- Shared Links (time-limited access tokens)
-- Refresh Tokens (JWT refresh token storage)
+## All Major Decisions Complete ✅
+1. ✅ Rust Framework: Axum 0.7.x
+2. ✅ Mobile Framework: React Native 0.73+
+3. ✅ Web Framework: React 18+
+4. ✅ State Management: Redux Toolkit 2.x
+5. ✅ Authentication: JWT with refresh tokens
+6. ✅ Database: MongoDB 6.0+ with zero-knowledge encryption
 ---
-### 2. API Architecture (Priority: Medium)
-**Options**:
-- REST (current plan)
-- GraphQL (alternative)
-- gRPC (for microservices)
+## Implementation Phases
+- **Phase 1: Research** (COMPLETE)
+  - Rust framework selection
+  - Mobile/web framework selection
+  - State management selection
+  - Authentication design
+  - Database schema design
+- **Phase 2: Backend Development** (NEXT)
+  - Axum server setup
+  - MongoDB connection
+  - JWT authentication
+  - CRUD API endpoints
+  - Zero-knowledge encryption (client-side)
+- **Phase 3: Mobile Development** (AFTER BACKEND)
+  - React Native app setup
+  - Redux Toolkit setup
+  - JWT authentication
+  - Health sensor integration
+  - QR code scanning
+  - Encryption implementation
+- **Phase 4: Web Development** (PARALLEL WITH MOBILE)
+  - React app setup
+  - Redux Toolkit setup
+  - JWT authentication
+  - Charts and visualizations
+  - Profile management
+  - Encryption implementation
 ---
-## Recommended Order
-1. Rust Framework: Axum (COMPLETED)
-2. Mobile/Web Framework: React Native + React (COMPLETED)
-3. State Management: Redux Toolkit 2.x (COMPLETED)
-4. Authentication: JWT with refresh tokens (COMPLETED)
-5. Database Schema: Design MongoDB collections (NEXT)
-6. Create POC: Health sensor integration test
-7. Implement Core Features: Authentication, encryption, CRUD
+## Next Steps
+1. **Backend Development** (Axum + MongoDB)
+   - Create Axum server
+   - Setup MongoDB connection
+   - Implement JWT authentication
+   - Create MongoDB indexes
+   - Implement CRUD endpoints
+2. **Client-Side Encryption** (React Native + React)
+   - Implement AES-256-GCM encryption
+   - Implement PBKDF2 key derivation
+   - Create encryption utilities
+   - Test encryption flow
+3. **API Development** (Axum)
+   - Users API (register, login, logout)
+   - Families API (create, update, delete)
+   - Profiles API (CRUD)
+   - Health Data API (CRUD)
+   - Lab Results API (import via QR)
+   - Medications API (reminders)
+   - Appointments API (reminders)
+   - Shares API (time-limited access)
 ---
-## Next Research Priority
-**Research Question**: What should the MongoDB schema look like for Normogen's encrypted health data platform?
-**Considerations**:
-- Zero-knowledge encryption (all sensitive data encrypted)
-- Family structure (parents, children, elderly)
-- Health data types (lab results, medications, appointments)
-- Refresh tokens (JWT storage)
-- Shared links (time-limited access)
-- Permissions (family member access control)
-**Estimated Research Time**: 3-4 hours
+## Timeline Estimate
+- **Phase 1: Research** (COMPLETE)
+- **Phase 2: Backend Development** (8-10 weeks)
+- **Phase 3: Mobile Development** (8-12 weeks)
+- **Phase 4: Web Development** (4-6 weeks)
+- **Phase 5: Testing & Polish** (4-6 weeks)
+**Total**: 24-34 weeks (6-8.5 months)
+---
+## References
+- [Axum Performance Research](./2026-02-14-performance-findings.md)
+- [Frontend Mobile Research](./2026-02-14-frontend-mobile-research.md)
+- [State Management Research](./2026-02-14-state-management-research.md)
+- [JWT Authentication Research](./2026-02-14-jwt-authentication-research.md)
+- [MongoDB Schema Design](./2026-02-14-mongodb-schema-design-research.md)
+- [Normogen Encryption Guide](../encryption.md)
+- [Project Introduction](../introduction.md)