Research: MongoDB schema design complete

- Zero-knowledge encryption for ALL sensitive data + metadata
- Blood pressure example: value + type + unit ALL encrypted
- 9 collections: users, families, profiles, health_data, lab_results, medications, appointments, shares, refresh_tokens
- Client-side encryption (AES-256-GCM, PBKDF2)
- Server NEVER decrypts data
- Privacy-preserving queries (plaintext fields: userId, profileId, familyId, date, tags)
- Tagging system for encrypted data search
- Date range queries (plaintext dates)

Key principle:
- Both value AND metadata encrypted (e.g., "blood_pressure" + "120/80")
- No plaintext metadata leaks
- Server stores ONLY encrypted data

Updated tech stack decisions with MongoDB schema
All major research complete (Rust, Mobile, Web, State, Auth, Database)

Next: Backend development (Axum + MongoDB)
parent 203c0b4331
commit 4dca44dbbe
3 changed files with 1427 additions and 36 deletions
183
thoughts/research/2026-02-14-mongodb-schema-decision.md
Normal file
@@ -0,0 +1,183 @@
# MongoDB Schema Design Decision Summary

**Date**: 2026-02-14
**Decision**: **Zero-Knowledge Encryption for All Sensitive Data + Metadata**

---

## Core Principle

**ALL sensitive data AND metadata must be encrypted client-side before reaching MongoDB.**

### Example: Blood Pressure Reading

**Before encryption** (client-side):
```javascript
{
  value: "120/80",
  type: "blood_pressure",
  unit: "mmHg",
  date: "2026-02-14T10:30:00Z"
}
```

**After encryption** (stored in MongoDB):
```javascript
{
  healthDataId: "health-123",
  userId: "user-456",
  profileId: "profile-789",
  familyId: "family-012",

  // Encrypted (value + metadata)
  healthData: [
    {
      encrypted: true,
      data: "a1b2c3d4...",
      iv: "e5f6g7h8...",
      authTag: "i9j0k1l2..."
    }
  ],

  // Metadata (plaintext)
  createdAt: ISODate("2026-02-14T10:30:00Z"),
  updatedAt: ISODate("2026-02-14T10:30:00Z"),
  dataSource: "healthKit"
}
```

---

## Collections Summary

| Collection | Purpose | Encrypted Fields | Plaintext Fields |
|-----------|---------|------------------|-----------------|
| **users** | Authentication | encryptedRecoveryPhrase | userId, email, passwordHash, tokenVersion, familyId, familyRole, permissions |
| **families** | Family structure | familyName, familyMetadata | familyId, `members[*].userId`, `members[*].profileId`, `members[*].role` |
| **profiles** | Person profiles | profileName, profileMetadata | profileId, userId, familyId, profileType |
| **health_data** | Health records | `healthData[*]` (value + metadata) | healthDataId, userId, profileId, familyId, createdAt, updatedAt, dataSource |
| **lab_results** | Lab tests | labData (value + metadata), labMetadata | labResultId, userId, profileId, familyId, createdAt, updatedAt, dataSource |
| **medications** | Medication tracking | medicationData (value + metadata), reminderSchedule | medicationId, userId, profileId, familyId, active, createdAt, updatedAt |
| **appointments** | Medical appointments | appointmentData (value + metadata), reminderSettings | appointmentId, userId, profileId, familyId, createdAt, updatedAt |
| **shares** | Shared data | encryptedData (share-specific password) | shareId, userId, documentId, collectionName, createdAt, expiresAt, accessCount, isRevoked |
| **refresh_tokens** | JWT tokens | None | jti, userId, createdAt, expiresAt, revoked |
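
Only the plaintext columns above can be indexed by the server. The following is a rough sketch of what an index plan over those fields might look like. The specific index choices, and the names `INDEX_PLAN` and `createIndexes`, are assumptions for illustration and not part of the schema decision; `db` is assumed to be a connected `Db` handle from the official `mongodb` Node.js driver.

```javascript
// Hypothetical index plan built only from plaintext fields in the table above.
const INDEX_PLAN = [
  { collection: 'users', keys: { email: 1 }, options: { unique: true } },
  { collection: 'health_data', keys: { userId: 1, profileId: 1, createdAt: -1 }, options: {} },
  { collection: 'lab_results', keys: { userId: 1, profileId: 1, createdAt: -1 }, options: {} },
  { collection: 'medications', keys: { userId: 1, profileId: 1, active: 1 }, options: {} },
  { collection: 'appointments', keys: { userId: 1, profileId: 1, createdAt: -1 }, options: {} },
  { collection: 'shares', keys: { shareId: 1 }, options: { unique: true } },
  { collection: 'refresh_tokens', keys: { jti: 1 }, options: { unique: true } },
  // TTL index: MongoDB removes a token document once its expiresAt time has passed.
  { collection: 'refresh_tokens', keys: { expiresAt: 1 }, options: { expireAfterSeconds: 0 } },
];

// Apply the plan one index at a time; createIndex does nothing if an
// identical index already exists.
async function createIndexes(db) {
  for (const { collection, keys, options } of INDEX_PLAN) {
    await db.collection(collection).createIndex(keys, options);
  }
}
```

No index touches an encrypted field, so the plan reveals nothing beyond what the plaintext columns already expose.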

---

## Encryption Strategy

### Client-Side Encryption

**Encryption Flow**:
1. User enters health data
2. Client derives encryption key from password (PBKDF2)
3. Client encrypts health data (AES-256-GCM)
4. Client sends encrypted data to server
5. Server stores encrypted data in MongoDB
6. Server NEVER decrypts data
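
Steps 2 and 3 of the flow above can be sketched with Node.js's built-in `crypto` module. This is a minimal illustration, not the project's implementation: the iteration count, salt handling, hex encoding, and the function names `deriveKey`, `encryptRecord`, and `decryptRecord` are all assumptions, since the document only fixes the algorithms (PBKDF2, AES-256-GCM).

```javascript
const nodeCrypto = require('node:crypto');

// Assumed parameters; the document does not specify them.
const PBKDF2_ITERATIONS = 600000; // assumption, tune for target devices
const KEY_LENGTH = 32;            // 32 bytes = 256-bit key for AES-256-GCM
const IV_LENGTH = 12;             // 96-bit nonce, the usual size for GCM

// Step 2: derive the encryption key from the password (PBKDF2-HMAC-SHA256).
// The salt must be stored per user so the same key can be derived again.
function deriveKey(password, salt) {
  return nodeCrypto.pbkdf2Sync(password, salt, PBKDF2_ITERATIONS, KEY_LENGTH, 'sha256');
}

// Step 3: encrypt value AND metadata together as one JSON payload.
// A fresh random IV is generated for every record.
function encryptRecord(key, record) {
  const iv = nodeCrypto.randomBytes(IV_LENGTH);
  const cipher = nodeCrypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(record), 'utf8'),
    cipher.final(),
  ]);
  return {
    encrypted: true,
    data: ciphertext.toString('hex'),
    iv: iv.toString('hex'),
    authTag: cipher.getAuthTag().toString('hex'),
  };
}

// Runs on the client only; throws if the ciphertext or tag was tampered with.
function decryptRecord(key, entry) {
  const decipher = nodeCrypto.createDecipheriv('aes-256-gcm', key, Buffer.from(entry.iv, 'hex'));
  decipher.setAuthTag(Buffer.from(entry.authTag, 'hex'));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(entry.data, 'hex')),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString('utf8'));
}
```

Because `type` and `unit` travel inside the same ciphertext as `value`, the server sees only the `{ encrypted, data, iv, authTag }` envelope.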

### What Must Be Encrypted
- ✅ **Health data values** (e.g., "120/80")
- ✅ **Health data metadata** (e.g., "blood_pressure", "mmHg")
- ✅ **Lab test results** (e.g., "cholesterol", "200", "LabCorp")
- ✅ **Medication data** (e.g., "Aspirin", "100mg", "daily")
- ✅ **Appointment data** (e.g., "checkup", "Dr. Smith")
- ✅ **Profile data** (e.g., "John Doe", "1990-01-01")
- ✅ **Family data** (e.g., "Smith Family", "123 Main St")

### What Can Be Plaintext
- ✅ **Identifiers** (userId, profileId, familyId) - for queries
- ✅ **Email addresses** - for authentication
- ✅ **Dates** (createdAt, updatedAt) - for sorting and range queries
- ✅ **Data sources** (healthKit, googleFit) - for analytics
- ✅ **Tags** (cardio, daily) - for filtering without decryption
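
One way to enforce the two lists above on the client is an allowlist: only fields explicitly named as plaintext stay readable, and everything else goes into the payload to be encrypted. The sketch below assumes that approach; `PLAINTEXT_FIELDS` and `splitRecord` are hypothetical names, and the field set is taken from the health-record examples in this document.

```javascript
// Fields this document lists as safe to store in plaintext for health records.
const PLAINTEXT_FIELDS = new Set([
  'userId', 'profileId', 'familyId',
  'createdAt', 'updatedAt', 'dataSource', 'tags',
]);

// Split a record into the plaintext envelope and the part that must be
// encrypted. Unknown fields default to "sensitive", so a newly added field
// is never stored in plaintext by accident.
function splitRecord(record) {
  const plaintext = {};
  const sensitive = {};
  for (const [field, value] of Object.entries(record)) {
    (PLAINTEXT_FIELDS.has(field) ? plaintext : sensitive)[field] = value;
  }
  return { plaintext, sensitive };
}
```

The `sensitive` half is what would be serialized and encrypted; the `plaintext` half is stored alongside it for querying.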

---

## Privacy-Preserving Queries

### 1. Plaintext Queries (Recommended)

**Query by plaintext fields only**:
```javascript
const healthData = await db.health_data.find({
  userId: 'user-123',        // Plaintext ✅
  profileId: 'profile-456',  // Plaintext ✅
  familyId: 'family-789'     // Plaintext ✅
}).toArray();

// Client decrypts healthData[i].healthData[j]
```
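
The client-side decryption step noted in the comment above could be structured as a small mapping helper. This is a sketch under assumptions: `decryptDocuments` is a hypothetical name, and `decryptEntry` stands for whatever caller-supplied function turns one `{ data, iv, authTag }` entry back into its plaintext object.

```javascript
// Decrypt every entry of every document returned by a plaintext-field query.
// Plaintext fields (IDs, timestamps) are copied through unchanged, and the
// input documents are not mutated.
function decryptDocuments(docs, decryptEntry) {
  return docs.map((doc) => ({
    ...doc,
    healthData: doc.healthData.map((entry) =>
      entry.encrypted ? decryptEntry(entry) : entry
    ),
  }));
}
```

Keeping the decryption function as a parameter means the same helper works regardless of which crypto library the web and mobile clients end up using.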

### 2. Tagging System (Encrypted Search)

**Client attaches plaintext tags alongside the encrypted payload, so the server can filter without decrypting**:
```javascript
const healthData = await db.health_data.find({
  userId: 'user-123',
  tags: { $in: ['cardio', 'daily'] }  // Plaintext tags ✅
}).toArray();

// Client decrypts healthData[i].healthData[j]
```

### 3. Date Range Queries (Plaintext Dates)

**Store dates as plaintext** (for range queries):
```javascript
const healthData = await db.health_data.find({
  userId: 'user-123',
  date: {
    $gte: ISODate("2026-02-01"),
    $lt: ISODate("2026-03-01")  // exclusive bound, so all of Feb 28 is included
  }
}).toArray();

// Client decrypts healthData[i].healthData[j]
```
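
Month boundaries are easy to get wrong by a day. A half-open range (`$gte` the first day of the month, `$lt` the first day of the next month) covers the whole month regardless of its length. The helper below is a small sketch of that idea; `monthRange` is a hypothetical name, and it assumes dates are stored in UTC.

```javascript
// Build a half-open UTC date range for one calendar month.
// `month` is 1-based (2 = February).
function monthRange(year, month) {
  // Date.UTC takes a zero-based month, so `month - 1` is the requested month
  // and `month` is the one after it. December rolls over into the next year.
  return {
    $gte: new Date(Date.UTC(year, month - 1, 1)),
    $lt: new Date(Date.UTC(year, month, 1)),
  };
}
```

The returned object can be used directly as the value of the `date` filter in the query above.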

---

## Technology Stack

### Backend (Axum + MongoDB)
- **Axum 0.7.x**: Web framework
- **MongoDB 6.0+**: Database
- **Rust**: Server language

### Client (React Native + React)
- **AES-256-GCM**: Encryption algorithm
- **PBKDF2**: Key derivation function
- **Crypto API**: Node.js crypto / react-native-quick-crypto

---

## Implementation Timeline

- **Week 1**: Create MongoDB indexes
- **Week 1-2**: Implement client-side encryption (React Native + React)
- **Week 2-3**: Implement server-side API (Axum + MongoDB)
- **Week 3**: Test encryption flow
- **Week 3-4**: Test data migration (key rotation)
- **Week 4**: Test privacy-preserving queries
- **Week 4-5**: Performance testing

**Total**: 4-5 weeks

---

## Next Steps

1. Create MongoDB indexes for all collections
2. Implement client-side encryption (React Native + React)
3. Implement server-side API (Axum + MongoDB)
4. Test encryption flow (end-to-end)
5. Test data migration (key rotation)
6. Test privacy-preserving queries
7. Performance testing
8. Create API documentation

---

## References

- [Comprehensive MongoDB Schema Research](./2026-02-14-mongodb-schema-design-research.md)
- [Normogen Encryption Guide](../encryption.md)
- [JWT Authentication Research](./2026-02-14-jwt-authentication-research.md)
- [Technology Stack Decisions](./2026-02-14-tech-stack-decision.md)