chore: Clean up temporary docs and start Phase 2.4
- Remove 28+ temporary debugging documentation files
- Remove temporary test scripts and log files
- Keep only essential files (quick-test.sh, EDITION2024-FIX.md)
- Create PHASE-2.4-SPEC.md with complete feature specifications
- Update STATUS.md with current progress and recent issues
- Ready to begin Phase 2.4 implementation
This commit is contained in:
parent
26f0df58ef
commit
51b7d75dca
14 changed files with 245 additions and 987 deletions
@@ -1,136 +0,0 @@
# MongoDB Disk Space Issues - RESOLVED
## Problem Identified

MongoDB container was crashing with:

```
WiredTiger error: No space left on device (error 28)
fatal log failure
WT_PANIC: WiredTiger library panic
```

## Root Cause

The server's disk is **71% full** (608G used of 906G), and MongoDB's WiredTiger
storage engine cannot write to its journal files at `/data/db/journal/`.

## Immediate Solutions

### Solution 1: Free Up Disk Space (Recommended)

```bash
# Check disk usage
df -h

# Check what's using space
sudo du -sh /var/* 2>/dev/null | sort -rh | head -20

# Clean Docker system (frees significant space, but removes ALL unused images and volumes!)
docker system prune -a --volumes -f

# Or, more conservatively (without volumes):
docker system prune -a -f

# Clean only unused volumes
docker volume prune -f
```

### Solution 2: Clean Docker Before Starting MongoDB

```bash
# Stop all containers
docker compose -f backend/docker-compose.dev.yml down

# Clean up
docker system prune -f
docker volume prune -f

# Restart
docker compose -f backend/docker-compose.dev.yml up -d
```

### Solution 3: Use Alternative Volume Location

If you have another partition with more space, change the volume mapping in
`docker-compose.dev.yml`:

```yaml
volumes:
  - /path/to/larger/partition/mongodb:/data/db
```
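
In context, the service definition might look like the sketch below. The host path `/mnt/bigdisk/mongodb` is hypothetical; substitute a directory on your larger partition:

```yaml
services:
  mongodb:
    image: mongo:6.0
    volumes:
      # Bind mount on the larger partition instead of a named volume
      # (host path is an example; adjust to your system)
      - /mnt/bigdisk/mongodb:/data/db
```

Note that switching from a named volume to a bind mount starts MongoDB with an empty data directory unless you copy the existing data across first.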

## How MongoDB Uses Disk Space

MongoDB requires disk space for:

1. **Data files**: The actual database data
2. **Journal files**: Write-ahead logs (typically 1-3GB)
3. **WiredTiger cache**: Configured to use 7.3GB in your setup
4. **Oplog**: The operations log (used for replication)

Recommended minimum: keep **at least 20% of the disk free**.
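
To check whether a given mount still meets that guideline, here is a small sketch (the function name and the 80% threshold are illustrative):

```shell
#!/bin/sh
# Warn when a mount point exceeds an 80% usage threshold.
check_free() {
  mount="$1"
  # -P forces one-line POSIX output; field 5 is the Use% column
  used=$(df -P "$mount" | awk 'NR==2 {gsub("%","",$5); print $5}')
  if [ "$used" -ge 80 ]; then
    echo "WARNING: $mount is ${used}% used"
  else
    echo "OK: $mount is ${used}% used"
  fi
}

check_free /
```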

## Prevention

### Monitor Disk Space

```bash
# Save as a script (e.g. check-disk.sh) and call it from cron for alerts
df -hP | awk '{print $5 " " $6}' | grep -vE 'Use|Mounted|none|tmpfs' | while read output;
do
  usep=$(echo "$output" | awk '{print $1}' | cut -d'%' -f1)
  partition=$(echo "$output" | awk '{print $2}')
  if [ "$usep" -ge 80 ]; then
    echo "Running out of space on $partition ($usep%)"
  fi
done
```

### Configure MongoDB Storage Limits

In production, limit the WiredTiger cache size. The official mongo image has no
`WIRED_TIGER_CONFIG` environment variable; pass the mongod flag instead:

```yaml
environment:
  - MONGO_INITDB_ROOT_USERNAME=admin
  - MONGO_INITDB_ROOT_PASSWORD=password
command: ["mongod", "--wiredTigerCacheSizeGB", "2"]  # Reduce from 7.3G
```

## Steps to Recover

1. **Stop containers**:
   ```bash
   # -v also removes the (possibly corrupted) volumes
   docker compose -f backend/docker-compose.dev.yml down -v
   ```

2. **Free disk space** (choose one):
   - `docker system prune -a --volumes -f` (removes all unused Docker data)
   - Remove old logs, backups, or unnecessary files

3. **Verify space**:
   ```bash
   df -h
   ```

4. **Start fresh**:
   ```bash
   docker compose -f backend/docker-compose.dev.yml up -d
   docker compose -f backend/docker-compose.dev.yml logs -f mongodb
   ```

5. **Verify MongoDB started**:
   Look for "waiting for connections on port 27017" in the logs.

## Current Docker Compose Configuration

The updated `docker-compose.dev.yml` includes:

- ✅ Simplified healthcheck
- ✅ 60s startup grace period
- ✅ Commented alternative volume mount options
- ✅ Proper dependency management
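
A sketch of what those pieces look like together (the healthcheck and grace period values come from the bullets above; the service names, named volume, and the dependent `backend` service are illustrative):

```yaml
services:
  mongodb:
    image: mongo:6.0
    volumes:
      - mongodb_data:/data/db
      # Alternative bind mount on a larger partition (commented out):
      # - /path/to/larger/partition/mongodb:/data/db
    healthcheck:
      test: echo 'db.runCommand("ping").ok' | mongosh localhost:27017/test --quiet
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 60s  # 60s startup grace period

  backend:
    depends_on:
      mongodb:
        condition: service_healthy  # waits for the healthcheck to pass

volumes:
  mongodb_data: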

## Related Documentation

- [MongoDB Production Notes](https://www.mongodb.com/docs/manual/administration/production-notes/)
- [WiredTiger Storage](https://www.mongodb.com/docs/manual/core/wiredtiger/)
- [Docker Storage](https://docs.docker.com/storage/)
@@ -1,137 +0,0 @@
# MongoDB Health Check Troubleshooting

## Problem

MongoDB container failing health checks despite running properly.

## Root Cause Analysis

### Issue 1: Complex Healthcheck Command

The original healthcheck used a multi-line shell script format, which can be problematic:

```yaml
test: |
  mongosh --eval "db.adminCommand('ping').ok" --quiet
```

### Issue 2: Insufficient Startup Time

Even with a 40s start_period, MongoDB may need more time on:

- First run (data initialization)
- Slower systems
- Systems with high I/O wait

### Issue 3: Log Format Issues

The logs show extreme verbosity and duplication, suggesting the container is outputting logs in an unusual format.

## Solution: Simplified Healthcheck

### Updated Configuration

```yaml
healthcheck:
  test: echo 'db.runCommand("ping").ok' | mongosh localhost:27017/test --quiet
  interval: 10s
  timeout: 5s
  retries: 5
  start_period: 60s  # Increased from 40s to 60s
```

### Key Changes

1. **Piped command instead of --eval**: More reliable with mongosh
2. **Explicit localhost**: Avoids DNS resolution issues
3. **Simple test database**: Uses /test instead of admin
4. **Longer start_period**: 60s gives MongoDB plenty of time to initialize

## Alternative: Disable Healthcheck for Development

If healthchecks continue to cause issues, you can disable them for development:

```yaml
healthcheck:
  disable: true
```

Or remove the healthcheck entirely and use a simple dependency:

```yaml
depends_on:
  - mongodb
  # Remove: condition: service_healthy
```

## How to Apply

### Option 1: Pull and Restart (Recommended)

```bash
git pull origin main
docker compose -f docker-compose.dev.yml down -v
docker compose -f docker-compose.dev.yml up -d
docker compose -f docker-compose.dev.yml logs -f mongodb
```

### Option 2: Disable Healthcheck (Quick Fix)

Edit `docker-compose.dev.yml` and replace the healthcheck section with:

```yaml
healthcheck:
  disable: true
```

Then restart:

```bash
docker compose -f docker-compose.dev.yml down -v
docker compose -f docker-compose.dev.yml up -d
```

## Verification

### Check Container Status

```bash
docker ps --format "table {{.Names}}\t{{.Status}}"
```

### Check MongoDB Connection

```bash
docker exec normogen-mongodb-dev mongosh --eval "db.adminCommand('ping')"
```

### Check Health Status

```bash
docker inspect normogen-mongodb-dev --format='{{json .State.Health}}' | jq
```

## Common Issues and Fixes

### Issue: Port Already in Use

```bash
# Check what's using port 27017
sudo lsof -i :27017

# Kill the process if needed
sudo kill -9 <PID>
```

### Issue: Corrupted Volume

```bash
# Remove the volume and start fresh
docker compose -f docker-compose.dev.yml down -v
docker compose -f docker-compose.dev.yml up -d
```

### Issue: mongosh Not Found

This shouldn't happen with mongo:6.0, which ships mongosh (the legacy mongo shell was removed in 6.0). If it does:

```bash
# Verify mongosh exists
docker exec normogen-mongodb-dev which mongosh

# On older images (5.x and earlier), fall back to the legacy mongo shell
docker exec normogen-mongodb-dev which mongo
```

## Development vs Production

### Development (Current)

- Healthcheck enabled, but with generous timeouts
- Focus on getting up and running quickly
- Can disable the healthcheck if it causes issues

### Production

- Healthcheck is critical
- Must use a proper healthcheck with monitoring
- Consider using orchestration tools (Kubernetes, etc.)
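
For example, under Kubernetes the equivalent of this compose healthcheck would be a liveness probe on the mongod container. This is a sketch, not a tested manifest; the numbers mirror the compose settings used in this document:

```yaml
livenessProbe:
  exec:
    command:
      - mongosh
      - --quiet
      - --eval
      - db.adminCommand('ping').ok
  initialDelaySeconds: 60   # mirrors start_period
  periodSeconds: 10         # mirrors interval
  timeoutSeconds: 5
  failureThreshold: 5       # mirrors retries
```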
@@ -1,111 +0,0 @@
# MongoDB Docker Issue: /var Filesystem Full

## Problem Summary

MongoDB container was failing with:

```
WiredTiger error: No space left on device (error 28)
fatal log failure
WT_PANIC: WiredTiger library panic
```

## Root Cause

While the root filesystem (`/`) had 300GB+ free space, the **`/var` filesystem was 100% full**.

### Why This Affected MongoDB

Docker stores all persistent data in `/var/lib/docker`:

- Container volumes: `/var/lib/docker/volumes/`
- Container images: `/var/lib/docker/image/`
- Container logs: `/var/lib/docker/containers/`
- OverlayFS layers: `/var/lib/docker/overlay2/`

MongoDB's `/data/db` is mapped to a Docker volume in `/var/lib/docker/volumes/`,
so even with 300GB+ free on `/`, MongoDB couldn't write to `/var`.

## How to Detect This Issue

### Check All Filesystems

```bash
# Check all mounted filesystems
df -h

# Look for filesystems at 100%
df -h | grep -E '100%|Filesystem'
```

### Check Docker Data Location

```bash
# Check where Docker stores data
docker system info | grep 'Docker Root Dir'

# Check space usage in Docker directory
sudo du -sh /var/lib/docker/*
```

## Solutions

### Immediate Fix: Free Up Space in /var

```bash
# Clean Docker (frees space in /var/lib/docker)
docker system prune -a --volumes -f

# Clean package caches
sudo apt clean
sudo apt autoclean

# Clean logs
sudo journalctl --vacuum-time=3d
```

### Monitor /var Space

```bash
# Add to crontab for alerts
crontab -e
# Add this line:
*/5 * * * * df -P /var | tail -1 | awk '{if ($5+0 > 90) print "/var is " $5 " full"}'
```

## Lessons Learned

1. **Check all filesystems**, not just root (`/`)
2. **Docker data lives in `/var`** by default
3. **Separate mounts** can have different space availability
4. **Monitor `/var` separately** when running Docker
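
Putting lesson 1 into practice, a one-shot check across every mounted filesystem could look like this sketch (the 90% threshold is illustrative):

```shell
#!/bin/sh
# Flag any mounted filesystem above 90% usage; checks every mount, not just /.
df -P | awk 'NR > 1 {
  gsub("%", "", $5)            # strip the % sign from the Use% column
  if ($5 + 0 > 90)
    printf "WARNING: %s is %s%% full\n", $6, $5
}'
echo "disk check complete"
```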

## Verification

After fixing /var space, verify:

```bash
# Check /var has free space
df -h /var

# Check the MongoDB container is running
docker ps | grep mongodb

# Check MongoDB is healthy
docker inspect normogen-mongodb-dev --format='{{.State.Health.Status}}'

# Check MongoDB logs
docker logs normogen-mongodb-dev | grep "waiting for connections"
```

## Expected Success

After fixing /var space:

```
$ df -h /var
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sdb1   50G   15G   35G    30%   /var

$ docker ps
CONTAINER ID  IMAGE      STATUS
abc123        mongo:6.0  Up 2 minutes (healthy)

$ docker logs normogen-mongodb-dev
{"msg":"Waiting for connections on port 27017"}
```