docs(docker): Add /var space issue documentation and monitoring scripts
parent f0b5109f61
commit b0318430ad
4 changed files with 347 additions and 0 deletions
154
backend/DAILY-SUMMARY.md
Normal file
@@ -0,0 +1,154 @@
# Today's Docker Issues - Summary and Solutions

## Date: 2026-02-15

## Issues Fixed

### 1. Edition 2024 Error
**Problem:** Rust 1.83-alpine does not support Edition 2024, which was stabilized in Rust 1.85
**Solution:** Updated Dockerfiles to use Rust 1.93-slim
**Files Modified:**
- `backend/docker/Dockerfile`
- `backend/docker/Dockerfile.dev`
**Documentation:** `backend/docker/EDITION2024-FIX.md`

### 2. MongoDB Healthcheck Configuration
**Problem:** The healthcheck was timing out and used an overly complex command
**Solution:** Simplified the healthcheck and added a 60s startup grace period
**Files Modified:**
- `backend/docker-compose.dev.yml`
**Documentation:** `backend/docker/MONGODB-TROUBLESHOOTING.md`

### 3. MongoDB Disk Space Issue
**Problem:** MongoDB was crashing with a "No space left on device" error
**Root Cause:** The `/var` filesystem was 100% full (the root `/` filesystem was not)
**Solution:** Freed up space in the `/var` filesystem
**Key Insight:** Docker stores volumes in `/var/lib/docker/volumes/` by default, so free space on `/var` matters more than free space on `/` for MongoDB
**Documentation:** `backend/docker/MONGODB-VAR-FULL-ISSUE.md`

## Lessons Learned

1. **Always check all filesystems** with `df -h`, not just root (`/`)
2. **Docker data location matters** - `/var/lib/docker` by default
3. **Separate mounts have different space** - `/var` can be full while `/` has space
4. **Monitor Docker space usage** regularly with `docker system df`

## Prevention Setup

### Regular Monitoring
Add to crontab:
```bash
# Check disk space every hour
0 * * * * /path/to/normogen/backend/scripts/check-disk-space.sh

# Clean Docker weekly
0 2 * * 0 docker system prune -f --filter "until=168h"
```

### Manual Checks
```bash
# Quick space check
df -h

# Docker space usage
docker system df

# Verify stack is running
./backend/scripts/verify-stack.sh
```

## Documentation Created

1. `backend/docker/EDITION2024-FIX.md` - Edition 2024 fix
2. `backend/docker/MONGODB-TROUBLESHOOTING.md` - MongoDB issues
3. `backend/docker/MONGODB-PERMISSIONS-FIX.md` - Permissions guide
4. `backend/docker/MONGODB-DISKSPACE-FIX.md` - Disk space guide
5. `backend/docker/MONGODB-VAR-FULL-ISSUE.md` - /var space issue
6. `backend/docker/DOCKER-COMMANDS.md` - Docker commands reference
7. `backend/scripts/check-disk-space.sh` - Space monitoring
8. `backend/scripts/verify-stack.sh` - Stack verification
9. `backend/diagnose-mongodb.sh` - MongoDB diagnostics

## Quick Reference

### Start the Stack
```bash
cd backend
docker compose -f docker-compose.dev.yml up -d
```

### Check Status
```bash
docker compose -f docker-compose.dev.yml ps
docker ps | grep normogen
```

### View Logs
```bash
# All services
docker compose -f docker-compose.dev.yml logs -f

# MongoDB only
docker logs -f normogen-mongodb-dev

# Backend only
docker logs -f normogen-backend-dev
```

### Stop the Stack
```bash
docker compose -f docker-compose.dev.yml down
```

### Clean Restart
```bash
# Warning: -v also removes the named volumes, so existing MongoDB data is deleted
docker compose -f docker-compose.dev.yml down -v
docker compose -f docker-compose.dev.yml up -d
```

## Success Indicators

When everything is working, you should see:

1. **Containers running:**
```
$ docker ps | grep normogen
normogen-mongodb-dev   Up X minutes (healthy)   0.0.0.0:27017->27017/tcp
normogen-backend-dev   Up X minutes             0.0.0.0:6800->8000/tcp
```

2. **MongoDB logs:**
```
{"msg":"Waiting for connections on port 27017"}
```

3. **Backend logs:**
```
Server is running on http://0.0.0.0:8000
```

4. **Healthcheck:**
```
$ docker inspect normogen-mongodb-dev --format='{{.State.Health.Status}}'
healthy
```

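The healthcheck status can take a while to reach `healthy` after startup, so a script that checks it once may report a false failure. The following is a minimal sketch of a polling loop, assuming the container name used in this document. The helper names `container_health` and `wait_for_healthy`, the retry count, and the delay are illustrative and not part of the repository.

```bash
# Hypothetical helper: print the Docker health status of a container.
container_health() {
  docker inspect "$1" --format '{{.State.Health.Status}}' 2>/dev/null
}

# Hypothetical helper: poll until the container reports "healthy".
# Arguments: container name, number of checks (default 30), delay in seconds (default 2).
wait_for_healthy() {
  local name=$1
  local attempts=${2:-30}
  local delay=${3:-2}
  local i=1
  local status=unknown
  while [ "$i" -le "$attempts" ]; do
    status=$(container_health "$name") || status=unknown
    if [ "$status" = "healthy" ]; then
      echo "$name is healthy"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$name not healthy after $attempts checks (last status: $status)" >&2
  return 1
}
```

For example, `wait_for_healthy normogen-mongodb-dev 30 2` would wait up to about a minute and return a non-zero exit status if the container never becomes healthy, which makes it usable in scripts and CI steps.
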
## Git Commits

1. `d63f160` - fix(docker): Update to Rust 1.93 to support Edition 2024
2. `b218594` - fix(docker): Fix MongoDB healthcheck configuration
3. `b068579` - fix(docker): Simplify MongoDB healthcheck and add troubleshooting
4. `f0b5109` - fix(docker): Document MongoDB disk space issue and solutions

## Next Steps

1. Build completed successfully
2. `/var` space issue resolved
3. Verify stack is running with `./backend/scripts/verify-stack.sh`
4. Test API endpoints
5. Continue with Phase 2.4 development

---

**Last Updated:** 2026-02-15
**Status:** All issues resolved
111
backend/docker/MONGODB-VAR-FULL-ISSUE.md
Normal file
@@ -0,0 +1,111 @@
# MongoDB Docker Issue: /var Filesystem Full

## Problem Summary

MongoDB container was failing with:
```
WiredTiger error: No space left on device (error 28)
fatal log failure
WT_PANIC: WiredTiger library panic
```

## Root Cause

While the root filesystem (`/`) had 300GB+ free space, the **`/var` filesystem was 100% full**.

### Why This Affected MongoDB

By default, Docker stores all persistent data under `/var/lib/docker`:
- Container volumes: `/var/lib/docker/volumes/`
- Container images: `/var/lib/docker/image/`
- Container logs: `/var/lib/docker/containers/`
- OverlayFS layers: `/var/lib/docker/overlay2/`

MongoDB's `/data/db` is mapped to a Docker volume in `/var/lib/docker/volumes/`,
so even with 300GB+ free on `/`, MongoDB could not write its data once `/var` was full.

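To confirm which filesystem actually backs Docker's data on a given host, the Docker root directory can be resolved to its mount point. This is a minimal sketch under the assumption that the `docker` CLI is installed; the helper names `mount_for_path` and `docker_root_dir` are hypothetical, and the fallback to `/var/lib/docker` is only the documented default location.

```bash
# Hypothetical helper: print the mount point that contains the given path.
# df -P keeps each filesystem on one line; the last field is the mount point.
mount_for_path() {
  df -P "$1" | awk 'NR == 2 { print $NF }'
}

# Hypothetical helper: print Docker's root directory, falling back to the
# default location if the docker CLI or daemon is not reachable.
docker_root_dir() {
  local dir
  dir=$(docker info --format '{{.DockerRootDir}}' 2>/dev/null) || dir=""
  echo "${dir:-/var/lib/docker}"
}
```

Running `mount_for_path "$(docker_root_dir)"` on the affected host would be expected to print `/var` when `/var` is a separate mount, and `/` when it is not.
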
## How to Detect This Issue

### Check All Filesystems
```bash
# Check all mounted filesystems
df -h

# Look for filesystems at 100%
df -h | grep -E '100%|Filesystem'
```

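The `grep` above only matches filesystems at exactly 100%, so a filesystem at 97% goes unnoticed. A threshold-based filter catches problems earlier. The sketch below reads `df -P` output on standard input; the function name `over_threshold` and the default limit of 90 are assumptions for illustration.

```bash
# Hypothetical helper: read `df -P` output on stdin and print every mount
# point whose usage is above the given percentage (default 90).
over_threshold() {
  local limit=${1:-90}
  awk -v limit="$limit" 'NR > 1 {
    use = $5
    sub(/%/, "", use)            # strip the percent sign
    if (use + 0 > limit + 0)     # force a numeric comparison
      print $6 " is " use "% full"
  }'
}
```

Typical use would be `df -P | over_threshold 90`. Mount points that contain spaces are not handled by this sketch.
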
### Check Docker Data Location
```bash
# Check where Docker stores data
docker system info | grep 'Docker Root Dir'

# Check space usage in Docker directory
sudo du -sh /var/lib/docker/*
```

## Solutions

### Immediate Fix: Free Up Space in /var

```bash
# Clean Docker (frees space in /var/lib/docker)
# Caution: -a removes all unused images and --volumes removes unused volumes,
# which can include the data of stopped containers
docker system prune -a --volumes -f

# Clean package caches
sudo apt clean
sudo apt autoclean

# Clean logs
sudo journalctl --vacuum-time=3d
```

### Monitor /var Space

```bash
# Add to crontab for alerts
crontab -e
# Add this line:
*/5 * * * * df /var | tail -1 | awk '{print $5}' | grep -v Use | awk '{if($1+0 > 90) print "/var is " $1 " full"}'
```

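The cron one-liner only prints a message. A variant that also returns a non-zero exit status can be reused by other scripts or monitoring tools. This is a sketch only: the function names `usage_percent` and `alert_if_full` and the exit codes are assumptions, not part of the committed scripts.

```bash
# Hypothetical helper: print the usage percentage (without %) of the
# filesystem that contains the given path.
usage_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: return 1 and print a message on stderr when usage is
# above the limit (default 90), 2 when usage cannot be determined, else 0.
alert_if_full() {
  local path=$1
  local limit=${2:-90}
  local used
  used=$(usage_percent "$path" 2>/dev/null)
  case $used in
    '' | *[!0-9]*) return 2 ;;   # missing path or non-numeric value
  esac
  if [ "$used" -gt "$limit" ]; then
    echo "$path is ${used}% full (limit ${limit}%)" >&2
    return 1
  fi
}
```

A cron entry could then call a small script that runs `alert_if_full /var 90`, so cron mails the message only when the limit is exceeded.
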
## Lessons Learned

1. **Check all filesystems**, not just root (`/`)
2. **Docker data lives in `/var`** by default
3. **Separate mounts** can have different space availability
4. **Monitor `/var` separately** when running Docker

## Verification

After fixing /var space, verify:

```bash
# Check /var has free space
df -h /var

# Check MongoDB container is running
docker ps | grep mongodb

# Check MongoDB is healthy
docker inspect normogen-mongodb-dev --format='{{.State.Health.Status}}'

# Check MongoDB logs (-i because the log message starts with "Waiting")
docker logs normogen-mongodb-dev 2>&1 | grep -i "waiting for connections"
```

## Expected Success

After fixing /var space:
```
$ df -h /var
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1        50G   15G   35G  30% /var

$ docker ps
CONTAINER ID   IMAGE       STATUS
abc123         mongo:6.0   Up 2 minutes (healthy)

$ docker logs normogen-mongodb-dev
{"msg":"Waiting for connections on port 27017"}
```
40
backend/scripts/check-disk-space.sh
Executable file
@@ -0,0 +1,40 @@
#!/bin/bash
# Monitor disk space on all filesystems
# Run this periodically to catch space issues early

echo "================================"
echo "Disk Space Check - $(date)"
echo "================================"
echo ""

# Check all filesystems
echo "All Filesystems:"
df -h
echo ""

# Check specifically /var
echo "/var Filesystem:"
df -h /var 2>/dev/null || echo "No separate /var filesystem"
echo ""

# Check Docker data location
echo "Docker Data Location:"
docker system info 2>/dev/null | grep "Docker Root Dir" || echo "Docker not accessible"
echo ""

# Check Docker space usage
echo "Docker Space Usage:"
docker system df 2>/dev/null || echo "Cannot get Docker stats"
echo ""

# Alert if any filesystem is > 90% full
echo "Alerts (filesystems > 90% full):"
# Collect alerts once; -P keeps each filesystem on a single line
ALERTS=$(df -P | awk 'NR>1 {gsub(/%/,"",$5); if ($5+0 > 90) print $6 " is " $5 "% full"}')
if [ -n "$ALERTS" ]; then
    echo "$ALERTS"
else
    echo " No alerts"
fi
echo ""

echo "================================"
echo "Check complete"
echo "================================"
42
backend/scripts/verify-stack.sh
Executable file
@@ -0,0 +1,42 @@
#!/bin/bash
# Verify MongoDB and Backend are running correctly

echo "================================"
echo "Normogen Stack Verification"
echo "================================"
echo ""

# Check containers are running
echo "1. Checking containers..."
docker ps --format "table {{.Names}} {{.Status}} {{.Ports}}" | grep normogen
echo ""

# Check MongoDB health
echo "2. Checking MongoDB health..."
MONGO_HEALTH=$(docker inspect normogen-mongodb-dev --format='{{.State.Health.Status}}' 2>/dev/null)
echo " MongoDB Health: $MONGO_HEALTH"
echo ""

# Check if MongoDB is accepting connections
echo "3. Testing MongoDB connection..."
docker exec normogen-mongodb-dev mongosh --eval 'db.runCommand({ping: 1})' --quiet 2>/dev/null && echo " OK MongoDB is responding" || echo " FAILED MongoDB not responding"
echo ""

# Check backend logs
echo "4. Checking backend startup..."
docker logs normogen-backend-dev 2>&1 | tail -5
echo ""

# Show recent MongoDB logs (case-insensitive so "Waiting for connections" matches)
echo "5. Recent MongoDB logs..."
docker logs normogen-mongodb-dev 2>&1 | grep -iE '(waiting|ready|started|error)' | tail -5
echo ""

# Check filesystem space
echo "6. Checking filesystem space..."
df -h | grep -E '(Filesystem|/var|/$)'
echo ""

echo "================================"
echo "Verification complete"
echo "================================"