Disk fills / database grows unbounded
Symptom
The box runs low on disk, and the observability database file is enormous relative to how much data it should hold: for example, tens of GB on disk while only ~1 GB of predictions are actually retained. Deleting old rows (or shortening retention) does not shrink the file.
The cause: old rows are deleted on schedule, but on a database created without incremental auto-vacuum the freed pages are kept in an internal free-list and never returned to the operating system. The file stays at its high-water mark forever.
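The effect is easy to reproduce on a throwaway file. The following is a minimal sketch using Python's standard `sqlite3` module; the `predictions` table and its columns are hypothetical stand-ins, not the backend's real schema.

```python
import os
import sqlite3
import tempfile


def demo_freelist_bloat(rows: int = 20_000) -> dict:
    """Fill a throwaway database, delete every row, and report what happened to the file."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = sqlite3.connect(path)
    # Set the mode explicitly (before any table exists) so the demo does not
    # depend on how the local SQLite build was compiled.
    conn.execute("PRAGMA auto_vacuum = NONE")
    # Hypothetical table, for illustration only.
    conn.execute("CREATE TABLE predictions (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO predictions (payload) VALUES (?)",
        (("x" * 200,) for _ in range(rows)),
    )
    conn.commit()
    size_full = os.path.getsize(path)

    # Delete everything: the pages move to the free-list, the file keeps its size.
    conn.execute("DELETE FROM predictions")
    conn.commit()
    result = {
        "size_full": size_full,
        "size_after_delete": os.path.getsize(path),
        "free_pages": conn.execute("PRAGMA freelist_count").fetchone()[0],
        "rows_left": conn.execute("SELECT COUNT(*) FROM predictions").fetchone()[0],
    }
    conn.close()
    return result
```

Calling `demo_freelist_bloat()` should show zero rows left, a non-zero `free_pages` count, and a file that is no smaller after the delete than before it.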
Confirm
Check the database file size inside the backend container against the expected live size:

    docker compose -f docker-compose.release.yml exec backend \
      sh -c 'ls -lh /data/aiboard.db'

If the file is many times larger than the volume of data your retention window should hold (a multi-GB file for a few days of predictions), you have free-list bloat. A single high-rate write burst can inflate the file far past steady-state size.
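To put a number on the bloat rather than eyeballing the file size, SQLite exposes the free-list length directly. This is a sketch that assumes Python is available wherever the database file (or a copy of it) can be read; `freelist_report` is an illustrative helper, not part of the product.

```python
import pathlib
import sqlite3


def freelist_report(db_path: str) -> dict:
    """Report how much of a SQLite file consists of free-list (reclaimable) pages."""
    # mode=ro opens the file read-only and fails if it does not exist,
    # instead of silently creating an empty database.
    uri = pathlib.Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]
        page_count = conn.execute("PRAGMA page_count").fetchone()[0]
        free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
    finally:
        conn.close()
    return {
        "file_bytes": page_size * page_count,
        "reclaimable_bytes": page_size * free_pages,
        "free_fraction": free_pages / page_count if page_count else 0.0,
    }
```

A `free_fraction` close to 1.0 means almost the whole file is free-list pages, which matches the symptom described above.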
Current releases reclaim space automatically after each retention sweep, so a healthy box self-corrects. A database that bloated before that behavior was in place needs a one-time reclaim.
- Stop the backend so the database is not being written during the reclaim:

      docker compose -f docker-compose.release.yml stop backend

- Run the one-time reclaim against the database file. This converts the file to incremental auto-vacuum and compacts it, returning the free-list pages to the OS:

      docker compose -f docker-compose.release.yml run --rm --entrypoint sh backend -c \
        'sqlite3 /data/aiboard.db "PRAGMA auto_vacuum=INCREMENTAL; VACUUM;"'

- Verify the file shrank and the data is intact:

      docker compose -f docker-compose.release.yml run --rm --entrypoint sh backend -c \
        'ls -lh /data/aiboard.db; sqlite3 /data/aiboard.db "PRAGMA integrity_check;"'
      # expect a much smaller file and: ok

- Start the backend again:

      docker compose -f docker-compose.release.yml start backend
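If you want to rehearse the reclaim on a copy of the database before touching the real file, the same two statements can be driven from Python's standard `sqlite3` module. This is only a sketch of the equivalent sequence; `reclaim` is a hypothetical helper, and it should be pointed at a copy, not at a file the backend is writing.

```python
import os
import sqlite3


def reclaim(db_path: str) -> dict:
    """Switch a database to incremental auto-vacuum and compact it with VACUUM."""
    size_before = os.path.getsize(db_path)
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM refuses to run inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("PRAGMA auto_vacuum = INCREMENTAL")
        conn.execute("VACUUM")  # rebuilds the file; the new mode takes effect here
        mode = conn.execute("PRAGMA auto_vacuum").fetchone()[0]
        integrity = conn.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        conn.close()
    return {
        "size_before": size_before,
        "size_after": os.path.getsize(db_path),
        "auto_vacuum": mode,      # 0 = none, 1 = full, 2 = incremental
        "integrity": integrity,   # "ok" on a healthy file
    }
```

On a bloated file, `size_after` should be well below `size_before`, `auto_vacuum` should read 2, and `integrity` should be `ok`. VACUUM writes a fresh copy of the database before replacing the old one, so it needs temporary free disk space roughly on the order of the live data size.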
Prevent
- Keep retention bounded. The retention window (InferenceObservability:RetentionDays, default 3 days) deletes old prediction rows on a schedule; the post-delete reclaim returns the freed pages to the operating system so the file stays lean.
- Cap row count as a burst guard. InferenceObservability:MaxRows (default 5,000,000) trims to the newest N rows even inside the time window, so a sudden high-rate spike cannot fill the disk before the time-based sweep runs. Lower it if your box has limited disk.
- Be careful with throughput/stub testing. High-rate write bursts (for example sensor-pipeline throughput tests) are what inflate the file in the first place. Avoid leaving a high-rate test running against persisted storage on a production box.
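How the time window, the row cap, and the post-delete reclaim fit together can be sketched as follows. This is an illustration under stated assumptions, not the backend's actual sweep: the `predictions` table, its `created_at` column (Unix seconds), and the `retention_sweep` function are all hypothetical, and the database is assumed to already be in incremental auto-vacuum mode.

```python
import sqlite3
import time

SECONDS_PER_DAY = 86_400


def retention_sweep(conn, retention_days=3, max_rows=5_000_000, now=None):
    """Delete expired rows, trim to the newest max_rows, then release freed pages."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * SECONDS_PER_DAY

    # 1. Time window: drop everything older than the retention cutoff.
    deleted = conn.execute(
        "DELETE FROM predictions WHERE created_at < ?", (cutoff,)
    ).rowcount

    # 2. Row cap: keep only the newest max_rows, even inside the window.
    deleted += conn.execute(
        "DELETE FROM predictions WHERE id NOT IN ("
        " SELECT id FROM predictions ORDER BY created_at DESC, id DESC LIMIT ?)",
        (max_rows,),
    ).rowcount
    conn.commit()

    # 3. Reclaim: the pragma must be stepped to completion to release all
    #    free pages, hence fetchall() rather than a bare execute().
    conn.execute("PRAGMA incremental_vacuum").fetchall()
    return deleted
```

The `NOT IN` subquery is the simplest way to express "newest N rows" and is fine for a sketch; a production sweep would more likely compute a boundary id or timestamp once and delete below it.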
Related
- Observability & Alerts: what the observability database stores.