Disk fills / database grows unbounded

The box runs low on disk, and the observability database file is far larger than the data it should hold: for example, tens of GB on disk while only ~1 GB of predictions is actually retained. Deleting old rows (or shortening retention) does not shrink the file.

The cause: old rows are deleted on schedule, but on a database created without incremental auto-vacuum, SQLite keeps the freed pages on an internal free-list and never returns them to the operating system. New writes can reuse those pages, but the file itself stays at its high-water mark until it is compacted.
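The effect can be reproduced on a throwaway file. The sketch below assumes a local `sqlite3` CLI; the table name `predictions` and the row count are made up for illustration and are not the product's real schema.

```shell
# Scratch demonstration on a throwaway file; never point this at the real database.
tmpdir=$(mktemp -d)
db="$tmpdir/scratch.db"

# auto_vacuum=NONE is SQLite's usual default; set it explicitly so the demo is unambiguous.
sqlite3 "$db" "
  PRAGMA auto_vacuum=NONE;
  CREATE TABLE predictions (id INTEGER PRIMARY KEY, payload TEXT);
  WITH RECURSIVE n(i) AS (SELECT 1 UNION ALL SELECT i + 1 FROM n WHERE i < 20000)
  INSERT INTO predictions (payload) SELECT hex(randomblob(100)) FROM n;
"
size_full=$(wc -c < "$db" | tr -d ' ')

# Delete every row: the pages move to the free-list, the file keeps its size.
sqlite3 "$db" "DELETE FROM predictions;"
size_after_delete=$(wc -c < "$db" | tr -d ' ')
free_pages=$(sqlite3 "$db" "PRAGMA freelist_count;")

# VACUUM rebuilds the file without the free pages.
sqlite3 "$db" "VACUUM;"
size_after_vacuum=$(wc -c < "$db" | tr -d ' ')

echo "full=$size_full after_delete=$size_after_delete free_pages=$free_pages after_vacuum=$size_after_vacuum"
rm -rf "$tmpdir"
```

The size after the delete should match the full size, with a non-zero free-page count. Only the `VACUUM` makes the file smaller.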

Check the database file size inside the backend container against the expected live size:

docker compose -f docker-compose.release.yml exec backend \
sh -c 'ls -lh /data/aiboard.db'

If the file is many times larger than the volume of data your retention window should hold (a multi-GB file for a few days of predictions), you have free-list bloat. A single high-rate write burst can inflate the file far past steady-state size.
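To put a number on the bloat rather than judge it by file size alone, SQLite's own page counters can be read. This is a sketch: it assumes the `sqlite3` CLI is available in the running backend container (the reclaim step in this guide uses it in the same image), and the helper names `free_pct` and `report_bloat` are illustrative.

```shell
# Percentage of the file occupied by free-list pages (integer, rounded down).
#   $1 = page_count, $2 = freelist_count
free_pct() {
  if [ "$1" -gt 0 ]; then
    echo $(( $2 * 100 / $1 ))
  else
    echo 0
  fi
}

# Read the counters from the live database file and summarize them.
report_bloat() {
  # Word splitting is intentional: the three PRAGMAs print one integer per line.
  # -T disables the pseudo-TTY so the output has no carriage returns.
  set -- $(docker compose -f docker-compose.release.yml exec -T backend \
    sqlite3 /data/aiboard.db \
    "PRAGMA page_count; PRAGMA freelist_count; PRAGMA page_size;")
  echo "pages=$1 free_pages=$2 page_size=$3 free_pct=$(free_pct "$1" "$2")%"
  echo "reclaimable_bytes=$(( $2 * $3 ))"
}
```

Calling `report_bloat` on a bloated box should show a free-page percentage close to 100. A healthy database should show a small one.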

Current releases reclaim space automatically after each retention sweep, so a healthy box self-corrects. A database that bloated before that behavior was in place needs a one-time reclaim.

  1. Stop the backend so the database is not being written during the reclaim:

    docker compose -f docker-compose.release.yml stop backend
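  2. Run the one-time reclaim against the database file. This converts the file to incremental auto-vacuum and compacts it, returning the free-list pages to the OS. VACUUM rebuilds the database through a temporary copy, so it needs free disk space to run (per SQLite's documentation, up to twice the size of the original file in the worst case); free some space first if the disk is completely full: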

    docker compose -f docker-compose.release.yml run --rm --entrypoint sh backend -c \
    'sqlite3 /data/aiboard.db "PRAGMA auto_vacuum=INCREMENTAL; VACUUM;"'
  3. Verify the file shrank and the data is intact:

    docker compose -f docker-compose.release.yml run --rm --entrypoint sh backend -c \
    'ls -lh /data/aiboard.db; sqlite3 /data/aiboard.db "PRAGMA integrity_check;"'
    # expect a much smaller file and: ok
  4. Start the backend again:

    docker compose -f docker-compose.release.yml start backend
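The four steps above can be wrapped in one function so the backend is restarted even when the reclaim fails. This is a sketch, not a shipped tool: the `compose` and `reclaim_db` helpers and the `DRY_RUN` switch are assumptions added for illustration, and only the `docker compose` commands themselves come from the steps above.

```shell
COMPOSE_FILE="docker-compose.release.yml"
DB_PATH="/data/aiboard.db"

# Run a compose subcommand; with DRY_RUN set, print it instead of executing it.
compose() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "docker compose -f $COMPOSE_FILE $*"
  else
    docker compose -f "$COMPOSE_FILE" "$@"
  fi
}

reclaim_db() {
  # Step 1: stop writes. Abort if the backend cannot be stopped.
  compose stop backend || return 1

  # Step 2: convert to incremental auto-vacuum and compact the file.
  compose run --rm --entrypoint sh backend -c \
    "sqlite3 $DB_PATH 'PRAGMA auto_vacuum=INCREMENTAL; VACUUM;'"
  status=$?

  # Step 3: show the new size and check integrity (expect: ok).
  if [ "$status" -eq 0 ]; then
    compose run --rm --entrypoint sh backend -c \
      "ls -lh $DB_PATH; sqlite3 $DB_PATH 'PRAGMA integrity_check;'"
    status=$?
  fi

  # Step 4: always bring the backend back, even if the reclaim failed.
  compose start backend
  return "$status"
}
```

Running `DRY_RUN=1 reclaim_db` first prints the commands without touching the box. Restarting after a failed reclaim is a design choice here; leave the backend stopped instead if you want to investigate before it writes again.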
  • Keep retention bounded. The retention window (InferenceObservability:RetentionDays, default 3 days) deletes old prediction rows on a schedule; the post-delete reclaim returns the freed pages to the operating system so the file stays lean.
  • Cap row count as a burst guard. InferenceObservability:MaxRows (default 5,000,000) trims to the newest N rows even inside the time window, so a sudden high-rate spike cannot fill the disk before the time-based sweep runs. Lower it if your box has limited disk.
  • Be careful with throughput/stub testing. High-rate write bursts (for example sensor-pipeline throughput tests) are what inflate the file in the first place. Avoid leaving a high-rate test running against persisted storage on a production box.
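When lowering InferenceObservability:MaxRows for a small disk, a rough ceiling can be derived from a disk budget and an average row size. This is a back-of-the-envelope sketch: `suggest_max_rows` is a hypothetical helper, and the 500 bytes per row is a placeholder; measure your own figure (live database bytes divided by row count, which also covers index overhead).

```shell
# Suggest a row cap that fits a disk budget.
#   $1 = disk budget for the database, in GiB
#   $2 = average on-disk bytes per retained row (measure this; 500 below is a placeholder)
suggest_max_rows() {
  if [ "$2" -le 0 ]; then
    echo "average row size must be positive" >&2
    return 1
  fi
  echo $(( $1 * 1024 * 1024 * 1024 / $2 ))
}

suggest_max_rows 2 500   # prints 4294967
```

Treat the result as an upper bound and leave headroom: the file can temporarily exceed the live data size between a burst and the next sweep.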