Troubleshooting

Find your symptom in the table, then jump to the fix. Every runbook page follows the same shape: Symptom → Confirm → Fix → Prevent.

Find your symptom

If you see…	Go to
GPU installed but inference runs on CPU; log shows `CUDA failure 500: named symbol not found`	GPU not used — CUDA error 500
Inference container exits/restarts when GPU is enforced and the host GPU is broken	GPU not used — CUDA error 500
Host GPU works, but provider shows a lower tier than expected (e.g. CUDA instead of TensorRT, or CPU)	Running on CPU when GPU expected
Log line `Failed to load library libonnxruntime_providers_tensorrt.so`	Running on CPU when GPU expected
Enabling a datasource bounces back to Disabled with `model_not_paired` or `adapter_connect_failed`	Datasource down or faulted
A streaming OPC-UA / MQTT / CSV source stops; its row shows a fault indicator or a red banner	Datasource down or faulted
Datasource is Enabled but no predictions appear	Datasource down or faulted
Settings → About highlights the frontend / backend version in amber	Frontend / backend version mismatch
Version reads `0.0.0-dev…` or ends in `-dirty`	Frontend / backend version mismatch
Disk filling up; database file far larger than the data it holds; deleting rows doesn’t shrink it	Disk fills / database grows unbounded
A container shows `unhealthy` in `docker ps` but the app serves fine	Container marked unhealthy but service works
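
Several rows in the table are keyed on literal strings, so the lookup can be done mechanically when triaging a log dump. The following is a minimal sketch, not part of the product: the function name `runbook_for` and the regular expressions are assumptions made for illustration, while the matched strings and page titles are copied from the table above.

```python
import re

# Signatures are taken verbatim from the symptom table.
# The patterns and this helper are hypothetical, for illustration only.
SIGNATURES = [
    (re.compile(r"CUDA failure 500: named symbol not found"),
     "GPU not used — CUDA error 500"),
    (re.compile(r"Failed to load library libonnxruntime_providers_tensorrt\.so"),
     "Running on CPU when GPU expected"),
    (re.compile(r"model_not_paired|adapter_connect_failed"),
     "Datasource down or faulted"),
    # Version strings: a 0.0.0-dev prefix or a trailing -dirty suffix.
    (re.compile(r"(?<![\d.])0\.0\.0-dev|-dirty\s*$"),
     "Frontend / backend version mismatch"),
]


def runbook_for(text):
    """Return the runbook page title for a log line or version string, or None."""
    for pattern, page in SIGNATURES:
        if pattern.search(text):
            return page
    return None


if __name__ == "__main__":
    for sample in (
        "onnxruntime: CUDA failure 500: named symbol not found",
        "enable rejected: adapter_connect_failed",
        "1.4.2-dirty",
    ):
        print(sample, "->", runbook_for(sample))
```

Symptoms that are not tied to a literal string (disk growth, an `unhealthy` container that still serves, an Enabled datasource with no predictions) are left out on purpose; those still need the table.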