Deploying Models

The Xisom runtime runs models in ONNX format. You upload a model, pair it with an input datasource, then enable the datasource and start streaming to begin inference.

  • ONNX — the runtime accepts ONNX models and executes them with the execution provider that matches your hardware (TensorRT, CUDA, or CPU). See Hardware Setup for the modes.
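
The page does not say which engine sits under the Xisom runtime. Assuming it is built on ONNX Runtime (an assumption, suggested only by the term "execution provider"), matching a provider to the hardware could look roughly like the sketch below. `PREFERENCE`, `pick_provider`, and `available_providers` are hypothetical names for illustration; the provider strings are ONNX Runtime's own identifiers.

```python
from typing import Sequence

# ONNX Runtime's identifiers for the three providers named above,
# in the order the text lists them (TensorRT, CUDA, CPU).
PREFERENCE = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)


def pick_provider(available: Sequence[str]) -> str:
    """Return the first preferred provider that appears in `available`."""
    for name in PREFERENCE:
        if name in available:
            return name
    raise RuntimeError(f"no supported execution provider in {list(available)}")


def available_providers() -> list[str]:
    """Ask ONNX Runtime which providers this machine offers."""
    # Imported lazily so pick_provider stays usable without onnxruntime installed.
    import onnxruntime as ort

    return ort.get_available_providers()
```

On a machine with onnxruntime installed, `pick_provider(available_providers())` returns the name to expect. How Xisom actually selects a provider is described in Hardware Setup, not here.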
  1. Go to Models and upload your .onnx file. The platform validates the file (extension, type, and schema) before it is available.

  2. On an input datasource, pair this model. The compatibility check runs at this point: if the datasource's window size or feature count does not match what the model expects, the pairing is rejected. Adjust the datasource window size or regenerate the model to match.

  3. Enable the datasource, then Start streaming. The runtime loads the paired model into the inference service and begins producing predictions. See Your First Inference for the full loop.
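
To catch a window-size or feature-count mismatch before step 2, you can inspect the model locally. The sketch below is not the platform's compatibility check. It assumes the model has a single input laid out as (batch, window, features), which this page does not confirm. `check_pairing` and `read_input_shape` are hypothetical helpers; the `onnx` calls are the standard Python package API.

```python
from typing import Optional, Sequence


def check_pairing(
    input_shape: Sequence[Optional[int]],
    window_size: int,
    feature_count: int,
) -> list[str]:
    """Return mismatch messages; an empty list means the shapes agree.

    Assumes the layout (batch, window, features). None marks a dynamic dimension.
    """
    if len(input_shape) != 3:
        return [f"expected rank 3 (batch, window, features), got rank {len(input_shape)}"]
    _, model_window, model_features = input_shape
    problems = []
    if model_window is not None and model_window != window_size:
        problems.append(f"window size: model {model_window}, datasource {window_size}")
    if model_features is not None and model_features != feature_count:
        problems.append(f"feature count: model {model_features}, datasource {feature_count}")
    return problems


def read_input_shape(path: str) -> list[Optional[int]]:
    """Read the first graph input's shape from an .onnx file."""
    # Imported lazily so check_pairing works without the onnx package.
    import onnx

    model = onnx.load(path)
    onnx.checker.check_model(model)  # raises if the file is not a valid ONNX model
    # Caveat: in older ONNX files, graph.input can also list initializers.
    dims = model.graph.input[0].type.tensor_type.shape.dim
    return [d.dim_value if d.HasField("dim_value") else None for d in dims]
```

For example, `check_pairing(read_input_shape("model.onnx"), window_size=64, feature_count=8)` returns an empty list when the shapes agree; the file name and numbers here are placeholders.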

A fresh install ships with a demo model that has random weights, so its predictions are meaningless until you replace it. Upload your trained model as above, or drop it into the model-data directory and restart the inference service. The offline-bundle install page covers the file-drop path: Replace the demo model (step 5).

  • Upload rejected, pairing fails, or the runtime won’t load the model — see the Troubleshooting runbook.