Understanding Inference

Inference is the process of using a trained machine learning model to make predictions or generate outputs. In production environments, inference means deploying the model and using it to make decisions, either in real time or in batch.
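To make the idea concrete, the sketch below runs a single prediction against an exported ONNX model using ONNX Runtime. The file name model.onnx, the input name input, and the 1×3×224×224 input shape are assumptions for illustration; substitute your own model's details.

```python
# A minimal sketch of running inference on an exported ONNX model.
# Assumes a file named "model.onnx" with one input named "input" that
# accepts a 1x3x224x224 float tensor -- adjust to match your model.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")

# Build a dummy input batch; in practice this would be a preprocessed image.
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Run the model; passing None as the first argument returns all outputs.
outputs = session.run(None, {"input": dummy_input})
print(outputs[0].shape)
```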

Why is Inference Important?

  • Efficiency: It enables the use of trained models for real-time applications like object detection, natural language processing, and more.

  • Scalability: Inference allows models to be deployed across various platforms and devices, ensuring they work efficiently at scale.

  • Optimization: Optimizing models during inference can reduce latency and resource consumption, making them suitable for deployment on edge devices.

In Matrice.ai, the Inference tab provides a streamlined process to export, optimize, and evaluate models for production-ready deployment.

Step-by-Step Guide to Exporting a Model with Matrice.ai

This section will guide you through the steps to export and optimize your models on Matrice.ai.

  1. Navigate to the Inference tab from the dashboard:

[Screenshot: Inference Tab]

  2. Select the Create Inference button to open the inference form:

[Screenshot: Inference Tab]

  3. Enter the required information to add models for inference:

    • Choose an existing trained model to export from the dropdown menu.

    • Select the desired export format (e.g., ONNX, OpenVINO); the sketch after the screenshots below illustrates what an ONNX export involves.

    • Set the hyperparameters required for your model export.

[Screenshot: Model Selection]

[Screenshot: Export Process]
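Selecting ONNX as the export format serializes the trained network into a portable graph that runtimes such as ONNX Runtime or OpenVINO can load. Matrice.ai performs this step for you; purely for context, here is a rough sketch of the equivalent export in plain PyTorch, where the resnet18 model, the input shape, and the file name are illustrative assumptions, not the platform's internals.

```python
# Illustrative only: roughly what an ONNX export does under the hood.
# Matrice.ai runs the export for you; the model and shapes below are
# stand-ins, not the platform's actual implementation.
import torch
import torchvision.models as models

model = models.resnet18(weights=None)  # placeholder for a trained model
model.eval()

# ONNX export traces the model with a dummy input of the expected shape.
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=17,
)
```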

  4. Export the model by launching a new or dedicated instance:

[Screenshot: Export Process]

  • Click the Create button to add your models for export.

[Screenshot: Export Process]

Note

Export your models seamlessly with our Python SDK for code-driven workflows or directly through the platform interface. Pick the approach that best fits your workflow and get started with exporting in no time!
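For the SDK route, a hypothetical sketch follows. The import path, client class, and method names below are placeholders invented for illustration, not the SDK's confirmed API; consult the Matrice.ai Python SDK documentation for the actual calls.

```python
# Hypothetical sketch of a code-driven export. Every name here
# (MatriceClient, create_export, the parameter names) is a placeholder;
# check the Matrice.ai Python SDK docs for the real API.
from matrice import MatriceClient  # placeholder import path

client = MatriceClient(api_key="YOUR_API_KEY")

# Request an export of a trained model in ONNX format.
export_job = client.create_export(
    model_id="YOUR_MODEL_ID",
    export_format="ONNX",
)

# Wait for the export to finish, then download the artifact.
export_job.wait_until_complete()
export_job.download("model.onnx")
```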