Dataset Management

Once your dataset has been created, the platform offers a range of tools to manage and refine it. Dataset creation starts on the backend as soon as you click Create, and you are notified of its progress by email and through platform notifications.

After processing, your datasets are available on the Dataset Dashboard. Click a dataset name in the dataset table to open that dataset’s management page, where you can perform actions across several tabs.

Dashboard View

1. Summary

The Summary tab provides an overview of your dataset, including essential details and visualizations:

  • Number of Classes: See how many distinct classes are present in your dataset.

  • Total Number of Images: View the total image count within the dataset.

  • Version Details: Access information about the current dataset version.

You can also view your recent actions on the current dataset, which helps you keep track of the latest changes and updates.

This tab also includes data visualizations to help you understand your dataset better:

  • Split Distribution: Visualize how your dataset is divided across Train, Test, Val, and Unassigned categories.

  • Class Distribution: Analyze the distribution of images across different classes.
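The figures on this tab boil down to simple counts. As a rough sketch of how they could be reproduced offline, the example below tallies classes and splits with `collections.Counter`. The record layout (one class label and one split per image) and the `summarize` helper are assumptions made for illustration; they are not the platform's export format or API.

```python
from collections import Counter

# Hypothetical records: (image name, class label, split).
# This layout is assumed for illustration only.
records = [
    ("img_001.jpg", "cat", "train"),
    ("img_002.jpg", "dog", "train"),
    ("img_003.jpg", "cat", "val"),
    ("img_004.jpg", "dog", "test"),
    ("img_005.jpg", "cat", "unassigned"),
]


def summarize(records):
    """Return the counts behind a Summary-style overview."""
    class_counts = Counter(label for _, label, _ in records)
    split_counts = Counter(split for _, _, split in records)
    return {
        "num_classes": len(class_counts),
        "total_images": len(records),
        "class_distribution": dict(class_counts),
        "split_distribution": dict(split_counts),
    }


summary = summarize(records)
```

A strongly skewed `class_distribution` or a large `unassigned` count in `split_distribution` is usually the first thing worth checking before training.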

Dataset Summary

2. Add Data

The Add Data tab allows you to update your existing dataset by uploading additional images. The process follows the same steps as the initial dataset creation, whether you’re uploading from a local source or the cloud.

Add Dataset

3. Preview

In the Preview tab, you can explore the processed images in your dataset. This tab allows you to inspect individual images in detail, including:

  • Bounding Boxes: Check the bounding boxes drawn around objects in the images.

  • Classification Labels: View the labels assigned to each image or object.

  • Segmentation Masks: Inspect the segmentation masks applied to images.

Dataset Preview

Simply click on an image for a closer, more detailed view.

4. Analysis

The Analysis tab provides deeper insights into your dataset through advanced visualizations:

  • Image Height vs Width: Analyze the dimensions of your images.

  • Bounding Boxes Height vs Width: Understand the size distribution of bounding boxes.

  • Bounding Box Location: Visualize where bounding boxes are located within the images.

Dataset Analysis
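To make these three plots concrete, here is a minimal sketch of the quantities they would be built from. It assumes boxes are stored as pixel corners `(x_min, y_min, x_max, y_max)` alongside each image's width and height; that layout and the `box_stats` helper are hypothetical and may differ from how the platform stores annotations.

```python
# Hypothetical annotations: image size in pixels plus boxes given as
# (x_min, y_min, x_max, y_max). The schema is assumed for illustration.
annotations = [
    {"width": 640, "height": 480, "boxes": [(100, 120, 300, 360)]},
    {"width": 1280, "height": 720,
     "boxes": [(0, 0, 640, 360), (640, 360, 1280, 720)]},
]


def box_stats(annotations):
    """Collect the raw points for the three Analysis-style plots."""
    image_sizes = []   # (width, height) per image
    box_sizes = []     # (width, height) per bounding box, in pixels
    box_centers = []   # (x, y) box center, normalized to the range 0..1
    for ann in annotations:
        w, h = ann["width"], ann["height"]
        image_sizes.append((w, h))
        for x_min, y_min, x_max, y_max in ann["boxes"]:
            box_sizes.append((x_max - x_min, y_max - y_min))
            # Dividing by the image size makes locations comparable
            # across images of different resolutions.
            box_centers.append(((x_min + x_max) / 2 / w,
                                (y_min + y_max) / 2 / h))
    return image_sizes, box_sizes, box_centers


image_sizes, box_sizes, box_centers = box_stats(annotations)
```

Each list can be passed to a scatter plot: `image_sizes` for image height vs width, `box_sizes` for bounding box height vs width, and `box_centers` for bounding box location.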

5. Data Splitting

The Data Splitting tab offers two methods to split your dataset into Train, Test, Val, and Unassigned folders after processing:

Data Split

  1. New Random Splits: Specify the desired percentage for each split (e.g., Train, Test, Val), and the dataset will be randomly divided according to your inputs.

  2. Add to Existing Splits: Instead of random splitting, you can manually move a specific number of images from one split to another (e.g., transferring 500 images from the Train split to the Test split).
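The two methods can be sketched in a few lines of Python. This is an illustration of the idea rather than the platform's implementation: the function names, the fixed random seed, rounding counts down, sending leftover images to an `unassigned` split, and choosing the moved images at random are all assumptions.

```python
import random


def random_split(items, percentages, seed=0):
    """Randomly divide items by percentage; leftovers become 'unassigned'."""
    if sum(percentages.values()) > 100:
        raise ValueError("percentages must not exceed 100 in total")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    splits, start = {}, 0
    for name, pct in percentages.items():
        count = int(len(shuffled) * pct / 100)  # assumed: round down
        splits[name] = shuffled[start:start + count]
        start += count
    splits["unassigned"] = shuffled[start:]
    return splits


def move_images(splits, source, target, count, seed=0):
    """Move `count` items from one split to another (picked at random here)."""
    if count > len(splits[source]):
        raise ValueError("not enough images in the source split")
    moved = random.Random(seed).sample(splits[source], count)
    moved_set = set(moved)
    splits[source] = [x for x in splits[source] if x not in moved_set]
    splits.setdefault(target, []).extend(moved)
    return splits


# Hypothetical file names, used only to exercise the two functions.
images = [f"img_{i:04d}.jpg" for i in range(1000)]
splits = random_split(images, {"train": 70, "test": 20, "val": 10})
splits = move_images(splits, "train", "test", 50)
```

In this sketch the first call produces a 700/200/100 split of the 1,000 images, and the second shifts 50 of them from Train to Test without touching Val. Every image stays in exactly one split throughout.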

After selecting the split, choose the compute resource to launch the data split process.

6. Logs

The Logs tab records every action and operation performed on your dataset. This processing history helps you understand what changes have been made over time.

Dataset Logs


Together, these management features let you tailor your dataset to your project’s needs and give you the flexibility and control that machine learning workflows require.