Create a Dataset Version
Creating a dataset version allows you to prepare your data for training a model.
A dataset version is a point-in-time snapshot of your dataset. By keeping track of the exact images, preprocessing steps, and augmentations used in each model iteration, you maintain the ability to reproduce results. This ensures that you can scientifically test various models and frameworks while confidently attributing results to model changes, not to bugs or changes in the data pipeline.
Once a version is created, it is frozen in time. This means that any subsequent changes to the project (such as adding or removing images, annotations, or other data) will not affect previously created versions.
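The "frozen in time" behavior can be pictured with a small sketch. This is not BinaExperts code: the `DatasetVersion` class, the `build_version` helper, and the step names such as `resize:640x640` are hypothetical and only illustrate that a version copies the project's state when it is built, so later edits to the project do not reach it.

```python
from dataclasses import dataclass, FrozenInstanceError
import hashlib
import json


@dataclass(frozen=True)
class DatasetVersion:
    """Hypothetical immutable snapshot of a project's images and settings."""
    image_ids: tuple
    preprocessing: tuple
    augmentations: tuple

    def fingerprint(self) -> str:
        # Hash the full contents so two versions can be compared cheaply.
        payload = json.dumps(
            [list(self.image_ids), list(self.preprocessing), list(self.augmentations)]
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_version(project_images, preprocessing, augmentations):
    # Copy everything into tuples so later edits to the project lists
    # cannot leak into the snapshot.
    return DatasetVersion(
        tuple(sorted(project_images)), tuple(preprocessing), tuple(augmentations)
    )


# Illustrative project state (file names and step names are made up).
project_images = ["img_001.jpg", "img_002.jpg"]
preprocessing = ["auto-orient", "resize:640x640"]
augmentations = ["flip:horizontal"]

v1 = build_version(project_images, preprocessing, augmentations)
fingerprint_before = v1.fingerprint()

# The project keeps changing after the version is built...
project_images.append("img_003.jpg")

# ...but the existing version is unaffected.
assert v1.fingerprint() == fingerprint_before

# Attempting to modify the snapshot itself is rejected.
try:
    v1.image_ids = ()
except FrozenInstanceError:
    print("version is read-only")

# A new version built now captures the new project state.
v2 = build_version(project_images, preprocessing, augmentations)
print(v1.fingerprint() == v2.fingerprint())
```

Comparing fingerprints is one simple way to confirm that two training runs used exactly the same data and settings.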
To create a dataset version:
I. Click "Versions" in the sidebar of your BinaExperts project.
II. Click "Build New Version".
From this page, you can:
· Set the train/test/validation split.
· Specify preprocessing steps.
· Define augmentations for your new dataset version.
After specifying the preprocessing steps and augmentations you want to apply, click "Build Train-Ready Version". This will generate a new dataset version. You can then use this dataset version to train a model in BinaExperts or export it for manual model training.
While creating a version, you can also rebalance your training, validation, and test sets. To do this:
· Go to "Step 4: Train/Test Split".
· Click on the option to adjust the split settings.
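For readers who export a version and want to reproduce a comparable split outside the platform, the idea can be sketched as follows. This is an assumption-laden illustration, not how BinaExperts performs the split internally: the `split_dataset` function, the 70/20/10 default ratios, and the fixed seed are all hypothetical choices.

```python
import random


def split_dataset(image_ids, train=0.7, valid=0.2, test=0.1, seed=42):
    """Deterministically split image IDs into train/valid/test lists.

    The default ratios and the seed are illustrative assumptions.
    """
    if abs(train + valid + test - 1.0) > 1e-9:
        raise ValueError("split fractions must sum to 1")

    # Sort first so the result does not depend on the input order,
    # then shuffle with a fixed seed so the split is reproducible.
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)

    n_train = round(len(ids) * train)
    n_valid = round(len(ids) * valid)
    return {
        "train": ids[:n_train],
        "valid": ids[n_train:n_train + n_valid],
        # The test set takes whatever remains, so no image is dropped.
        "test": ids[n_train + n_valid:],
    }


# Example with 100 made-up file names.
example_ids = [f"img_{i:03d}.jpg" for i in range(100)]
splits = split_dataset(example_ids)
for name, items in splits.items():
    print(name, len(items))
```

Fixing the seed keeps each image in the same set across runs, which matters when comparing models: an image that moves from the test set into the training set would inflate the reported results.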