Experiment Configs
The most comprehensive way to configure an experiment in minerva is to use a YAML file. This guide will walk through all the possible config options and their structure, and explain how to use the config file for an experiment. A good example to look at is example_config.yaml:
---
# * * __ ________ ____________ _ _____
# * * / |/ / _/ | / / ____/ __ \ | / / | * *
# * / /|_/ // // |/ / __/ / /_/ / | / / /| | *
# * * / / / // // /| / /___/ _, _/| |/ / ___ | *
# * /_/ /_/___/_/ |_/_____/_/ |_| |___/_/ |_| * *
#
# EXAMPLE MASTER CONFIG FILE
#
# === PATHS ===================================================================
data_root: tests/fixtures/data
results_dir: tests/tmp/results
cache_dir: tests/tmp/cache
# === HYPERPARAMETERS =========================================================
# ---+ Model Specification +---------------------------------------------------
# Name of model. This is no longer used for the model class (see model_params).
model_name: FCN32ResNet18-test
# Type of model. Can be mlp, scene classifier, segmentation, ssl or siamese.
model_type: segmentation
# ---+ Sizing +----------------------------------------------------------------
batch_size: 8 # Number of samples in each batch.
input_size: [4, 32, 32] # patch_size plus leading channel dim.
patch_size: '${to_patch_size: ${input_size}}' # 2D tuple or float.
n_classes: 8 # Number of classes in dataset.
# ---+ Experiment Execution +--------------------------------------------------
max_epochs: 4 # Maximum number of training epochs.
pre_train: false # Activate pre-training mode.
fine_tune: false # Activate fine-tuning mode.
elim: true # Eliminates empty classes from schema.
balance: true # Balances dataset classes.
torch_compile: true # Wrap model in `torch.compile`.
# ---+ Optimisers +------------------------------------------------------------
lr: 1.0E-2 # Learning rate of optimiser.
optim_func: SGD # Name of the optimiser function.
# ---+ Model Parameters +------------------------------------------------------
model_params:
_target_: minerva.models.FCN32ResNet18
input_size: ${input_size}
n_classes: ${n_classes}
# any other params...
# ---+ Optimiser Parameters +--------------------------------------------------
optimiser:
_target_: torch.optim.${optim_func}
lr: ${lr}
# ---+ Scheduler Parameters +--------------------------------------------------
scheduler:
_target_: torch.optim.lr_scheduler.LinearLR
start_factor: 1.0
end_factor: 0.5
total_iters: 5
# ---+ Loss Function Parameters +----------------------------------------------
loss_params:
_target_: torch.nn.CrossEntropyLoss
# ---+ Dataloader Parameters +-------------------------------------------------
loader_params:
num_workers: 0
pin_memory: true
# === MODEL IO & LOGGING ======================================================
# ---+ wandb Logging +---------------------------------------------------------
wandb_log: true # Activates wandb logging.
project: pytest # Define the project name for wandb.
wandb_dir: /test/tmp/wandb # Directory to store wandb logs locally.
# ---+ Collator +--------------------------------------------------------------
collator: torchgeo.datasets.stack_samples
# === TASKS ===================================================================
tasks:
fit-train:
_target_: minerva.tasks.StandardEpoch
train: true
record_float: true
imagery_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/NAIP.yaml}}' # yamllint disable-line rule:line-length
data_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/Chesapeake7.yaml}}' # yamllint disable-line rule:line-length
# ---+ Dataset Parameters +----------------------------------------
dataset_params:
sampler:
_target_: torchgeo.samplers.RandomGeoSampler
roi: false
size: ${patch_size}
length: 32
image:
transforms: false
subdatasets:
images_1:
_target_: minerva.datasets.__testing.TstImgDataset
paths: NAIP
res: 1.0
image2:
_target_: minerva.datasets.__testing.TstImgDataset
paths: NAIP
res: 1.0
mask:
transforms: false
_target_: minerva.datasets.__testing.TstMaskDataset
paths: Chesapeake7
res: 1.0
fit-val:
_target_: minerva.tasks.StandardEpoch
train: false
record_float: true
imagery_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/NAIP.yaml}}' # yamllint disable-line rule:line-length
data_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/Chesapeake7.yaml}}' # yamllint disable-line rule:line-length
# ---+ Minerva Inbuilt Logging Functions +-------------------------
task_logger: minerva.logger.tasklog.SupervisedTaskLogger
model_io: minerva.modelio.supervised_torchgeo_io
# ---+ Dataset Parameters +----------------------------------------
dataset_params:
sampler:
_target_: torchgeo.samplers.RandomGeoSampler
roi: false
size: ${patch_size}
length: 32
image:
transforms: false
_target_: minerva.datasets.__testing.TstImgDataset
paths: NAIP
res: 1.0
mask:
transforms: false
_target_: minerva.datasets.__testing.TstMaskDataset
paths: Chesapeake7
res: 1.0
test-test:
_target_: minerva.tasks.StandardEpoch
record_float: true
imagery_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/NAIP.yaml}}' # yamllint disable-line rule:line-length
data_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/Chesapeake7.yaml}}' # yamllint disable-line rule:line-length
# ---+ Minerva Inbuilt Logging Functions +-------------------------
task_logger: minerva.logger.tasklog.SupervisedTaskLogger
model_io: minerva.modelio.supervised_torchgeo_io
# ---+ Dataset Parameters +----------------------------------------
dataset_params:
sampler:
_target_: torchgeo.samplers.RandomGeoSampler
roi: false
size: ${patch_size}
length: 32
image:
transforms: false
_target_: minerva.datasets.__testing.TstImgDataset
paths: NAIP
res: 1.0
mask:
transforms: false
_target_: minerva.datasets.__testing.TstMaskDataset
paths: Chesapeake7
res: 1.0
# === PLOTTING OPTIONS ========================================================
plots:
History: true # Plot of the training and validation metrics over epochs.
CM: true # Confusion matrix.
Pred: true # Pie chart of the distribution of the predicted classes.
  ROC: true # Receiver Operating Characteristic (ROC) for each class.
micro: true # Include micro averaging in ROC plot.
macro: true # Include macro averaging in ROC plot.
Mask: true # Plot predicted masks against ground truth and imagery.
# === MISCELLANEOUS OPTIONS ===================================================
# ---+ Early Stopping +--------------------------------------------------------
stopping:
patience: 1 # No. of val epochs with increasing loss before stopping.
verbose: true # Verbosity of early stopping prints to stdout.
# ---+ Verbosity and Saving +--------------------------------------------------
verbose: true # Verbosity of Trainer print statements to stdout.
save: true # Saves created figures to file.
show: false # Shows created figures in a pop-up window.
p_dist: true # Shows the distribution of classes to stdout.
plot_last_epoch: true # Plot the results of the last training and val epochs.
# opt to ask at runtime; auto or True to automatically do so; or False,
# None etc to not
save_model: true
# ---+ Other +-----------------------------------------------------------------
# opt to ask at runtime; auto or True to automatically do so; or False,
# None etc to not
run_tensorboard: false
calc_norm: false
Paths
Paths to required directories are defined in the data_root, results_dir and cache_dir keys.
# === PATHS ===================================================================
data_root: tests/fixtures/data
results_dir: tests/tmp/results
cache_dir: tests/tmp/cache
- data_root
Path to the data directory where the input data is stored. Can be relative or absolute.
- Type: str
- results_dir
Path to the directory where the results of the experiment are written. Can be relative or absolute.
- Type: str
- cache_dir
Path to the cache directory storing dataset manifests and a place to output the latest / best version of a model. Can be relative or absolute.
- Type: str
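These paths can point anywhere on the filesystem; a sketch with hypothetical absolute locations:
data_root: /mnt/storage/earth-obs-data
results_dir: /mnt/storage/minerva/results
cache_dir: /mnt/storage/minerva/cache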
Hyperparameters
This section of the config file covers the hyperparameters of the model and experiment. The most important of these are now top-level variables in the config. Most are also accessible from the CLI.
Model Specification
These parameters focus on defining the model, such as its class, version and type.
# Name of model. Substring before hyphen is model class.
model_name: FCN32ResNet18-MkI
# Type of model.
model_type: segmentation
- model_name
Name of the model. Used to create the unique exp_name that is created dynamically for each experiment run.
- Type: str
Sizing
These parameters concern the shapes and sizes of the IO to the model.
batch_size: 8 # Number of samples in each batch.
patch_size: [32, 32] # 2D tuple or float.
input_size: [4, 32, 32] # patch_size plus leading channel dim.
n_classes: 8 # Number of classes in dataset.
- input_size
The patch_size plus the leading channel dimension, in CxHxW format.
- Type: tuple[int, int, int]
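The two sizing keys must agree. A minimal sketch keeping them in sync, reusing the to_patch_size resolver from the example config above to derive patch_size from input_size (here assumed to strip the leading channel dimension):
input_size: [4, 64, 64]                       # CxHxW: 4 bands, 64x64 pixels.
patch_size: '${to_patch_size: ${input_size}}' # Resolves to [64, 64].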
Experiment Execution
These parameters control the execution of the model fitting, such as the number of epochs, the type of job, or class balancing.
max_epochs: 5 # Maximum number of training epochs.
pre_train: false # Activate pre-training mode.
fine_tune: false # Activate fine-tuning mode.
elim: true # Eliminates empty classes from schema.
balance: true # Balances dataset classes.
- pre_train
Defines this as a pre-training experiment. In this case, the backbone of the model will be saved to the cache at the end of training.
- Type: bool
- Value: False
- elim
Will eliminate classes that have no samples in them and reorder the class labels so that they still run from 0 to n-1, where n is the reduced number of classes. For example, if classes 3 and 5 of an 8-class schema are empty, the remaining labels are remapped to 0-5. minerva ensures that labels are converted between the old and new schemes seamlessly.
- Type: bool
- Value: False
Loss and Optimisers
These parameters set the most important aspects of the loss function and optimiser.
loss_func: CrossEntropyLoss # Name of the loss function to use.
lr: 1.0E-2 # Learning rate of optimiser.
optim_func: SGD # Name of the optimiser function.
Model Parameters
These are the parameters passed to the model class to initialise it.
model_params:
_target_: minerva.models.FCN32ResNet18
input_size: ${input_size}
n_classes: ${n_classes}
# any other params...
Two common parameters are:
- input_size
Shape of the input to the model, typically in CxHxW format. Should align with the values given for patch_size.
- Type: tuple[int, int, int]
- n_classes
Number of possible classes to predict in the output. Best to pass n_classes using the interpolation ${n_classes}.
- Type: int
But you can add any other parameters that the model expects to the model_params dict, as sketched below.
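For instance (a sketch; freeze_backbone is a hypothetical keyword argument shown only to illustrate the pattern, so use whatever arguments your model's constructor actually accepts):
model_params:
  _target_: minerva.models.FCN32ResNet18
  input_size: ${input_size}
  n_classes: ${n_classes}
  freeze_backbone: true  # Hypothetical extra constructor argument.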
Optimiser Parameters
Here's where to place any additional parameters for the optimiser, other than the already handled learning rate, lr.
If using a non-torch optimiser, give its full import path in the _target_ field instead of torch.optim.${optim_func}.
optimiser:
_target_: torch.optim.${optim_func}
lr: ${lr}
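For example, to swap in a third-party optimiser (a sketch assuming the timm package is installed; any importable optimiser class should work the same way):
optimiser:
  _target_: timm.optim.Lamb  # Full import path to a non-torch optimiser.
  lr: ${lr}
  weight_decay: 1.0E-4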
Loss Parameters
Here's where to specify any additional parameters for the loss function.
If using a non-torch loss function, give its full import path in the _target_ field.
loss:
_target_: torch.nn.${loss_func}
# any other params...
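Any extra keys are passed as keyword arguments to the loss constructor. For example, with torch.nn.CrossEntropyLoss (both parameters below are real arguments of that class):
loss:
  _target_: torch.nn.CrossEntropyLoss
  ignore_index: 255     # e.g. skip a no-data label.
  label_smoothing: 0.1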
Dataloader Parameters
Finally, this is where to define parameters for the DataLoader. Unlike other parameters, there is no _target_ field, as it is locked to DataLoader.
loader_params:
num_workers: 1
pin_memory: true
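These should be passed through as keyword arguments to torch.utils.data.DataLoader, so its standard options can go here. A sketch with a few common ones:
loader_params:
  num_workers: 4
  pin_memory: true
  persistent_workers: true  # Keep worker processes alive between epochs.
  drop_last: true           # Drop the final incomplete batch.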
Model IO & Logging
These parameters configure how to handle the different types of input/output to the model and how to handle logging.
wandb Logging
Here's where to define how Weights and Biases (wandb) behaves in minerva.
wandb_log: true # Activates wandb logging.
project: pytest # Define the project name for wandb.
wandb_dir: /test/tmp/wandb # Directory to store wandb logs locally.
Minerva Inbuilt Logging Functions
In addition, there are also options for defining the logging, metric calculation and IO functions using inbuilt minerva functionality:
task_logger: minerva.logger.tasklog.SupervisedTaskLogger
step_logger:
_target_: minerva.logger.steplog.SupervisedStepLogger
# any other params...
model_io: minerva.modelio.supervised_torchgeo_io
record_int: true # Store integer results in memory.
record_float: true # Store floating point results too. Beware memory overload!
- logger
Specify the logger to use. Must be the name of a MinervaLogger class within logger.
- Type: str
- metrics
Specify the metric logger to use. Must be the name of a MinervaMetrics class within metrics.
- Type: str
- model_io
Specify the IO function used to handle IO for the model during fitting. Must be the name of a function within modelio.
- Type: str
- record_int
Store the integer results of each epoch in memory, such as the predictions, ground truth etc.
- Type: bool
- record_float
Store the floating point results of each epoch in memory, such as the raw predicted probabilities.
- Type: bool
Warning
Could cause a memory overload issue with large datasets or systems with small RAM capacity.
Collator
The collator is the function that collates the samples from the dataset to make a mini-batch. It can be defined using the collator param at the global level.
collator: torchgeo.datasets.stack_samples
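Any importable function with a collate signature should work here. For non-geospatial datasets, for instance, you could point this at torch's default collator (a sketch):
collator: torch.utils.data.default_collate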
Plots Dictionary
To define which plots to make from the results of testing, use the plots sub-dictionary with these keys:
plots:
History: True
CM: False
Pred: False
ROC: False
micro: False
macro: True
Mask: False
- History
Plot a graph of the model history. By default, this will plot a graph of any metrics with keys containing "train" or "val".
- Type: bool
- CM
Plots a confusion matrix of the predictions against the ground truth.
- Type: bool
- Pred
Plots a pie chart of the relative sizes of the classes within the predictions from the model.
- Type: bool
- ROC
Plots a Receiver Operating Characteristic (ROC) curve for each class, including Area Under Curve (AUC) scores.
- Type: bool
- micro
Only used with ROC=True. The ROC plot will include the micro-average ROC.
Warning
Adding this plot can be very computationally and memory intensive. Avoid use with large datasets!
- Type: bool
- macro
Only used with ROC=True. The ROC plot will include the macro-average ROC.
- Type: bool
- Mask
Plots predicted masks against the ground truth and imagery.
- Type: bool
Miscellaneous Options
And finally, this section holds various other options.
Early Stopping
Here's where to define the behaviour of the early stopping functionality.
stopping:
patience: 2 # No. of val epochs with increasing loss before stopping.
verbose: true # Verbosity of early stopping prints to stdout.
- stopping
Dictionary holding the parameters that define the early stopping functionality. If no dictionary is given, it is assumed that there will be no early stopping.
- Type: dict
- patience
Number of validation epochs with increasing loss from the lowest recorded validation loss before stopping the experiment.
- Type: int
- verbose
Verbosity of the early stopping prints to stdout.
- Type: bool
Verbosity and Saving
These parameters dictate the behaviour of the outputs to stdout and saving results.
verbose: true # Verbosity of Trainer print statements to stdout.
save: true # Saves created figures to file.
show: false # Shows created figures in a pop-up window.
p_dist: true # Shows the distribution of classes to stdout.
plot_last_epoch: true # Plot the results of the last training and val epochs.
# opt to ask at runtime; auto or True to automatically do so; or False,
# None etc to not
save_model: true
- show
Whether to show created plots in a pop-up window or not.
Warning
Do not use with a terminal-less operation, e.g. SLURM.
- Type: bool
- Value: False
- p_dist
Whether to print the distribution of classes within the data to stdout.
- Type: bool
- Value: False
- plot_last_epoch
Whether to plot the results from the final training and validation epochs.
- Type: bool
- Value: False
Other
All other options belong in this section.
# opt to ask at runtime; auto or True to automatically do so; or False,
# None etc to not
run_tensorboard: false
calc_norm: false
- run_tensorboard
Whether to run the Tensorboard logs at the end of testing. Use "opt" to ask the user at runtime; "auto" or True to automatically locate and run the logs on a local browser; and False, None etc. to do neither.
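For example, to launch Tensorboard automatically but be prompted before saving the model:
run_tensorboard: auto  # Locate and launch the logs automatically.
save_model: opt        # Ask at runtime whether to save the model.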