Experiment Configs
==================

The most comprehensive way to configure an experiment in ``minerva`` is to use
``YAML`` files. This guide will walk through all the possible config options
and their structure. It will also explain how to use the config file for an
experiment. A good example to look at is ``example_config.yaml``, reproduced
in full below.
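Throughout the config, values are reused via ``OmegaConf``-style interpolation
(``${...}``). A minimal sketch of the syntax:

.. code-block:: yaml

    n_classes: 8
    model_params:
      n_classes: ${n_classes}  # resolves to 8 when the config is loaded

``minerva`` also appears to register some custom resolvers, such as
``to_patch_size`` and ``cfg_load``, which you will see used in the example
below.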
.. code-block:: yaml
    :caption: The ``example_config.yaml`` file, demonstrating how to construct
        a master config to define an experiment in ``minerva``.

    ---
    # *  *                __  ________   ____________ _    _____          *
    # *    *             /  |/  //  _/ | / / ____/ __ \ |  / /   |    *  *
    # *                 / /|_/ / / //  |/ / __/ / /_/ /| | / / /| |  *
    # *   *            / /  / /_/ // /|  / /___/ _, _/ | |/ / ___ |    *
    # *               /_/  /_/___/_/ |_/_____/_/ |_| |___/_/  |_|   *  *
    #
    # EXAMPLE MASTER CONFIG FILE
    #
    # === PATHS ===================================================================
    data_root: tests/fixtures/data
    results_dir: tests/tmp/results
    cache_dir: tests/tmp/cache

    # === HYPERPARAMETERS =========================================================
    # ---+ Model Specification +---------------------------------------------------
    # Name of model. This is no longer used for the model class (see model_params).
    model_name: FCN32ResNet18-test
    # Type of model. Can be mlp, scene_classifier, segmentation, ssl or siamese.
    model_type: segmentation

    # ---+ Sizing +------------------------------------------------------------------
    batch_size: 8  # Number of samples in each batch.
    input_size: [4, 32, 32]  # patch_size plus leading channel dim.
    patch_size: '${to_patch_size: ${input_size}}'  # 2D tuple or float.
    n_classes: 8  # Number of classes in dataset.

    # ---+ Experiment Execution +------------------------------------------------------
    max_epochs: 4  # Maximum number of training epochs.
    pre_train: false  # Activate pre-training mode.
    fine_tune: false  # Activate fine-tuning mode.
    elim: true  # Eliminates empty classes from schema.
    balance: true  # Balances dataset classes.
    torch_compile: true  # Wrap model in `torch.compile`.

    # ---+ Optimisers +----------------------------------------------------------------
    lr: 1.0E-2  # Learning rate of optimiser.
    optim_func: SGD  # Name of the optimiser function.

    # ---+ Model Parameters +----------------------------------------------------------
    model_params:
      _target_: minerva.models.FCN32ResNet18
      input_size: ${input_size}
      n_classes: ${n_classes}
      # any other params...

    # ---+ Optimiser Parameters +------------------------------------------------------
    optimiser:
      _target_: torch.optim.${optim_func}
      lr: ${lr}

    # ---+ Scheduler Parameters +------------------------------------------------------
    scheduler:
      _target_: torch.optim.lr_scheduler.LinearLR
      start_factor: 1.0
      end_factor: 0.5
      total_iters: 5

    # ---+ Loss Function Parameters +--------------------------------------------------
    loss_params:
      _target_: torch.nn.CrossEntropyLoss

    # ---+ Dataloader Parameters +-----------------------------------------------------
    loader_params:
      num_workers: 0
      pin_memory: true

    # === MODEL IO & LOGGING ======================================================
    # ---+ wandb Logging +-------------------------------------------------------------
    wandb_log: true  # Activates wandb logging.
    project: pytest  # Define the project name for wandb.
    wandb_dir: /test/tmp/wandb  # Directory to store wandb logs locally.

    # ---+ Collator +--------------------------------------------------------------
    collator: torchgeo.datasets.stack_samples

    # === TASKS ===================================================================
    tasks:
      fit-train:
        _target_: minerva.tasks.StandardEpoch
        train: true
        record_float: true
        imagery_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/NAIP.yaml}}'  # yamllint disable-line rule:line-length
        data_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/Chesapeake7.yaml}}'  # yamllint disable-line rule:line-length

        # ---+ Dataset Parameters +----------------------------------------
        dataset_params:
          sampler:
            _target_: torchgeo.samplers.RandomGeoSampler
            roi: false
            size: ${patch_size}
            length: 32

          image:
            transforms: false
            subdatasets:
              images_1:
                _target_: minerva.datasets.__testing.TstImgDataset
                paths: NAIP
                res: 1.0
              image2:
                _target_: minerva.datasets.__testing.TstImgDataset
                paths: NAIP
                res: 1.0

          mask:
            transforms: false
            _target_: minerva.datasets.__testing.TstMaskDataset
            paths: Chesapeake7
            res: 1.0

      fit-val:
        _target_: minerva.tasks.StandardEpoch
        train: false
        record_float: true
        imagery_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/NAIP.yaml}}'  # yamllint disable-line rule:line-length
        data_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/Chesapeake7.yaml}}'  # yamllint disable-line rule:line-length

        # ---+ Minerva Inbuilt Logging Functions +-------------------------
        task_logger: minerva.logger.tasklog.SupervisedTaskLogger
        model_io: minerva.modelio.supervised_torchgeo_io

        # ---+ Dataset Parameters +----------------------------------------
        dataset_params:
          sampler:
            _target_: torchgeo.samplers.RandomGeoSampler
            roi: false
            size: ${patch_size}
            length: 32

          image:
            transforms: false
            _target_: minerva.datasets.__testing.TstImgDataset
            paths: NAIP
            res: 1.0

          mask:
            transforms: false
            _target_: minerva.datasets.__testing.TstMaskDataset
            paths: Chesapeake7
            res: 1.0

      test-test:
        _target_: minerva.tasks.StandardEpoch
        record_float: true
        imagery_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/NAIP.yaml}}'  # yamllint disable-line rule:line-length
        data_config: '${oc.create:${cfg_load: minerva/inbuilt_cfgs/dataset/Chesapeake7.yaml}}'  # yamllint disable-line rule:line-length

        # ---+ Minerva Inbuilt Logging Functions +-------------------------
        task_logger: minerva.logger.tasklog.SupervisedTaskLogger
        model_io: minerva.modelio.supervised_torchgeo_io

        # ---+ Dataset Parameters +----------------------------------------
        dataset_params:
          sampler:
            _target_: torchgeo.samplers.RandomGeoSampler
            roi: false
            size: ${patch_size}
            length: 32

          image:
            transforms: false
            _target_: minerva.datasets.__testing.TstImgDataset
            paths: NAIP
            res: 1.0

          mask:
            transforms: false
            _target_: minerva.datasets.__testing.TstMaskDataset
            paths: Chesapeake7
            res: 1.0

    # === PLOTTING OPTIONS ========================================================
    plots:
      History: true  # Plot of the training and validation metrics over epochs.
      CM: true  # Confusion matrix.
      Pred: true  # Pie chart of the distribution of the predicted classes.
      ROC: true  # Receiver Operating Characteristic curve for each class.
      micro: true  # Include micro averaging in ROC plot.
      macro: true  # Include macro averaging in ROC plot.
      Mask: true  # Plot predicted masks against ground truth and imagery.

    # === MISCELLANEOUS OPTIONS ===================================================
    # ---+ Early Stopping +------------------------------------------------------------
    stopping:
      patience: 1  # No. of val epochs with increasing loss before stopping.
      verbose: true  # Verbosity of early stopping prints to stdout.

    # ---+ Verbosity and Saving +--------------------------------------------------
    verbose: true  # Verbosity of Trainer print statements to stdout.
    save: true  # Saves created figures to file.
    show: false  # Shows created figures in a pop-up window.
    p_dist: true  # Shows the distribution of classes to stdout.
    plot_last_epoch: true  # Plot the results of the last training and val epochs.

    # opt to ask at runtime; auto or True to automatically do so; or False,
    # None etc to not
    save_model: true

    # ---+ Other +-----------------------------------------------------------------
    # opt to ask at runtime; auto or True to automatically do so; or False,
    # None etc to not
    run_tensorboard: false

    calc_norm: false
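A note on the ``_target_`` keys seen above: any mapping with a ``_target_``
key is instantiated into the class or function it names, with the sibling keys
passed as keyword arguments (the familiar Hydra-style pattern; exactly how
``minerva`` performs the instantiation is an assumption here). For example,
the ``scheduler`` entry can be read as:

.. code-block:: yaml

    scheduler:
      _target_: torch.optim.lr_scheduler.LinearLR  # class to instantiate
      start_factor: 1.0  # becomes LinearLR(..., start_factor=1.0, ...)
      end_factor: 0.5
      total_iters: 5
      # the optimiser argument itself is presumably injected by minerva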
Paths
-----

Paths to required directories are defined by the ``data_root``,
``results_dir`` and ``cache_dir`` keys.

.. code-block:: yaml
    :caption: Example path definitions for the directories needed in an
        experiment.

    # === PATHS ===================================================================
    data_root: tests/fixtures/data
    results_dir: tests/tmp/results
    cache_dir: tests/tmp/cache

.. py:data:: data_root

    Path to the data directory where the input data is stored.
    Can be relative or absolute.

    :type: str

.. py:data:: results_dir

    Path to the results directory where the results from all experiments will
    be stored. Can be relative or absolute.

    :type: str

.. py:data:: cache_dir

    Path to the cache directory, which stores dataset manifests and the
    latest / best version of a model. Can be relative or absolute.

    :type: str

Hyperparameters
---------------

This section of the config file covers the hyperparameters of the model and
experiment. The most important of these are now top-level variables in the
config. Most are also accessible from the CLI.

Model Specification
^^^^^^^^^^^^^^^^^^^

These parameters focus on defining the model, such as its class, version and
type.

.. code-block:: yaml

    # Name of model. No longer used to select the model class (see model_params).
    model_name: FCN32ResNet18-MkI

    # Type of model.
    model_type: segmentation

.. py:data:: model_name

    Name of the model. Used to create the unique ``exp_name`` that is created
    dynamically for each experiment run.

    :type: str

.. py:data:: model_type

    Type of model. Can contain these keywords separated by hyphens:

    * ``"segmentation"``
    * ``"scene_classifier"``
    * ``"mlp"``
    * ``"ssl"``
    * ``"siamese"``
    * ``"change_detection"``
    * ``"multilabel"``

    :type: str
    :value: "scene_classifier"

Sizing
^^^^^^

These parameters concern the shapes and sizes of the IO to the model.

.. code-block:: yaml

    batch_size: 8  # Number of samples in each batch.
    patch_size: [32, 32]  # 2D tuple or float.
    input_size: [4, 32, 32]  # patch_size plus leading channel dim.
    n_classes: 8  # Number of classes in dataset.

.. py:data:: batch_size

    Number of samples in each batch.

    :type: int

.. py:data:: patch_size

    Defines the shape of the patches in the dataset.

    :type: Tuple[int, int]

.. py:data:: input_size

    The :data:`patch_size` plus the leading channel dimension.

    :type: Tuple[int, int, int]

.. py:data:: n_classes

    Number of possible classes in the dataset.

    :type: int
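As a concrete illustration (the values here are hypothetical), a three-band
RGB dataset with 64x64 pixel patches and 10 classes would be sized as follows,
with ``patch_size`` either given explicitly or derived from ``input_size`` via
the ``to_patch_size`` resolver used in the master example:

.. code-block:: yaml

    batch_size: 16
    input_size: [3, 64, 64]  # channels first: 3 bands, 64x64 pixel patches.
    patch_size: [64, 64]  # or: '${to_patch_size: ${input_size}}'
    n_classes: 10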
Experiment Execution
^^^^^^^^^^^^^^^^^^^^

These parameters control the execution of the model fitting, such as the
number of epochs, the type of job and class balancing.

.. code-block:: yaml

    max_epochs: 5  # Maximum number of training epochs.
    pre_train: false  # Activate pre-training mode.
    fine_tune: false  # Activate fine-tuning mode.
    elim: true  # Eliminates empty classes from schema.
    balance: true  # Balances dataset classes.

.. py:data:: max_epochs

    Maximum number of epochs of training and validation.

    :type: int
    :value: 5

.. py:data:: pre_train

    Defines this as a pre-training experiment. In this case, the backbone of
    the model will be saved to the cache at the end of training.

    :type: bool
    :value: False

.. py:data:: fine_tune

    Defines this as a fine-tuning experiment.

    :type: bool
    :value: False

.. py:data:: elim

    Eliminates classes that have no samples in them and reorders the class
    labels so that they still run from ``0`` to ``n - 1``, where ``n`` is the
    reduced number of classes. ``minerva`` ensures that labels are converted
    between the old and new schemes seamlessly.

    :type: bool
    :value: False

.. py:data:: balance

    Activates class balancing. For ``model_type="scene_classifier"`` or
    ``model_type="mlp"``, over- and under-sampling will be used. For
    ``model_type="segmentation"``, class weighting will be used on the loss
    function.

    :type: bool
    :value: False

Loss and Optimisers
^^^^^^^^^^^^^^^^^^^

These parameters set the most important aspects of the loss function and
optimiser.

.. code-block:: yaml

    loss_func: CrossEntropyLoss  # Name of the loss function to use.
    lr: 1.0E-2  # Learning rate of optimiser.
    optim_func: SGD  # Name of the optimiser function.

.. py:data:: loss_func

    Name of the loss function to use.

    :type: str

.. py:data:: lr

    Learning rate of the optimiser.

    :type: float

.. py:data:: optim_func

    Name of the optimiser function.

    :type: str

Model Parameters
^^^^^^^^^^^^^^^^

These are the parameters passed to the model class to initialise it.

.. code-block:: yaml

    model_params:
      _target_: minerva.models.FCN32ResNet18
      input_size: ${input_size}
      n_classes: ${n_classes}
      # any other params...

Two common parameters are:

.. py:data:: input_size
    :noindex:

    Shape of the input to the model. Typically in CxHxW format. Should align
    with the values given for ``patch_size``.

    :type: list

.. py:data:: n_classes
    :noindex:

    Number of possible classes to predict in the output. Best to pass
    :data:`n_classes` through using ``${n_classes}``.

    :type: int

You can also add any other parameters that the model expects to the
``model_params`` dict.

Optimiser Parameters
^^^^^^^^^^^^^^^^^^^^

Here's where to place any additional parameters for the optimiser, other than
the already handled learning rate (``lr``). The optimiser class is specified
via the ``_target_`` key; if using a non-torch optimiser, simply give its full
import path in ``_target_``.

.. code-block:: yaml

    optimiser:
      _target_: torch.optim.${optim_func}
      lr: ${lr}
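For example, a minimal sketch adding momentum and weight decay to the SGD
optimiser from the master config (both are standard arguments of
:class:`torch.optim.SGD` and are passed straight through):

.. code-block:: yaml

    optimiser:
      _target_: torch.optim.${optim_func}  # resolves to torch.optim.SGD
      lr: ${lr}
      momentum: 0.9
      weight_decay: 1.0E-4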
Loss Parameters
^^^^^^^^^^^^^^^

Here's where to specify any additional parameters for the loss function. As
with the optimiser, the loss class is specified via the ``_target_`` key; if
using a non-torch loss function, give its full import path in ``_target_``.

.. code-block:: yaml

    loss_params:
      _target_: torch.nn.${loss_func}
      # any other params...

Dataloader Parameters
^^^^^^^^^^^^^^^^^^^^^

Finally, this is where to define parameters for the
:class:`~torch.utils.data.DataLoader`. Unlike the other parameter
dictionaries, there is no ``_target_`` field, as the class is locked to
:class:`~torch.utils.data.DataLoader`.

.. code-block:: yaml

    loader_params:
      num_workers: 1
      pin_memory: true

Model IO & Logging
------------------

These parameters configure how to handle the different types of input/output
to the model and how to handle the logging of the model.

wandb Logging
^^^^^^^^^^^^^

Here's where to define how Weights and Biases (``wandb``) behaves in
``minerva``.

.. code-block:: yaml

    wandb_log: true  # Activates wandb logging.
    project: pytest  # Define the project name for wandb.
    wandb_dir: /test/tmp/wandb  # Directory to store wandb logs locally.

Minerva Inbuilt Logging Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In addition, there are also options for defining the task logger, step logger
(metric calculator) and model IO function using inbuilt ``minerva``
functionality:

.. code-block:: yaml

    task_logger: minerva.logger.tasklog.SupervisedTaskLogger

    step_logger:
      _target_: minerva.logger.steplog.SupervisedStepLogger
      # any other params...

    model_io: minerva.modelio.supervised_torchgeo_io

    record_int: true  # Store integer results in memory.
    record_float: true  # Store floating point results too. Beware memory overload!

.. py:data:: task_logger
    :noindex:

    Specify the task logger to use. Must be the import path of a task logger
    class, e.g. :class:`~minerva.logger.tasklog.SupervisedTaskLogger`.

    :type: str

.. py:data:: step_logger
    :noindex:

    Specify the step logger (metric calculator) to use via its ``_target_``
    key, e.g. :class:`~minerva.logger.steplog.SupervisedStepLogger`, along
    with any additional parameters it takes.

    :type: dict

.. py:data:: model_io

    Specify the IO function to use to handle IO for the model during fitting.
    Must be the name of a function within :mod:`minerva.modelio`, e.g.
    :func:`~minerva.modelio.supervised_torchgeo_io`.

    :type: str

.. py:data:: record_int

    Store the integer results of each epoch in memory, such as the
    predictions and the ground truth.

    :type: bool

.. py:data:: record_float

    Store the floating point results of each epoch in memory, such as the raw
    predicted probabilities.

    .. warning::
        Could cause a memory overload issue with large datasets or systems
        with small RAM capacity.

    :type: bool

Collator
^^^^^^^^

The collator is the function that collates samples from the dataset into a
mini-batch. It can be defined with the simple ``collator`` param at the
global level.

.. code-block:: yaml

    collator: torchgeo.datasets.stack_samples

.. py:data:: collator

    Dot-based import path to the desired collator function.

    :type: str

Plots Dictionary
----------------

To define which plots to make from the results of testing, use the ``plots``
sub-dictionary with these keys:

.. code-block:: yaml
    :caption: Example ``plots`` dictionary.

    plots:
      History: True
      CM: False
      Pred: False
      ROC: False
      micro: False
      macro: True
      Mask: False

.. py:data:: History

    Plot a graph of the model history. By default, this will plot a graph of
    any metrics with keys containing ``"train"`` or ``"val"``.

    :type: bool

.. py:data:: CM

    Plots a confusion matrix.

    :type: bool

.. py:data:: Pred

    Plots a pie chart of the relative sizes of the classes within the
    predictions from the model.

    :type: bool

.. py:data:: ROC

    Plots a *Receiver Operating Characteristic* (ROC) curve, including *Area
    Under Curve* (AUC) scores.

    :type: bool

.. py:data:: micro

    Only used with ``ROC=True``. ROC plot includes the micro-average ROC.

    .. warning::
        Adding this plot can be very computationally and memory intensive.
        Avoid use with large datasets!

    :type: bool

.. py:data:: macro

    Only used with ``ROC=True``. ROC plot includes the macro-average ROC.

    :type: bool

.. py:data:: Mask

    Plots a comparison of predicted segmentation masks, the ground truth and
    the original RGB imagery from a random selection of samples put to the
    model.

    :type: bool
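Note that probability-based plots such as the ROC curve are computed from the
raw predicted probabilities, so the task being evaluated presumably needs
``record_float: true`` set, as in the ``test-test`` task of the master
example. A minimal sketch of the pairing:

.. code-block:: yaml

    tasks:
      test-test:
        record_float: true  # keep the raw probabilities needed for ROC/AUC

    plots:
      ROC: true
      macro: true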
Miscellaneous Options
---------------------

And finally, this section holds various other options.

Early Stopping
^^^^^^^^^^^^^^

Here's where to define the behaviour of the early stopping functionality.

.. code-block:: yaml

    stopping:
      patience: 2  # No. of val epochs with increasing loss before stopping.
      verbose: true  # Verbosity of early stopping prints to stdout.

.. py:data:: stopping

    Dictionary to hold the parameters defining the early stopping
    functionality. If no dictionary is given, it is assumed that there will
    be no early stopping.

    :type: dict

.. py:data:: patience

    Number of validation epochs with increasing loss from the lowest recorded
    validation loss before stopping the experiment.

    :type: int

.. py:data:: verbose
    :noindex:

    Verbosity of the early stopping prints to stdout.

    :type: bool

Verbosity and Saving
^^^^^^^^^^^^^^^^^^^^

These parameters dictate the behaviour of the outputs to stdout and the
saving of results.

.. code-block:: yaml

    verbose: true  # Verbosity of Trainer print statements to stdout.
    save: true  # Saves created figures to file.
    show: false  # Shows created figures in a pop-up window.
    p_dist: true  # Shows the distribution of classes to stdout.
    plot_last_epoch: true  # Plot the results of the last training and val epochs.

    # opt to ask at runtime; auto or True to automatically do so; or False,
    # None etc to not
    save_model: true

.. py:data:: verbose

    Verbosity of :class:`~trainer.Trainer` prints to stdout.

    :type: bool

.. py:data:: save

    Whether to save created figures to file or not.

    :type: bool
    :value: True

.. py:data:: show

    Whether to show created figures in a pop-up window or not.

    .. warning::
        Do not use in a terminal-less operation, e.g. under SLURM.

    :type: bool
    :value: False

.. py:data:: p_dist

    Whether to print the distribution of classes within the data to
    ``stdout``.

    :type: bool
    :value: False

.. py:data:: plot_last_epoch

    Whether to plot the results from the final validation epoch.

    :type: bool
    :value: False

.. py:data:: save_model

    Whether to save the model at the end of testing. Must be ``True``,
    ``False``, ``"auto"`` or ``"opt"``. ``"auto"`` or ``True`` will
    automatically save the model to file. ``"opt"`` will ask the user whether
    to at runtime. ``False`` (or ``None``) will neither save the model nor
    ask.

    :type: str | bool
    :value: False

Other
^^^^^

All other options belong in this section.

.. code-block:: yaml

    # opt to ask at runtime; auto or True to automatically do so; or False,
    # None etc to not
    run_tensorboard: false

    calc_norm: false

.. py:data:: run_tensorboard

    Whether to run the TensorBoard logs at the end of testing. Must be
    ``True``, ``False``, ``"auto"`` or ``"opt"``. ``"auto"`` or ``True`` will
    automatically locate and run the logs in a local browser. ``"opt"`` will
    ask the user whether to at runtime. ``False`` (or ``None``) will neither
    run the logs nor ask.

    :type: str | bool
    :value: False

.. py:data:: calc_norm

    *Deprecated*: Calculates the gradient norms.

    :type: bool
    :value: False
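As a final illustration of the ``opt`` / ``auto`` convention described in the
comments above, a config that always saves the model but asks at runtime
before launching TensorBoard might read:

.. code-block:: yaml

    save_model: auto  # or true: save the model without asking
    run_tensorboard: opt  # ask at runtime whether to launch TensorBoard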