synicix_ml_pipeline.datajoint_tables package

Submodules

synicix_ml_pipeline.datajoint_tables.BaseTable module
class synicix_ml_pipeline.datajoint_tables.BaseTable.BaseTable

Bases: object

Parent table for all DataJoint table classes. It holds the schema pointer and provides utility functions typically used in tables.

All DataJoint tables that are part of the main framework inherit from this table.
classmethod check_if_tuple_in_table(tuple_dict)

Check whether the table has an entry that satisfies the tuple_dict restriction. Returns True if such an entry exists, otherwise False.

- Parameters:
  - tuple_dict (dict): dictionary containing the values of the table attributes to match; the table checked is the DataJoint table class on which this classmethod is called
- Returns:
  - bool: whether the table has an entry that matches tuple_dict
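A minimal usage sketch (ModelConfig is used purely as an example of a BaseTable subclass, and the attribute values are hypothetical):

    from synicix_ml_pipeline.datajoint_tables.ModelConfig import ModelConfig

    # Hypothetical restriction: the attribute names must exist in the target table.
    tuple_dict = dict(model_class_module_name='my_project.models',
                      model_class_name='SimpleMLP')

    if not ModelConfig.check_if_tuple_in_table(tuple_dict):
        print('No matching entry yet; safe to insert.')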
static compute_md5_hash(tuple_dict)

Utility helper function to compute the MD5 hash of the given tuple_dict.

- Parameters:
  - tuple_dict (dict): dictionary to hash
- Returns:
  - str: MD5 hash string computed from tuple_dict
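For illustration, a plausible implementation of such a helper; the exact serialization the package uses is not documented here, so the sorted-key JSON dump below is only an assumption:

    import hashlib
    import json

    def compute_md5_hash(tuple_dict):
        # Assumption: serialize the dict deterministically (sorted keys)
        # before hashing; the real package may serialize differently.
        serialized = json.dumps(tuple_dict, sort_keys=True, default=str)
        return hashlib.md5(serialized.encode('utf-8')).hexdigest()

    print(compute_md5_hash({'model_class_name': 'SimpleMLP', 'lr': 0.001}))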
static import_class_from_module(module_name, class_name)

Helper function to import a class from a module by name.

- Parameters:
  - module_name (str): name of the module containing the target class to import
  - class_name (str): name of the class to import from the module
- Returns:
  - user_defined_class: the class imported from the module
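Such helpers are typically thin wrappers around importlib; a sketch of the pattern (not necessarily the package's exact implementation):

    import importlib

    def import_class_from_module(module_name, class_name):
        # Import the module by its dotted path, then look the class up by name.
        module = importlib.import_module(module_name)
        return getattr(module, class_name)

    # Hypothetical usage: load a class from its stored module and class names.
    loader_class = import_class_from_module('torch.utils.data', 'DataLoader')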
classmethod insert_tuples(tuple_dicts)

Insert tuples, assigning the primary key automatically: the method finds the highest existing "primary_key_id" in the table and increments it by 1 for each inserted tuple.

- Parameters:
  - tuple_dicts: either a dictionary or a list of dictionaries containing the values of the table attributes
- Returns:
  - None
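A hedged usage sketch, again with ModelConfig standing in for any subclass; the attribute values are placeholders, and the primary key id is assigned by the method itself:

    from synicix_ml_pipeline.datajoint_tables.ModelConfig import ModelConfig

    # A single dict or a list of dicts can be passed.
    ModelConfig.insert_tuples([
        dict(model_class_module_name='my_project.models',
             model_class_name='SimpleMLP',
             model_class_params=dict(hidden_size=128)),
    ])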
synicix_ml_pipeline.datajoint_tables.DatasetConfig module
class synicix_ml_pipeline.datajoint_tables.DatasetConfig.DatasetConfig(dataset_dir=None, dataset_cache_dir=None)

Bases: datajoint.user_tables.Manual, synicix_ml_pipeline.datajoint_tables.BaseTable.BaseTable
A dj.Manual table class that handles the storage of dataset configurations, recording which dataset class and parameters to use when loading the dataset and its dataloaders.

Initializing this class requires the corresponding dataset_dir and dataset_cache_dir to be defined and passed into __init__().

Typical usage of this class is through the insert_tuples method. An example can be found in the Pipeline Configuration Jupyter Notebook.

Table definition:
    dataset_config_id : int unsigned
    ---
    dataset_file_name : varchar(256)
    dataset_type : varchar(256)
    dataset_class_module_name : varchar(256)
    dataset_class_name : varchar(256)
    dataset_class_params : longblob
    train_sampler_module_name : varchar(256)
    train_sampler_class_name : varchar(256)
    train_sampler_class_params : longblob
    validation_sampler_module_name : varchar(256)
    validation_sampler_class_name : varchar(256)
    validation_sampler_class_params : longblob
    test_sampler_module_name : longblob
    test_sampler_class_name : varchar(256)
    test_sampler_class_params : longblob
    input_shape : longblob
    output_shape : longblob
    additional_model_params : longblob
    dataset_config_blobs_md5_hash : char(128)
get_dataloaders(key, batch_size, num_workers=0)

Build the dataloaders based on the given primary key and return them.

- Parameters:
  - key (dict): DatasetConfig DataJoint table primary key
  - batch_size (int): batch size of the train, validation, and test dataloaders
  - num_workers (int): number of PyTorch dataloader workers to use for the dataloaders
- Returns:
  - torch.utils.data.DataLoader, torch.utils.data.DataLoader, torch.utils.data.DataLoader: train, validation, and test dataloaders
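A usage sketch, assuming a DatasetConfig entry already exists and that the directory paths and key value below (all hypothetical) point at valid locations:

    from synicix_ml_pipeline.datajoint_tables.DatasetConfig import DatasetConfig

    dataset_config = DatasetConfig(dataset_dir='/data/datasets',
                                   dataset_cache_dir='/data/cache')

    key = dict(dataset_config_id=0)  # hypothetical primary key
    train_loader, validation_loader, test_loader = dataset_config.get_dataloaders(
        key, batch_size=64, num_workers=4)

    for inputs, targets in train_loader:
        break  # the training loop would consume the batches here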
insert_tuples(tuple_dicts)

Compute the hash, build the dataloaders, determine input_shape, output_shape, and additional_model_params, and insert the resulting tuples into the DatasetConfig table.

- Parameters:
  - tuple_dicts: a list of dicts containing the attributes to be inserted into DatasetConfig
- Returns:
  - None
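A hedged configuration sketch; the dataset and sampler module names, class names, and parameters below are placeholders for user-defined components, and the derived attributes (shapes, hash, id) are filled in by the method:

    from synicix_ml_pipeline.datajoint_tables.DatasetConfig import DatasetConfig

    dataset_config = DatasetConfig(dataset_dir='/data/datasets',
                                   dataset_cache_dir='/data/cache')

    dataset_config.insert_tuples([
        dict(dataset_file_name='my_dataset.npz',
             dataset_type='numpy',
             dataset_class_module_name='my_project.datasets',
             dataset_class_name='MyDataset',
             dataset_class_params=dict(normalize=True),
             train_sampler_module_name='torch.utils.data',
             train_sampler_class_name='RandomSampler',
             train_sampler_class_params=dict(),
             validation_sampler_module_name='torch.utils.data',
             validation_sampler_class_name='SequentialSampler',
             validation_sampler_class_params=dict(),
             test_sampler_module_name='torch.utils.data',
             test_sampler_class_name='SequentialSampler',
             test_sampler_class_params=dict()),
    ])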
synicix_ml_pipeline.datajoint_tables.ModelConfig module
class synicix_ml_pipeline.datajoint_tables.ModelConfig.ModelConfig(arg=None)

Bases: datajoint.user_tables.Manual, synicix_ml_pipeline.datajoint_tables.BaseTable.BaseTable
A dj.Manual table class that handles the storage of PyTorch model definitions, along with helper functions for loading the models.

Typical usage of this class is through the insert_tuples method. An example can be found in the Pipeline Configuration Jupyter Notebook.

Table definition:
    model_config_id : int unsigned  # MD5 hash of network_class_name + network_module_code
    ---
    model_class_module_name : varchar(256)
    model_class_name : varchar(256)  # class name of the network
    model_class_params : longblob
    model_config_blobs_md5_hash : char(128)
classmethod get_model_class_and_params(key)

Get the model class and model class params for the given key.

- Parameters:
  - key (dict): a dictionary that restricts ModelConfig down to one tuple
- Returns:
  - <user_defined_model_class>, <user_defined_model_class_params>: the model_class and model_class_params for the given key
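A sketch of instantiating a model from a stored configuration (the key value is hypothetical, and it is assumed that model_class_params is a keyword-argument dict):

    from synicix_ml_pipeline.datajoint_tables.ModelConfig import ModelConfig

    key = dict(model_config_id=0)  # hypothetical primary key
    model_class, model_class_params = ModelConfig.get_model_class_and_params(key)

    # Build the model from the stored class and its stored parameters.
    model = model_class(**model_class_params)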
synicix_ml_pipeline.datajoint_tables.TrainingConfig module
class synicix_ml_pipeline.datajoint_tables.TrainingConfig.TrainingConfig(arg=None)

Bases: datajoint.user_tables.Manual, synicix_ml_pipeline.datajoint_tables.BaseTable.BaseTable
A dj.Manual table class that handles the storage of training configurations, such as the trainer class to use and various other parameters related to it.

Typical usage of this class is through the insert_tuples method. An example can be found in the Pipeline Configuration Jupyter Notebook.

Table definition:
    training_config_id : int unsigned
    ---
    trainer_class_module_name : varchar(256)
    trainer_class_name : varchar(256)
    trainer_class_params : longblob
    batch_size : smallint unsigned
    epoch_limit : int unsigned
    optimizer_class_module_name : varchar(256)
    optimizer_class_name : varchar(256)
    optimizer_class_params : longblob
    criterion_class_module_name : varchar(256)
    criterion_class_name : varchar(256)
    criterion_class_params : longblob
    training_config_blobs_md5_hash : char(128)
classmethod get_criterion_class_and_params(key)

Load the criterion_class and criterion_class_params from the database based on the given key.

- Parameters:
  - key (dict): key that restricts TrainingConfig down to one tuple
- Returns:
  - <user_defined_criterion_class>, <user_defined_criterion_class_params>: the criterion_class and criterion_class_params for the given key
classmethod get_optimizer_class_and_params(key)

Load the optimizer_class and optimizer_class_params from the database based on the given key.

- Parameters:
  - key (dict): key that restricts TrainingConfig down to one tuple
- Returns:
  - <user_defined_optimizer_class>, <user_defined_optimizer_class_params>: the optimizer_class and optimizer_class_params for the given key
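The two lookups are typically used together when assembling a training run. A hedged sketch, assuming the key value is hypothetical, that the stored *_class_params are keyword dicts, and that `model` was built earlier (for example via the ModelConfig example above):

    from synicix_ml_pipeline.datajoint_tables.TrainingConfig import TrainingConfig

    key = dict(training_config_id=0)  # hypothetical primary key

    criterion_class, criterion_params = TrainingConfig.get_criterion_class_and_params(key)
    optimizer_class, optimizer_params = TrainingConfig.get_optimizer_class_and_params(key)

    criterion = criterion_class(**criterion_params)
    # Assumption: the optimizer takes the model's parameters first,
    # as torch.optim optimizers do; `model` comes from the ModelConfig example.
    optimizer = optimizer_class(model.parameters(), **optimizer_params)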
synicix_ml_pipeline.datajoint_tables.TrainingResult module

synicix_ml_pipeline.datajoint_tables.TrainingTask module
class synicix_ml_pipeline.datajoint_tables.TrainingTask.TrainingTask(arg=None)

Bases: datajoint.user_tables.Manual, synicix_ml_pipeline.datajoint_tables.BaseTable.BaseTable
A dj.Manual table class that handles the storage of training tasks, which form a subset of all possible combinations of DatasetConfig, ModelConfig, and TrainingConfig entries.

Typical usage of this class is through the insert_tuples method. An example can be found in the Pipeline Configuration Jupyter Notebook.

Table definition:
    training_task_id : int unsigned
    ---
    -> DatasetConfig
    -> ModelConfig
    -> TrainingConfig
    trial : smallint unsigned
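A hedged sketch of registering tasks; it assumes the insert_tuples classmethod inherited from BaseTable is used, and the foreign-key id values below are hypothetical entries that already exist in the three config tables:

    from synicix_ml_pipeline.datajoint_tables.TrainingTask import TrainingTask

    # Each task references one entry in each of the three config tables;
    # `trial` distinguishes repeated runs of the same configuration.
    TrainingTask.insert_tuples([
        dict(dataset_config_id=0, model_config_id=0, training_config_id=0, trial=0),
        dict(dataset_config_id=0, model_config_id=0, training_config_id=0, trial=1),
    ])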