synicix_ml_pipeline.datajoint_tables package

Submodules

synicix_ml_pipeline.datajoint_tables.BaseTable module

class synicix_ml_pipeline.datajoint_tables.BaseTable.BaseTable[source]

Bases: object

Parent class for all DataJoint table classes; it holds the schema pointer and provides utility functions typically used by tables.

All DataJoint tables that are part of the main framework inherit from this class.

classmethod check_if_tuple_in_table(tuple_dict)[source]

Check whether the table has an entry that satisfies the tuple_dict restriction. Returns True if such an entry exists, otherwise False.

Parameters:
tuple_dict (dict): dictionary of table attribute values to restrict by
Returns:
bool: True if the table has an entry matching tuple_dict, otherwise False
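
A minimal sketch of how such an existence check is typically expressed with the DataJoint API (restricting a table by a dict and counting the matches); the actual implementation inside BaseTable may differ:

def tuple_in_table(table, tuple_dict):
    # (table & tuple_dict) restricts the DataJoint table to rows whose
    # attributes match the dict; len() counts the surviving rows.
    return len(table & tuple_dict) > 0

# e.g. exists = tuple_in_table(ModelConfig(), {'model_config_id': 0})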
static compute_md5_hash(tuple_dict)[source]

Utility helper function to compute the MD5 hash of the given tuple_dict.

Parameters:
tuple_dict (dict): dictionary to hash
Returns:
str: hexadecimal MD5 hash string
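
For illustration only, one common way to hash a dictionary is to serialize it deterministically and digest the bytes with hashlib; the exact serialization used by compute_md5_hash is an implementation detail and may differ:

import hashlib
import json

def md5_of_dict(tuple_dict):
    # Sort the keys so equal dicts always produce the same serialization,
    # then take the MD5 digest of the UTF-8 encoded JSON string.
    serialized = json.dumps(tuple_dict, sort_keys=True, default=str)
    return hashlib.md5(serialized.encode('utf-8')).hexdigest()

# md5_of_dict({'model_class_name': 'MLP'}) -> 32-character hexadecimal digest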
static import_class_from_module(module_name, class_name)[source]

Helper function to handle import of classes from modules

Parameters:
module_name (str): name of the module containing the target class to import
class_name (str): name of the class to import from the module
Returns:
user_defined_class: the class imported from the module
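
This helper wraps the standard importlib pattern sketched below; the real method may add extra error handling:

import importlib

def import_class(module_name, class_name):
    module = importlib.import_module(module_name)  # import the module by its dotted name
    return getattr(module, class_name)             # look up the class on that module

# e.g. import_class('torch.optim', 'Adam') returns the torch.optim.Adam class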
classmethod insert_tuples(tuple_dicts)[source]

Insert tuples by checking the highest existing "primary_key_id" in the table; each new tuple is inserted with "primary_key_id" incremented by 1.

Parameters:
tuple_dicts: a dictionary or a list of dictionaries containing the values of the class attributes
Returns:
None
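
A hedged sketch of the auto-increment behaviour described above, assuming the table has a single integer id attribute as its primary key; the real insert_tuples may assign ids and handle duplicates differently:

def insert_with_autoincrement(table, tuple_dicts):
    # Accept a single dict or a list of dicts, as described above.
    if isinstance(tuple_dicts, dict):
        tuple_dicts = [tuple_dicts]
    pk = table.primary_key[0]                # e.g. 'dataset_config_id'
    existing_ids = table.fetch(pk)
    next_id = int(existing_ids.max()) + 1 if len(existing_ids) else 0
    for tuple_dict in tuple_dicts:
        table.insert1({**tuple_dict, pk: next_id})
        next_id += 1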

synicix_ml_pipeline.datajoint_tables.DatasetConfig module

class synicix_ml_pipeline.datajoint_tables.DatasetConfig.DatasetConfig(dataset_dir=None, dataset_cache_dir=None)[source]

Bases: datajoint.user_tables.Manual, synicix_ml_pipeline.datajoint_tables.BaseTable.BaseTable

A dj.Manual table class that handles the storage of dataset configurations, including which dataset class to use and the parameters needed to build the dataset and its dataloaders.

Initializing this class requires the corresponding dataset_dir and dataset_cache_dir to be defined and passed into __init__().

Typical usage of this class is through the insert_tuples method. An example of this can be found in the Pipeline Configuration Jupyter Notebook.

Table definition:

dataset_config_id : int unsigned
---
dataset_file_name : varchar(256)
dataset_type : varchar(256)
dataset_class_module_name : varchar(256)
dataset_class_name : varchar(256)
dataset_class_params : longblob
train_sampler_module_name : varchar(256)
train_sampler_class_name : varchar(256)
train_sampler_class_params : longblob
validation_sampler_module_name : varchar(256)
validation_sampler_class_name : varchar(256)
validation_sampler_class_params : longblob
test_sampler_module_name : longblob
test_sampler_class_name : varchar(256)
test_sampler_class_params : longblob
input_shape : longblob
output_shape : longblob
additional_model_params : longblob
dataset_config_blobs_md5_hash : char(128)
get_dataloaders(key, batch_size, num_workers=0)[source]

Method to build and return the dataloaders based on the given primary key.

Parameters:
key (dict): DatasetConfig primary key
batch_size (int): batch size for the train, validation, and test dataloaders
num_workers (int): number of PyTorch dataloader workers to use
Returns:
torch.utils.data.DataLoader, torch.utils.data.DataLoader, torch.utils.data.DataLoader: train, validation, and test dataloaders
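
Example usage (the directory paths and key value are placeholders); the three returned objects can be iterated like any torch.utils.data.DataLoader:

from synicix_ml_pipeline.datajoint_tables.DatasetConfig import DatasetConfig

dataset_config = DatasetConfig(dataset_dir='/data/datasets',
                               dataset_cache_dir='/data/cache')
key = {'dataset_config_id': 0}  # hypothetical primary key value

train_loader, valid_loader, test_loader = dataset_config.get_dataloaders(
    key, batch_size=64, num_workers=4)

for batch in train_loader:
    pass  # standard PyTorch training loop body goes here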
insert_tuples(tuple_dicts)[source]

Function to compute the hash, build the dataloaders, determine input_shape, output_shape, and additional_model_params, and insert the result into the DatasetConfig DJ table.

Parameters:
tuple_dicts: a list of dicts containing the attributes to be inserted into DatasetConfig
Returns:
None
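
Sketch of a typical insert_tuples call; the module, class, and file names are placeholders for project-specific choices, the exact set of required keys and sampler parameter conventions should be checked against the table definition and the Pipeline Configuration notebook, and the derived attributes (the hash, input_shape, output_shape, and additional_model_params) are filled in by the method itself as described above:

from synicix_ml_pipeline.datajoint_tables.DatasetConfig import DatasetConfig

dataset_config = DatasetConfig(dataset_dir='/data/datasets',
                               dataset_cache_dir='/data/cache')

dataset_config.insert_tuples([{
    'dataset_file_name': 'my_dataset.npz',               # placeholder file name
    'dataset_type': 'regression',                        # placeholder dataset type
    'dataset_class_module_name': 'my_project.datasets',  # placeholder module
    'dataset_class_name': 'MyDataset',                   # placeholder class
    'dataset_class_params': {},
    'train_sampler_module_name': 'torch.utils.data',
    'train_sampler_class_name': 'RandomSampler',
    'train_sampler_class_params': {},
    'validation_sampler_module_name': 'torch.utils.data',
    'validation_sampler_class_name': 'SequentialSampler',
    'validation_sampler_class_params': {},
    'test_sampler_module_name': 'torch.utils.data',
    'test_sampler_class_name': 'SequentialSampler',
    'test_sampler_class_params': {},
}])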

synicix_ml_pipeline.datajoint_tables.ModelConfig module

class synicix_ml_pipeline.datajoint_tables.ModelConfig.ModelConfig(arg=None)[source]

Bases: datajoint.user_tables.Manual, synicix_ml_pipeline.datajoint_tables.BaseTable.BaseTable

A dj.Manual table class that handles the storage of PyTorch model definitions, along with helper functions for loading the models.

Typical usage of this class is through the insert_tuples method. An example of this can be found in the Pipeline Configuration Jupyter Notebook.

Table definition:

model_config_id : int unsigned           # MD5 Hash of network_class_name + network_module_code
---
model_class_module_name : varchar(256)
model_class_name : varchar(256)          # Class name of the network
model_class_params : longblob
model_config_blobs_md5_hash : char(128)
classmethod get_model_class_and_params(key)[source]

Function to get the model class and model_class_params given a key.

Parameters:
key (dict): a dictionary that restricts ModelConfig down to a single tuple
Returns:
<user_defined_model_class>, <user_defined_model_class_params>: the model_class and model_class_params corresponding to the key
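
Illustrative usage (the key value is hypothetical); assuming model_class_params is a dict of constructor keyword arguments, the model can be instantiated directly:

from synicix_ml_pipeline.datajoint_tables.ModelConfig import ModelConfig

key = {'model_config_id': 0}  # hypothetical primary key value
model_class, model_class_params = ModelConfig.get_model_class_and_params(key)

model = model_class(**model_class_params)  # instantiate the user-defined model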

synicix_ml_pipeline.datajoint_tables.TrainingConfig module

class synicix_ml_pipeline.datajoint_tables.TrainingConfig.TrainingConfig(arg=None)[source]

Bases: datajoint.user_tables.Manual, synicix_ml_pipeline.datajoint_tables.BaseTable.BaseTable

A dj.Manual table class that handles the storage of training configurations, such as the trainer class to use and various other related parameters.

Typical usage of this class is through the insert_tuples method. An example of this can be found in the Pipeline Configuration Jupyter Notebook.

Table definition:

training_config_id : int unsigned
---
trainer_class_module_name : varchar(256)
trainer_class_name : varchar(256)
trainer_class_params : longblob
batch_size : smallint unsigned
epoch_limit : int unsigned
optimizer_class_module_name : varchar(256)
optimizer_class_name : varchar(256)
optimizer_class_params : longblob
criterion_class_module_name : varchar(256)
criterion_class_name : varchar(256)
criterion_class_params : longblob
training_config_blobs_md5_hash : char(128)
classmethod get_criterion_class_and_params(key)[source]

Function to load the criterion_class and criterion_class_params from the database based on the given key.

Parameters:
key (dict): key that restricts TrainingConfig down to a single tuple
Returns:
<user_defined_criterion_class>, <user_defined_criterion_class_params>: the criterion_class and criterion_class_params for the given key
classmethod get_optimizer_class_and_params(key)[source]

Function to load the optimizer_class and optimizer_class_params from the database based on the given key.

Parameters:
key (dict): key that restricts TrainingConfig down to a single tuple
Returns:
<user_defined_optimizer_class>, <user_defined_optimizer_class_params>: the optimizer_class and optimizer_class_params for the given key
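
Putting the two accessors together, a training script might build the loss function and the optimizer as sketched below (the key value and the stand-in model are hypothetical, and the assumption that the params blobs are keyword-argument dicts mirrors the model example above):

import torch
from synicix_ml_pipeline.datajoint_tables.TrainingConfig import TrainingConfig

key = {'training_config_id': 0}   # hypothetical primary key value
model = torch.nn.Linear(4, 1)     # stand-in for a model built via ModelConfig

criterion_class, criterion_params = TrainingConfig.get_criterion_class_and_params(key)
optimizer_class, optimizer_params = TrainingConfig.get_optimizer_class_and_params(key)

criterion = criterion_class(**criterion_params)                      # e.g. torch.nn.MSELoss
optimizer = optimizer_class(model.parameters(), **optimizer_params)  # e.g. torch.optim.Adam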

synicix_ml_pipeline.datajoint_tables.TrainingResult module

synicix_ml_pipeline.datajoint_tables.TrainingTask module

class synicix_ml_pipeline.datajoint_tables.TrainingTask.TrainingTask(arg=None)[source]

Bases: datajoint.user_tables.Manual, synicix_ml_pipeline.datajoint_tables.BaseTable.BaseTable

A dj.Manual table class that handles the storage of training tasks, which are a subset of all possible combinations of DatasetConfig, ModelConfig, and TrainingConfig.

Typical usage of this class is through the insert_tuples method. An example of this can be found in the Pipeline Configuration Jupyter Notebook.

Table definition:

training_task_id : int unsigned
---
-> DatasetConfig
-> ModelConfig
-> TrainingConfig
trial : smallint unsigned
classmethod insert_tuple(tuple_dicts, trials)[source]

Function to handle inserting training tasks by computing the md5_hash for each entry and inserting it into the database.

Parameters:
tuple_dicts: a dict or a list of dicts containing the columns of the TrainingTask table definition
Returns:
None
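
Illustrative call (all id values are hypothetical); each dict references one existing entry in each of the three upstream config tables, matching the foreign keys in the definition above. The trials argument is assumed here to control how many trial entries are created per combination; its exact semantics are not spelled out above:

from synicix_ml_pipeline.datajoint_tables.TrainingTask import TrainingTask

TrainingTask.insert_tuple(
    tuple_dicts=[{
        'dataset_config_id': 0,   # existing DatasetConfig entry (hypothetical)
        'model_config_id': 0,     # existing ModelConfig entry (hypothetical)
        'training_config_id': 0,  # existing TrainingConfig entry (hypothetical)
    }],
    trials=1,
)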

Module contents