ETIA.CausalLearning.model_validation_protocols.kfold package

Submodules

ETIA.CausalLearning.model_validation_protocols.kfold.kfold module

class KFoldCV

Bases: MVP_ProtocolBase

Class implementing a K-Fold Cross-Validation protocol for running a causal discovery algorithm.

folds

Number of folds to be used in the cross-validation. Default is 10.

Type:

int

folds_to_run

Number of folds on which the protocol is actually run. Default is 1.

Type:

int

train_indexes

A list of indexes for the training samples.

Type:

list of int

test_indexes

A list of indexes for the test samples.

Type:

list of int

data_train

A list of training data samples for each fold.

Type:

list of pd.DataFrame

data_test

A list of test data samples for each fold.

Type:

list of pd.DataFrame

set_params(parameters, verbose=False)

Set the number of folds and the number of folds to run the protocol for.

Parameters:
  • parameters (dict) – A dictionary of parameters, including the number of folds and the number of folds to run.

  • verbose (bool, optional) – If True, enables detailed logging. Default is False.
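The parameter handling described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the dictionary keys "folds" and "folds_to_run" are assumed to mirror the attribute names, the helper name read_kfold_params is hypothetical, and the range checks are an assumption. Only the defaults (10 and 1) come from the attribute descriptions.

```python
def read_kfold_params(parameters, verbose=False):
    """Hypothetical helper: pull the two K-Fold settings out of a dict.

    The key names and the validation rules are assumptions made for
    illustration; only the defaults (10 and 1) are documented.
    """
    folds = int(parameters.get("folds", 10))
    folds_to_run = int(parameters.get("folds_to_run", 1))

    # Assumed sanity checks: at least two folds, and never run more
    # folds than exist.
    if folds < 2:
        raise ValueError("folds must be at least 2")
    if not 1 <= folds_to_run <= folds:
        raise ValueError("folds_to_run must be between 1 and folds")

    if verbose:
        print(f"K-Fold protocol: folds={folds}, folds_to_run={folds_to_run}")
    return folds, folds_to_run


defaults = read_kfold_params({})
custom = read_kfold_params({"folds": 5, "folds_to_run": 3})
```

Keeping folds_to_run smaller than folds is a common way to cut run time when a full cross-validation is too expensive.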

run_cd_algorithm(data, algorithm, parameters, fold)

Run the causal discovery algorithm on the specified fold.

Parameters:
  • data (pd.DataFrame) – The dataset on which to run the causal discovery algorithm.

  • algorithm (object) – The causal discovery algorithm to be used.

  • parameters (dict) – A dictionary of parameters to pass to the algorithm.

  • fold (int) – The current fold number for which to run the algorithm.

Returns:

A list containing the MEC graph and library results produced by the causal discovery algorithm.

Return type:

list of np.ndarray

init_protocol(data)

Initialize the K-Fold protocol by splitting the data into training and test sets for each fold.

Parameters:

data (pd.DataFrame) – The dataset to be used for the cross-validation.
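To make the splitting step concrete, here is a minimal sketch of how per-fold train and test index lists can be built. It uses only the standard library and is not taken from the ETIA source: the function name kfold_indexes, the shuffling, and the seed argument are assumptions. With a pd.DataFrame, the resulting index lists could then be used to select rows positionally for each fold.

```python
import random


def kfold_indexes(n_samples, folds=10, seed=0):
    """Hypothetical helper: return (train_indexes, test_indexes).

    Each returned list has one entry per fold, and each entry is a
    sorted list of row positions.
    """
    if not 2 <= folds <= n_samples:
        raise ValueError("folds must be between 2 and n_samples")

    # Shuffling before splitting is an assumption, not documented behavior.
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)

    # Spread the remainder over the first folds so that fold sizes
    # differ by at most one sample.
    base, extra = divmod(n_samples, folds)

    train_indexes, test_indexes = [], []
    start = 0
    for k in range(folds):
        stop = start + base + (1 if k < extra else 0)
        test_indexes.append(sorted(order[start:stop]))
        train_indexes.append(sorted(order[:start] + order[stop:]))
        start = stop
    return train_indexes, test_indexes


train_indexes, test_indexes = kfold_indexes(n_samples=25, folds=10)
```

Every sample lands in exactly one test fold, and in the training set of every other fold.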

run_protocol(data, algorithm, parameters, n_jobs=1)

Run the K-Fold cross-validation protocol with the specified causal discovery algorithm.

Parameters:
  • data (pd.DataFrame) – The dataset on which to run the algorithm.

  • algorithm (object) – The causal discovery algorithm to use.

  • parameters (dict) – A dictionary of parameters to be passed to the algorithm.

  • n_jobs (int, optional) – The number of CPU cores to use for parallel computation. Default is 1.

Returns:

A list containing the results of the protocol, including the MEC graphs.

Return type:

list of np.ndarray
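The overall flow of the protocol (run the algorithm on the training portion of each selected fold and collect the results in fold order) can be sketched as below. Everything here is illustrative: run_folds and toy_algorithm are hypothetical names, the algorithm is modeled as a plain callable rather than an ETIA algorithm object, rows are a plain list instead of a pd.DataFrame, and a thread pool from the standard library stands in for whatever parallel backend the library uses for n_jobs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_folds(rows, algorithm, parameters, train_indexes,
              folds_to_run=1, n_jobs=1):
    """Hypothetical sketch: run `algorithm` on the first `folds_to_run` folds."""

    def run_one(fold):
        # Select the training rows of this fold and hand them to the algorithm.
        train_rows = [rows[i] for i in train_indexes[fold]]
        return algorithm(train_rows, parameters)

    # pool.map returns results in the order of its inputs, so the output
    # list is ordered by fold number even when folds finish out of order.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(run_one, range(folds_to_run)))


def toy_algorithm(train_rows, parameters):
    # Stand-in for a causal discovery algorithm: it only reports what it
    # received instead of learning a graph.
    return {"n_train": len(train_rows), "alpha": parameters["alpha"]}


rows = list(range(6))
train_indexes = [[2, 3, 4, 5], [0, 1, 4, 5], [0, 1, 2, 3]]
results = run_folds(rows, toy_algorithm, {"alpha": 0.05},
                    train_indexes, folds_to_run=2, n_jobs=2)
```

Only the first two of the three folds are run here, which mirrors the distinction between folds and folds_to_run.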