ETIA.CausalLearning.model_validation_protocols.kfold package
Submodules
ETIA.CausalLearning.model_validation_protocols.kfold.kfold module
- class KFoldCV
Bases: MVP_ProtocolBase
Class implementing a K-Fold Cross-Validation protocol for running a causal discovery algorithm.
- folds
Number of folds to be used in the cross-validation. Default is 10.
- Type:
int
- folds_to_run
Number of folds to run the cross-validation for. Default is 1.
- Type:
int
- train_indexes
A list of indexes for the training samples.
- Type:
list of int
- test_indexes
A list of indexes for the test samples.
- Type:
list of int
- data_train
A list of training data samples for each fold.
- Type:
list of pd.DataFrame
- data_test
A list of test data samples for each fold.
- Type:
list of pd.DataFrame
- set_params(parameters, verbose=False)
Set the total number of folds and the number of folds on which to run the protocol.
- Parameters:
parameters (dict) – A dictionary of parameters, including the number of folds and the number of folds to run.
verbose (bool, optional) – If True, enables detailed logging. Default is False.
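As a rough illustration of the kind of dictionary set_params accepts, the sketch below reads the two settings and falls back to the documented defaults (10 and 1). The key names "folds" and "folds_to_run" are assumptions taken from the attribute names above, and read_fold_params is a hypothetical helper, not part of ETIA.

```python
def read_fold_params(parameters, default_folds=10, default_folds_to_run=1):
    """Hypothetical helper: extract fold settings from a parameters dict."""
    # Key names are assumed to mirror the attribute names; ETIA may differ.
    folds = int(parameters.get("folds", default_folds))
    folds_to_run = int(parameters.get("folds_to_run", default_folds_to_run))
    if folds < 2:
        raise ValueError("K-fold cross-validation needs at least 2 folds")
    if not 1 <= folds_to_run <= folds:
        raise ValueError("folds_to_run must lie between 1 and folds")
    return folds, folds_to_run


folds, folds_to_run = read_fold_params({"folds": 5, "folds_to_run": 3})
```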
- run_cd_algorithm(data, algorithm, parameters, fold)
Run the causal discovery algorithm on the specified fold.
- Parameters:
data (pd.DataFrame) – The dataset on which to run the causal discovery algorithm.
algorithm (object) – The causal discovery algorithm to be used.
parameters (dict) – A dictionary of parameters to pass to the algorithm.
fold (int) – The current fold number for which to run the algorithm.
- Returns:
A list containing the MEC (Markov equivalence class) graph and the library results produced by the causal discovery algorithm.
- Return type:
list of np.ndarray
- init_protocol(data)
Initialize the K-Fold protocol by splitting the data into training and test sets for each fold.
- Parameters:
data (pd.DataFrame) – The dataset to be used for the cross-validation.
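The per-fold split can be sketched as follows. This is a minimal illustration of K-fold index generation with NumPy, not ETIA's implementation: the function name kfold_indexes, the shuffling step, and the seed argument are assumptions. With a pd.DataFrame, the rows for each fold would then be selected with data.iloc[idx].

```python
import numpy as np


def kfold_indexes(n_samples, folds=10, seed=0):
    """Return (train_indexes, test_indexes), each a list with one array per fold."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)  # shuffling is an assumption
    # array_split tolerates n_samples that is not divisible by folds
    test_indexes = np.array_split(shuffled, folds)
    # Training rows for a fold are all rows that are not in its test part
    train_indexes = [np.setdiff1d(shuffled, test) for test in test_indexes]
    return train_indexes, test_indexes


train_indexes, test_indexes = kfold_indexes(n_samples=25, folds=10)
```

Every sample lands in exactly one test part, so each row is held out once across the full set of folds.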
- run_protocol(data, algorithm, parameters, n_jobs=1)
Run the K-Fold cross-validation protocol with the specified causal discovery algorithm.
- Parameters:
data (pd.DataFrame) – The dataset on which to run the algorithm.
algorithm (object) – The causal discovery algorithm to use.
parameters (dict) – A dictionary of parameters to be passed to the algorithm.
n_jobs (int, optional) – The number of CPU cores to use for parallel computation. Default is 1.
- Returns:
A list containing the results of the protocol, with the MEC graphs and other results.
- Return type:
list of np.ndarray
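The interaction between folds_to_run and n_jobs can be pictured with the sketch below. It is a simplified stand-in, not the library's code: run_fold and run_folds are hypothetical names, the choice to run the first folds_to_run folds is an assumption, and a standard-library thread pool is swapped in because the source does not say which parallel backend ETIA uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fold(train_sets, algorithm, parameters, fold):
    """Hypothetical stand-in for run_cd_algorithm: apply the algorithm to one fold."""
    return algorithm(train_sets[fold], **parameters)


def run_folds(train_sets, algorithm, parameters, folds_to_run=1, n_jobs=1):
    """Run the first `folds_to_run` folds with up to `n_jobs` worker threads."""
    folds = range(min(folds_to_run, len(train_sets)))
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # pool.map returns results in fold order, regardless of finish order
        return list(
            pool.map(lambda f: run_fold(train_sets, algorithm, parameters, f), folds)
        )


# Toy "algorithm" that only counts training rows, to keep the sketch runnable
def count_rows(rows, scale=1):
    return len(rows) * scale


results = run_folds([[1, 2, 3], [4, 5]], count_rows, {"scale": 2},
                    folds_to_run=2, n_jobs=2)
```

In the toy run, each entry of results corresponds to one fold, mirroring the one-result-per-fold list described above.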