ETIA.CausalLearning.CDHPO.OCT package

Submodules

ETIA.CausalLearning.CDHPO.OCT.OCT module

class OCT(oct_params: Any, data: Any, results_folder: str, verbose=False)[source]

Bases: CDHPOBase

Class for performing the Out-of-sample Causal Tuning (OCT) procedure.

Parameters:

oct_params (CDHPOParameters) – Object containing the parameters required for the OCT procedure.
data (Dataset) – Data to be used for the OCT procedure.
results_folder (str) – Path to the folder where results will be saved.
verbose (bool, optional) – If True, enables verbose logging. Default is False.

run()[source]: Executes the OCT procedure.

run_new()[source]: Continues the OCT procedure with new configurations.

find_best_config(algorithms)[source]: Finds the best configuration among specified algorithms.

save_progress()[source]: Saves the current state of the OCT object to a file.

load_progress(path)[source]: Loads the OCT object state from a file.

fold_fit(target, c, mec_graphs_configs, train_indexes, test_indexes, fold)[source]: Performs Markov boundary identification and predictive modeling for a specific fold.

nodes_parallel(target, c, mec_graphs_configs, train_indexes, test_indexes)[source]: Calculates the mutual information between the true values and predicted values of a target node in parallel.

config_parallel(c, mec_graphs_configs, train_indexes, test_indexes)[source]: Calculates the mutual information scores for all target nodes in parallel.

permutations(node, poolYhat_best, poolYhat_cur, idxs, poolY)[source]: Calculates the mutual information scores after swapping predictions between best and current configurations.

permutations_nodes(node, c)[source]: Computes the swapped mutual information scores for a single node across all permutations.

calculate_pvalues(c)[source]: Calculates p-values to compare the current configuration with the best one.

save_progress()[source]: Saves the current state of the OCT object to a file.

static load_progress(path: str) → OCT[source]

Loads the OCT object state from a file.

Parameters:

path (str) – The file path to load the progress from.

Returns:

OCT – The loaded OCT object.
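The save/load pair follows a common checkpointing pattern: an instance method writes the object's state to disk, and a static method rebuilds an object from that file. The on-disk format is not specified in this reference, so the sketch below uses JSON purely for illustration. The class TunerState, its scores attribute, and the file name progress.json are hypothetical stand-ins, not part of ETIA.

```python
import json
import os
import tempfile


class TunerState:
    """Hypothetical stand-in for an object that checkpoints its own state."""

    def __init__(self, results_folder):
        self.results_folder = results_folder
        self.scores = []

    def save_progress(self, filename="progress.json"):
        # Write every attribute of the object to a file in results_folder.
        path = os.path.join(self.results_folder, filename)
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(self.__dict__, fh)
        return path

    @staticmethod
    def load_progress(path):
        # Rebuild an object from the saved attributes.
        with open(path, "r", encoding="utf-8") as fh:
            state = json.load(fh)
        restored = TunerState(state["results_folder"])
        restored.__dict__.update(state)
        return restored


with tempfile.TemporaryDirectory() as folder:
    tuner = TunerState(folder)
    tuner.scores.append(0.42)
    saved_path = tuner.save_progress()
    restored = TunerState.load_progress(saved_path)

restored_scores = restored.scores
```

Because load_progress is static, it can be called on the class itself before any instance exists, which is what makes resuming an interrupted run possible.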

fold_fit(target: int, c: int, mec_graphs_configs: List[Any], train_indexes: List[ndarray], test_indexes: List[ndarray], fold: int) → Tuple[ndarray, ndarray, ndarray][source]

Performs Markov boundary identification and predictive modeling for a specific fold of a target variable.

Parameters:

target (int) – Target variable index.
c (int) – Configuration index.
mec_graphs_configs (list) – MEC graphs configurations.
train_indexes (list) – List of training indices for each fold.
test_indexes (list) – List of testing indices for each fold.
fold (int) – Fold index.

Returns:

mb (np.ndarray) – Markov boundary indices.
prediction (np.ndarray) – Predicted values.
y_test (np.ndarray) – Actual target values for the test data.
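As a rough illustration of what one fold involves, the sketch below finds the Markov boundary of a target in a plain DAG (parents, children, and other parents of the children), fits an ordinary least-squares model on the training rows, and predicts the test rows. It makes several assumptions that are not taken from the library: the graph is a DAG rather than a MEC graph, adj[i, j] == 1 encodes the edge i -> j, and the predictive model is linear. The names markov_boundary and fit_fold are hypothetical.

```python
import numpy as np


def markov_boundary(adj, target):
    """Parents, children, and spouses of `target` in a DAG.

    Assumes adj[i, j] == 1 encodes the edge i -> j (a convention chosen
    for this sketch, not taken from the library).
    """
    parents = np.flatnonzero(adj[:, target])
    children = np.flatnonzero(adj[target, :])
    # Spouses: any node with an edge into one of the target's children.
    spouses = np.flatnonzero(adj[:, children].any(axis=1))
    mb = np.union1d(np.union1d(parents, children), spouses)
    return mb[mb != target]


def fit_fold(data, adj, target, train_idx, test_idx):
    """Fit a linear model on the Markov boundary and predict the test rows."""
    mb = markov_boundary(adj, target)

    def design(rows):
        # Intercept column followed by the Markov boundary variables.
        return np.column_stack([np.ones(len(rows)), data[np.ix_(rows, mb)]])

    coef, *_ = np.linalg.lstsq(design(train_idx), data[train_idx, target], rcond=None)
    prediction = design(test_idx) @ coef
    return mb, prediction, data[test_idx, target]


# Toy graph: 0 -> 1 -> 2 <- 3, with target variable 1.
rng = np.random.default_rng(0)
n = 200
x0 = rng.normal(size=n)
x3 = rng.normal(size=n)
x1 = 2 * x0 + 0.1 * rng.normal(size=n)
x2 = x1 + x3 + 0.1 * rng.normal(size=n)
data = np.column_stack([x0, x1, x2, x3])

adj = np.zeros((4, 4), dtype=int)
adj[0, 1] = adj[1, 2] = adj[3, 2] = 1

train_idx = np.arange(150)
test_idx = np.arange(150, 200)
mb, prediction, y_test = fit_fold(data, adj, 1, train_idx, test_idx)
```

In the toy graph, variable 3 enters the Markov boundary of variable 1 only because both are parents of variable 2.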

nodes_parallel(target: int, c: int, mec_graphs_configs: List[Any], train_indexes: List[ndarray], test_indexes: List[ndarray]) → Tuple[float, List[ndarray], List[ndarray], List[ndarray]][source]

Calculates the mutual information between the true values and predicted values of a target node in parallel.

Parameters:

target (int) – Target variable index.
c (int) – Configuration index.
mec_graphs_configs (list) – MEC graphs configurations.
train_indexes (list) – List of training indices for each fold.
test_indexes (list) – List of testing indices for each fold.

Returns:

mu (float) – Mutual information score between the true values and predicted values.
mb_folds (list) – List of Markov boundaries for each fold.
pred_folds (list) – List of predictions for each fold.
y_test_folds (list) – List of true values for each fold.
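The return values suggest a pattern in which each fold is fitted independently, the per-fold predictions are kept, and a single score is computed from them. The sketch below shows one way this could look, assuming the folds are run in a thread pool, the predictions are pooled across folds before scoring, and the score is the Gaussian mutual information. The function names and the one-predictor linear model are hypothetical and serve only to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def fit_one_fold(x, y, train_idx, test_idx):
    """Fit y ~ x on the training rows and predict the test rows."""
    slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)
    return slope * x[test_idx] + intercept, y[test_idx]


def pooled_score(x, y, train_indexes, test_indexes):
    # Run the folds concurrently; map() preserves the fold order.
    with ThreadPoolExecutor() as pool:
        results = list(
            pool.map(
                lambda tr, te: fit_one_fold(x, y, tr, te),
                train_indexes,
                test_indexes,
            )
        )
    pred_folds = [pred for pred, _ in results]
    y_test_folds = [true for _, true in results]

    # Pool all out-of-sample predictions, then score them once.
    pooled_pred = np.concatenate(pred_folds)
    pooled_true = np.concatenate(y_test_folds)
    rho = np.corrcoef(pooled_true, pooled_pred)[0, 1]
    mu = -0.5 * np.log(1.0 - min(rho ** 2, 1.0 - 1e-12))
    return mu, pred_folds, y_test_folds


rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 3 * x + rng.normal(scale=0.5, size=n)

indices = np.arange(n)
test_indexes = np.array_split(indices, 5)
train_indexes = [np.setdiff1d(indices, fold) for fold in test_indexes]

mu, pred_folds, y_test_folds = pooled_score(x, y, train_indexes, test_indexes)
```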

config_parallel(c: int, mec_graphs_configs: List[Any], train_indexes: List[ndarray], test_indexes: List[ndarray]) → Tuple[ndarray, List[List[ndarray]], List[List[ndarray]], List[List[ndarray]]][source]

Calculates the mutual information scores for all target nodes in parallel.

Parameters:

c (int) – Configuration index.
mec_graphs_configs (list) – MEC graphs configurations.
train_indexes (list) – List of training indices for each fold.
test_indexes (list) – List of testing indices for each fold.

Returns:

mu_list (np.ndarray) – Array of mutual information scores for each target node.
mb_list (list) – List of Markov boundaries for each target node.
pred_list (list) – List of predictions for each target node.
y_test_list (list) – List of true values for each target node.

permutations(node: int, poolYhat_best: ndarray, poolYhat_cur: ndarray, idxs: ndarray, poolY: ndarray) → Tuple[float, float][source]

Calculates the mutual information scores after swapping predictions between best and current configurations.

Parameters:

node (int) – Node index.
poolYhat_best (np.ndarray) – Predictions from the best configuration.
poolYhat_cur (np.ndarray) – Predictions from the current configuration.
idxs (np.ndarray) – Indices for permutation.
poolY (np.ndarray) – Actual target values.

Returns:

x (float) – Mutual information score for the best configuration after swap.
y (float) – Mutual information score for the current configuration after swap.
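A minimal sketch of the swap step, assuming idxs selects the samples whose predictions are exchanged between the two configurations and that the score is the Gaussian mutual information with the true values. Both points are assumptions for illustration, and the helper names gaussian_mi and swap_and_score are hypothetical.

```python
import numpy as np


def gaussian_mi(y, y_hat):
    """Mutual information (nats) under a joint Gaussian assumption."""
    rho = np.corrcoef(y, y_hat)[0, 1]
    return -0.5 * np.log(1.0 - min(rho ** 2, 1.0 - 1e-12))


def swap_and_score(pool_yhat_best, pool_yhat_cur, idxs, pool_y):
    """Exchange predictions at `idxs`, then score both prediction vectors."""
    swapped_best = pool_yhat_best.copy()
    swapped_cur = pool_yhat_cur.copy()
    swapped_best[idxs] = pool_yhat_cur[idxs]
    swapped_cur[idxs] = pool_yhat_best[idxs]
    return gaussian_mi(pool_y, swapped_best), gaussian_mi(pool_y, swapped_cur)


rng = np.random.default_rng(2)
pool_y = rng.normal(size=300)
yhat_best = pool_y + 0.3 * rng.normal(size=300)  # accurate predictions
yhat_cur = pool_y + 1.0 * rng.normal(size=300)   # noisier predictions

# Swapping nothing leaves the scores unchanged; swapping everything exchanges them.
x_none, y_none = swap_and_score(yhat_best, yhat_cur, np.array([], dtype=int), pool_y)
x_all, y_all = swap_and_score(yhat_best, yhat_cur, np.arange(300), pool_y)
```

If the two configurations predict equally well, exchanging a random subset of predictions should not change the scores systematically, which is what makes the swapped scores usable as a null distribution.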

permutations_nodes(node: int, c: int) → Tuple[ndarray, ndarray][source]

Computes the swapped mutual information scores for a single node across all permutations.

Parameters:

node (int) – Node index.
c (int) – Configuration index.

Returns:

swap_best_metric (np.ndarray) – Array of mutual information scores for the best configuration after swaps.
swap_cur_metric (np.ndarray) – Array of mutual information scores for the current configuration after swaps.

calculate_pvalues(c: int)[source]

Calculates p-values to compare the current configuration with the best one.

Parameters:

c (int) – Configuration index.
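One standard way to turn swapped scores into a p-value is a one-sided permutation test: count how often the score difference under random swaps is at least as large as the observed difference. The sketch below shows that calculation with an add-one correction. Whether the library uses this exact statistic is an assumption, and permutation_pvalue is a hypothetical name.

```python
import numpy as np


def permutation_pvalue(obs_best, obs_cur, swap_best, swap_cur):
    """One-sided p-value for 'best scores higher than current'."""
    observed_diff = obs_best - obs_cur
    null_diffs = np.asarray(swap_best) - np.asarray(swap_cur)
    # Add-one correction keeps the p-value strictly positive.
    exceed = np.sum(null_diffs >= observed_diff)
    return (1 + exceed) / (1 + null_diffs.size)


swap_best_metric = np.array([0.50, 0.52, 0.48, 0.51])
swap_cur_metric = np.array([0.51, 0.50, 0.50, 0.49])

# A large observed gap is never matched by the swapped differences.
p_clear = permutation_pvalue(0.6, 0.4, swap_best_metric, swap_cur_metric)
# No observed gap: two of the four swapped differences are at least as large.
p_tie = permutation_pvalue(0.5, 0.5, swap_best_metric, swap_cur_metric)
```

A small p-value indicates that the current configuration is significantly worse than the best one. A large p-value means the two cannot be distinguished on this data.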

run() → Tuple[Dict[str, Any], ndarray, Any][source]

Executes the OCT procedure.

Returns:

opt_config (dict) – The optimal configuration found.
matrix_mec_graph (np.ndarray) – The MEC graph matrix of the optimal configuration.
matrix_graph (np.ndarray) – The graph matrix of the optimal configuration.
library_results (Any) – Results from the causal discovery library.

run_new() → Tuple[Dict[str, Any], ndarray, Any][source]

Continues the OCT procedure with new configurations.

Returns:

opt_config (dict) – The optimal configuration found.
matrix_mec_graph (np.ndarray) – The MEC graph matrix of the optimal configuration.
library_results (Any) – Results from the causal discovery library.

find_best_config(algorithms: List[str]) → Tuple[Dict[str, Any], ndarray, Any][source]

Finds the best configuration among specified algorithms.

Parameters:

algorithms (list) – List of algorithm names to consider.

Returns:

best_config (dict) – The best configuration among the specified algorithms.
matrix_mec_graph (np.ndarray) – The MEC graph matrix of the best configuration.
library_results (Any) – Results from the causal discovery library.

Raises:

RuntimeError – If no configurations have been run for the specified algorithms.
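The selection logic can be sketched as a filter followed by an argmax, with an error when the filter leaves nothing. The record layout below (dictionaries with "config" and "score" keys, and a "name" field inside the configuration) is hypothetical, as are the algorithm names and the assumption that a higher score is better.

```python
def best_config_for(results, algorithms):
    """Return the highest-scoring configuration among the given algorithms."""
    candidates = [r for r in results if r["config"]["name"] in algorithms]
    if not candidates:
        raise RuntimeError(
            "No configurations have been run for the specified algorithms."
        )
    return max(candidates, key=lambda r: r["score"])["config"]


# Hypothetical results of earlier runs.
results = [
    {"config": {"name": "algo_a", "alpha": 0.05}, "score": 0.61},
    {"config": {"name": "algo_a", "alpha": 0.01}, "score": 0.66},
    {"config": {"name": "algo_b", "penalty": 2}, "score": 0.70},
]

best_a = best_config_for(results, ["algo_a"])
best_any = best_config_for(results, ["algo_a", "algo_b"])
```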

ETIA.CausalLearning.CDHPO.OCT.utils module

is_dict_in_array(dictionary, array)[source]

Checks whether a dictionary is already present in an array of dictionaries.

Parameters:

dictionary (dict) – The dictionary to look for.
array (list) – The array of dictionaries to search.

Returns:

True if the dictionary is in the array, False otherwise.

Return type:

bool
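A check of this kind can be written with plain dictionary equality, as in the sketch below. It is an illustration under the assumption that configurations hold only simple values, not the library's implementation. The name dict_in_list and the example configurations are hypothetical.

```python
def dict_in_list(dictionary, array):
    """Return True if an equal dictionary is already present in `array`."""
    return any(dictionary == item for item in array)


# Hypothetical configurations that have already been evaluated.
configs = [
    {"name": "algo_a", "alpha": 0.05},
    {"name": "algo_b", "penalty": 2},
]

# Key order does not matter for dictionary equality.
already_run = dict_in_list({"alpha": 0.05, "name": "algo_a"}, configs)
not_run_yet = dict_in_list({"name": "algo_a", "alpha": 0.01}, configs)
```

Plain equality is not enough when values are NumPy arrays, because comparing arrays with == returns an array rather than a single boolean.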

mutual_info_continuous(y, y_hat)[source]

Computes the mutual information between two continuous variables, assuming a Gaussian distribution.

Parameters:

y (numpy array) – Vector of true values.
y_hat (numpy array) – Vector of predicted values.

Returns:

mutual_info (float) – Mutual information of y and y_hat.
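For two jointly Gaussian variables with Pearson correlation rho, the mutual information has the closed form I = -0.5 * ln(1 - rho^2), measured in nats. The sketch below computes that quantity from the sample correlation. It assumes this closed form is what the Gaussian assumption refers to, and the name gaussian_mutual_info is hypothetical.

```python
import numpy as np


def gaussian_mutual_info(y, y_hat):
    """Mutual information (nats) of two jointly Gaussian variables."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    # A constant vector carries no information about the other variable.
    if y.std() == 0 or y_hat.std() == 0:
        return 0.0
    rho = np.corrcoef(y, y_hat)[0, 1]
    # Clip so that a perfect correlation does not produce log(0).
    rho2 = min(rho ** 2, 1.0 - 1e-12)
    return -0.5 * np.log(1.0 - rho2)


rng = np.random.default_rng(0)
y = rng.normal(size=500)
noisy = y + rng.normal(scale=0.5, size=500)  # informative predictions
unrelated = rng.normal(size=500)             # uninformative predictions

mi_noisy = gaussian_mutual_info(y, noisy)
mi_unrelated = gaussian_mutual_info(y, unrelated)
mi_constant = gaussian_mutual_info(y, np.zeros(500))
```

The score grows without bound as the predictions approach a perfect linear relation with the true values, which is why the squared correlation is clipped just below 1.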