ETIA.CausalLearning.CDHPO.OCT package

Submodules

ETIA.CausalLearning.CDHPO.OCT.OCT module

class OCT(oct_params: Any, data: Any, results_folder: str, verbose=False)[source]

Bases: CDHPOBase

Class for performing the Out-of-sample Causal Tuning (OCT) procedure.

Parameters:
  • oct_params (CDHPOParameters) – Object containing the parameters required for the OCT procedure.

  • data (Dataset) – Data to be used for the OCT procedure.

  • results_folder (str) – Path to the folder where results will be saved.

  • verbose (bool, optional) – If True, enables verbose logging. Default is False.

run()[source]

Executes the OCT procedure.

run_new()[source]

Continues the OCT procedure with new configurations.

find_best_config(algorithms)[source]

Finds the best configuration among specified algorithms.

save_progress()[source]

Saves the current state of the OCT object to a file.

load_progress(path)[source]

Loads the OCT object state from a file.

fold_fit(target, c, mec_graphs_configs, train_indexes, test_indexes, fold)[source]

Performs Markov boundary identification and predictive modeling for a specific fold.

nodes_parallel(target, c, mec_graphs_configs, train_indexes, test_indexes)[source]

Calculates the mutual information between the true values and predicted values of a target node in parallel.

config_parallel(c, mec_graphs_configs, train_indexes, test_indexes)[source]

Calculates the mutual information scores for all target nodes in parallel.

permutations(node, poolYhat_best, poolYhat_cur, idxs, poolY)[source]

Calculates the mutual information scores after swapping predictions between best and current configurations.

permutations_nodes(node, c)[source]

Computes the swapped mutual information scores for a single node across all permutations.

calculate_pvalues(c)[source]

Calculates p-values to compare the current configuration with the best one.

save_progress()[source]

Saves the current state of the OCT object to a file.

static load_progress(path: str) OCT[source]

Loads the OCT object state from a file.

Parameters:

path (str) – The file path to load the progress from.

Returns:

The loaded OCT object.

Return type:

OCT
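The save/load round trip can be sketched as follows. This is a minimal illustration, assuming the state is serialized with Python's pickle module; the source does not say which format OCT uses. The class TunerState, its attributes, and the file name progress.pkl are hypothetical stand-ins, not part of ETIA.

```python
import os
import pickle
import tempfile


class TunerState:
    """Hypothetical stand-in for an OCT object; not the ETIA class."""

    def __init__(self, results_folder):
        self.results_folder = results_folder
        self.configs_run = []  # configurations evaluated so far

    def save_progress(self):
        # Assumption: the instance attributes are pickled into results_folder.
        # Returning the path is a convenience of this sketch only.
        path = os.path.join(self.results_folder, "progress.pkl")
        with open(path, "wb") as f:
            pickle.dump(self.__dict__, f)
        return path

    @staticmethod
    def load_progress(path):
        # Static, so it can be called before any instance exists.
        with open(path, "rb") as f:
            state = pickle.load(f)
        obj = TunerState.__new__(TunerState)
        obj.__dict__.update(state)
        return obj


with tempfile.TemporaryDirectory() as folder:
    state = TunerState(folder)
    state.configs_run.append({"alg": "pc", "alpha": 0.05})
    restored = TunerState.load_progress(state.save_progress())
```

Because load_progress is static, a previous run can be resumed without first constructing a new object.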

fold_fit(target: int, c: int, mec_graphs_configs: List[Any], train_indexes: List[ndarray], test_indexes: List[ndarray], fold: int) Tuple[ndarray, ndarray, ndarray][source]

Performs Markov boundary identification and predictive modeling for a specific fold of a target variable.

Parameters:
  • target (int) – Target variable index.

  • c (int) – Configuration index.

  • mec_graphs_configs (list) – MEC graphs configurations.

  • train_indexes (list) – List of training indices for each fold.

  • test_indexes (list) – List of testing indices for each fold.

  • fold (int) – Fold index.

Returns:

  • mb (np.ndarray) – Markov boundary indices.

  • prediction (np.ndarray) – Predicted values.

  • y_test (np.ndarray) – Actual target values for the test data.
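The per-fold step can be sketched as below. This is an illustration only: it assumes the Markov boundary is already known and passed in as column indices, and it uses ordinary least squares as the predictive model. In the library the boundary is derived from the MEC graph of configuration c, and the actual model may differ. The name fold_fit_sketch and its arguments are hypothetical.

```python
import numpy as np


def fold_fit_sketch(data, target, mb, train_idx, test_idx):
    """Fit a linear model on the Markov boundary columns for one fold."""
    # Design matrices with an intercept column.
    X_train = np.column_stack([np.ones(len(train_idx)), data[np.ix_(train_idx, mb)]])
    X_test = np.column_stack([np.ones(len(test_idx)), data[np.ix_(test_idx, mb)]])
    y_train = data[train_idx, target]
    y_test = data[test_idx, target]
    # Least-squares fit on the training rows only.
    coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    prediction = X_test @ coef
    return np.asarray(mb), prediction, y_test


# Synthetic data: column 2 depends on columns 0 and 1.
rng = np.random.default_rng(0)
n = 200
x0 = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = 2.0 * x0 - x1 + 0.1 * rng.normal(size=n)
data = np.column_stack([x0, x1, x2])

mb, prediction, y_test = fold_fit_sketch(
    data, target=2, mb=[0, 1],
    train_idx=np.arange(0, 150), test_idx=np.arange(150, 200),
)
```

The three returned values mirror the documented tuple: the Markov boundary indices, the out-of-sample predictions, and the held-out target values.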

nodes_parallel(target: int, c: int, mec_graphs_configs: List[Any], train_indexes: List[ndarray], test_indexes: List[ndarray]) Tuple[float, List[ndarray], List[ndarray], List[ndarray]][source]

Calculates the mutual information between the true values and predicted values of a target node in parallel.

Parameters:
  • target (int) – Target variable index.

  • c (int) – Configuration index.

  • mec_graphs_configs (list) – MEC graphs configurations.

  • train_indexes (list) – List of training indices for each fold.

  • test_indexes (list) – List of testing indices for each fold.

Returns:

  • mu (float) – Mutual information score between the true values and predicted values.

  • mb_folds (list) – List of Markov boundaries for each fold.

  • pred_folds (list) – List of predictions for each fold.

  • y_test_folds (list) – List of true values for each fold.
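A rough sketch of how a per-node score could be assembled from fold results is shown below. It assumes that the work is parallelized over folds, that predictions are pooled across folds before scoring, and that the score is the Gaussian mutual information; none of these details is confirmed by the source. A thread pool from the standard library stands in for whatever parallel backend the library uses, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def gaussian_mi(y, y_hat):
    """Mutual information of two jointly Gaussian variables, in nats."""
    rho = np.corrcoef(y, y_hat)[0, 1]
    return -0.5 * np.log(1.0 - min(rho ** 2, 1.0 - 1e-12))


def fit_one_fold(x, y, train_idx, test_idx):
    """One-feature least-squares fit, standing in for fold_fit."""
    slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)
    return slope * x[test_idx] + intercept, y[test_idx]


def node_score(x, y, train_indexes, test_indexes):
    # One task per fold; results come back in fold order.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(
            lambda tt: fit_one_fold(x, y, tt[0], tt[1]),
            zip(train_indexes, test_indexes),
        ))
    pred_folds = [r[0] for r in results]
    y_test_folds = [r[1] for r in results]
    # Pool the out-of-sample predictions before scoring.
    mu = gaussian_mi(np.concatenate(y_test_folds), np.concatenate(pred_folds))
    return mu, pred_folds, y_test_folds


rng = np.random.default_rng(1)
x = rng.normal(size=120)
y = 1.5 * x + 0.5 * rng.normal(size=120)
test_indexes = np.array_split(rng.permutation(120), 4)
train_indexes = [
    np.concatenate([f for j, f in enumerate(test_indexes) if j != i])
    for i in range(4)
]
mu, pred_folds, y_test_folds = node_score(x, y, train_indexes, test_indexes)
```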

config_parallel(c: int, mec_graphs_configs: List[Any], train_indexes: List[ndarray], test_indexes: List[ndarray]) Tuple[ndarray, List[List[ndarray]], List[List[ndarray]], List[List[ndarray]]][source]

Calculates the mutual information scores for all target nodes in parallel.

Parameters:
  • c (int) – Configuration index.

  • mec_graphs_configs (list) – MEC graphs configurations.

  • train_indexes (list) – List of training indices for each fold.

  • test_indexes (list) – List of testing indices for each fold.

Returns:

  • mu_list (np.ndarray) – Array of mutual information scores for each target node.

  • mb_list (list) – List of Markov boundaries for each target node.

  • pred_list (list) – List of predictions for each target node.

  • y_test_list (list) – List of true values for each target node.

permutations(node: int, poolYhat_best: ndarray, poolYhat_cur: ndarray, idxs: ndarray, poolY: ndarray) Tuple[float, float][source]

Calculates the mutual information scores after swapping predictions between best and current configurations.

Parameters:
  • node (int) – Node index.

  • poolYhat_best (np.ndarray) – Predictions from the best configuration.

  • poolYhat_cur (np.ndarray) – Predictions from the current configuration.

  • idxs (np.ndarray) – Indices for permutation.

  • poolY (np.ndarray) – Actual target values.

Returns:

  • x (float) – Mutual information score for the best configuration after swap.

  • y (float) – Mutual information score for the current configuration after swap.
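The swap described above can be sketched as follows. The sketch assumes that idxs selects the pooled samples whose predictions are exchanged between the two configurations, and that the score is the Gaussian mutual information; the node argument is omitted because it plays no role in this simplified version. The function names are hypothetical.

```python
import numpy as np


def gaussian_mi(y, y_hat):
    """Mutual information of two jointly Gaussian variables, in nats."""
    rho = np.corrcoef(y, y_hat)[0, 1]
    return -0.5 * np.log(1.0 - min(rho ** 2, 1.0 - 1e-12))


def swap_and_score(pool_yhat_best, pool_yhat_cur, idxs, pool_y):
    """Exchange predictions at idxs, then score both swapped vectors."""
    swapped_best = pool_yhat_best.copy()
    swapped_cur = pool_yhat_cur.copy()
    swapped_best[idxs] = pool_yhat_cur[idxs]
    swapped_cur[idxs] = pool_yhat_best[idxs]
    return gaussian_mi(pool_y, swapped_best), gaussian_mi(pool_y, swapped_cur)


rng = np.random.default_rng(2)
pool_y = rng.normal(size=300)
yhat_best = pool_y + 0.2 * rng.normal(size=300)  # accurate predictions
yhat_cur = pool_y + 1.0 * rng.normal(size=300)   # noisier predictions
idxs = rng.random(300) < 0.5                     # boolean mask: swap about half
x, y = swap_and_score(yhat_best, yhat_cur, idxs, pool_y)
```

Swapping no samples reproduces the original scores, and swapping every sample exchanges them; random swaps in between generate the null distribution used for comparison.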

permutations_nodes(node: int, c: int) Tuple[ndarray, ndarray][source]

Computes the swapped mutual information scores for a single node across all permutations.

Parameters:
  • node (int) – Node index.

  • c (int) – Configuration index.

Returns:

  • swap_best_metric (np.ndarray) – Array of mutual information scores for the best configuration after swaps.

  • swap_cur_metric (np.ndarray) – Array of mutual information scores for the current configuration after swaps.

calculate_pvalues(c: int)[source]

Calculates p-values to compare the current configuration with the best one.

Parameters:

c (int) – Configuration index.
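One common way to turn swapped scores into a p-value is sketched below. It is an assumption, not the documented behavior: the test is one-sided on the difference between the best and current scores, and an add-one correction keeps the p-value strictly positive. The function name and its arguments are hypothetical.

```python
import numpy as np


def permutation_pvalue(observed_best, observed_cur, swap_best_metric, swap_cur_metric):
    """One-sided permutation p-value for 'best is better than current'."""
    observed_diff = observed_best - observed_cur
    null_diffs = np.asarray(swap_best_metric) - np.asarray(swap_cur_metric)
    # Count swapped differences at least as large as the observed one.
    extreme = np.sum(null_diffs >= observed_diff)
    return (1 + extreme) / (1 + len(null_diffs))


# Nine swapped differences; the current scores are zero for simplicity.
swap_best = np.array([-0.2, -0.1, 0.0, 0.1, 0.2, 0.05, -0.05, 0.15, -0.15])
swap_cur = np.zeros(9)

p_clear = permutation_pvalue(0.3, 0.0, swap_best, swap_cur)  # no swap reaches 0.3
p_tied = permutation_pvalue(0.0, 0.0, swap_best, swap_cur)   # five swaps reach 0.0
```

A small p-value suggests the current configuration is significantly worse than the best one; a large value means the two cannot be distinguished on this data.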

run() Tuple[Dict[str, Any], ndarray, Any][source]

Executes the OCT procedure.

Returns:

  • opt_config (dict) – The optimal configuration found.

  • matrix_mec_graph (np.ndarray) – The MEC graph matrix of the optimal configuration.

  • matrix_graph (np.ndarray) – The graph matrix of the optimal configuration.

  • library_results (Any) – Results from the causal discovery library.

run_new() Tuple[Dict[str, Any], ndarray, Any][source]

Continues the OCT procedure with new configurations.

Returns:

  • opt_config (dict) – The optimal configuration found.

  • matrix_mec_graph (np.ndarray) – The MEC graph matrix of the optimal configuration.

  • library_results (Any) – Results from the causal discovery library.

find_best_config(algorithms: List[str]) Tuple[Dict[str, Any], ndarray, Any][source]

Finds the best configuration among specified algorithms.

Parameters:

algorithms (list) – List of algorithm names to consider.

Returns:

  • best_config (dict) – The best configuration among the specified algorithms.

  • matrix_mec_graph (np.ndarray) – The MEC graph matrix of the best configuration.

  • library_results (Any) – Results from the causal discovery library.

Raises:

RuntimeError – If no configurations have been run for the specified algorithms.
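The selection and the error case can be sketched as below. The sketch assumes each evaluated configuration is stored as a dict with a "name" key holding the algorithm name, paired with a single score where higher is better. Both the storage layout and the selection rule are hypothetical; the library may use a different rule.

```python
def find_best_config_sketch(results, algorithms):
    """Pick the highest-scoring configuration among the named algorithms.

    results: list of (config dict, score) pairs; a hypothetical layout.
    """
    candidates = [(cfg, score) for cfg, score in results if cfg["name"] in algorithms]
    if not candidates:
        raise RuntimeError(
            "No configurations have been run for: " + ", ".join(algorithms)
        )
    best_cfg, _ = max(candidates, key=lambda pair: pair[1])
    return best_cfg


results = [
    ({"name": "pc", "alpha": 0.05}, 0.41),
    ({"name": "pc", "alpha": 0.01}, 0.47),
    ({"name": "fges", "penalty": 2}, 0.52),
]
best_pc = find_best_config_sketch(results, ["pc"])

try:
    find_best_config_sketch(results, ["lingam"])
    raised = False
except RuntimeError:
    raised = True
```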

ETIA.CausalLearning.CDHPO.OCT.utils module

is_dict_in_array(dictionary, array)[source]

Check if a dictionary is already in an array of dictionaries.

Parameters:
  • dictionary (dict) – the dictionary to check

  • array (list) – the array of dictionaries to check

Returns:

True if the dictionary is in the array, False otherwise

Return type:

bool
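A check with this behavior can be sketched in one line. This is an illustrative version, not the library's source; it relies on Python's dict equality, which compares keys and values and ignores insertion order.

```python
def is_dict_in_array_sketch(dictionary, array):
    """Return True if an equal dictionary already appears in array."""
    return any(dictionary == item for item in array)


seen = [{"name": "pc", "alpha": 0.05}, {"name": "fges", "penalty": 2}]
already_run = is_dict_in_array_sketch({"alpha": 0.05, "name": "pc"}, seen)
not_run = is_dict_in_array_sketch({"name": "pc", "alpha": 0.01}, seen)
```

A helper like this is useful for skipping configurations that have already been evaluated.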

mutual_info_continuous(y, y_hat)[source]

Computes the mutual information between two continuous variables, assuming a Gaussian distribution.

Parameters:
  • y (numpy array) – vector of true values

  • y_hat (numpy array) – vector of predicted values

Returns:

mutual information of y and y_hat

Return type:

float
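For two jointly Gaussian variables with correlation rho, the mutual information has the closed form I = -0.5 * ln(1 - rho^2). The sketch below implements that formula with NumPy. Whether the library computes the quantity in exactly this way is an assumption, and the clamp that avoids log(0) for perfectly correlated inputs is an addition of this sketch.

```python
import numpy as np


def mutual_info_gaussian(y, y_hat):
    """Mutual information (in nats) under a bivariate Gaussian assumption."""
    rho = np.corrcoef(y, y_hat)[0, 1]
    # Clamp rho^2 just below 1 so perfectly correlated inputs stay finite.
    rho2 = min(rho ** 2, 1.0 - 1e-12)
    return -0.5 * np.log(1.0 - rho2)


rng = np.random.default_rng(3)
y = rng.normal(size=5000)
good_pred = y + 0.3 * rng.normal(size=5000)  # informative predictions
noise_pred = rng.normal(size=5000)           # unrelated to y

mi_good = mutual_info_gaussian(y, good_pred)
mi_noise = mutual_info_gaussian(y, noise_pred)
```

Informative predictions give a clearly positive score, while predictions unrelated to the target give a score close to zero.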