ETIA.CausalLearning.CDHPO.OCT package
Submodules
ETIA.CausalLearning.CDHPO.OCT.OCT module
- class OCT(oct_params: Any, data: Any, results_folder: str, verbose=False)[source]
Bases:
CDHPOBase
Class for performing the Order-Based Causal Transfer (OCT) procedure.
- Parameters:
oct_params (CDHPOParameters) – Object containing the parameters required for the OCT procedure.
data (Dataset) – Data to be used for the OCT procedure.
results_folder (str) – Path to the folder where results will be saved.
verbose (bool, optional) – If True, enables verbose logging. Default is False.
- static load_progress(path: str) OCT[source]
Loads the OCT object state from a file.
- Parameters:
path (str) – The file path to load the progress from.
- Returns:
The loaded OCT object.
- Return type:
OCT
- fold_fit(target: int, c: int, mec_graphs_configs: List[Any], train_indexes: List[ndarray], test_indexes: List[ndarray], fold: int) Tuple[ndarray, ndarray, ndarray][source]
Performs Markov boundary identification and predictive modeling for a specific fold of a target variable.
- Parameters:
target (int) – Target variable index.
c (int) – Configuration index.
mec_graphs_configs (list) – MEC graphs configurations.
train_indexes (list) – List of training indices for each fold.
test_indexes (list) – List of testing indices for each fold.
fold (int) – Fold index.
- Returns:
mb (np.ndarray) – Markov boundary indices.
prediction (np.ndarray) – Predicted values.
y_test (np.ndarray) – Actual target values for the test data.
- nodes_parallel(target: int, c: int, mec_graphs_configs: List[Any], train_indexes: List[ndarray], test_indexes: List[ndarray]) Tuple[float, List[ndarray], List[ndarray], List[ndarray]][source]
Calculates the mutual information between the true values and predicted values of a target node in parallel.
- Parameters:
target (int) – Target variable index.
c (int) – Configuration index.
mec_graphs_configs (list) – MEC graphs configurations.
train_indexes (list) – List of training indices for each fold.
test_indexes (list) – List of testing indices for each fold.
- Returns:
mu (float) – Mutual information score between the true values and predicted values.
mb_folds (list) – List of Markov boundaries for each fold.
pred_folds (list) – List of predictions for each fold.
y_test_folds (list) – List of true values for each fold.
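Example (illustrative sketch, not the library's implementation): the pooled-variable names used elsewhere in this class (poolY, poolYhat_best) suggest that per-fold outputs are concatenated before a single score is computed. The helper below, pool_folds, is a hypothetical name and assumes the fold outputs are plain Python sequences.

```python
from itertools import chain


def pool_folds(pred_folds, y_test_folds):
    """Concatenate per-fold predictions and true values into two flat lists.

    Hypothetical helper: assumes each fold contributes one sequence of
    predictions and one equally long sequence of true values.
    """
    pooled_pred = list(chain.from_iterable(pred_folds))
    pooled_true = list(chain.from_iterable(y_test_folds))
    if len(pooled_pred) != len(pooled_true):
        raise ValueError("predictions and true values differ in length")
    return pooled_pred, pooled_true


# Two folds: the first holds two test samples, the second holds one.
pooled_pred, pooled_true = pool_folds([[1.0, 2.0], [3.0]], [[1.5, 2.5], [2.0]])
```

Scoring the pooled vectors once, rather than averaging per-fold scores, keeps every test sample equally weighted when fold sizes differ.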
- config_parallel(c: int, mec_graphs_configs: List[Any], train_indexes: List[ndarray], test_indexes: List[ndarray]) Tuple[ndarray, List[List[ndarray]], List[List[ndarray]], List[List[ndarray]]][source]
Calculates the mutual information scores for all target nodes in parallel.
- Parameters:
c (int) – Configuration index.
mec_graphs_configs (list) – MEC graphs configurations.
train_indexes (list) – List of training indices for each fold.
test_indexes (list) – List of testing indices for each fold.
- Returns:
mu_list (np.ndarray) – Array of mutual information scores for each target node.
mb_list (list) – List of Markov boundaries for each target node.
pred_list (list) – List of predictions for each target node.
y_test_list (list) – List of true values for each target node.
- permutations(node: int, poolYhat_best: ndarray, poolYhat_cur: ndarray, idxs: ndarray, poolY: ndarray) Tuple[float, float][source]
Calculates the mutual information scores after swapping predictions between best and current configurations.
- Parameters:
node (int) – Node index.
poolYhat_best (np.ndarray) – Predictions from the best configuration.
poolYhat_cur (np.ndarray) – Predictions from the current configuration.
idxs (np.ndarray) – Indices for permutation.
poolY (np.ndarray) – Actual target values.
- Returns:
x (float) – Mutual information score for the best configuration after swap.
y (float) – Mutual information score for the current configuration after swap.
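Example (illustrative sketch, not the library's implementation): one way to read the swap step is that the predictions at positions idxs are exchanged between the two configurations and each swapped vector is then scored against the true values. The sketch assumes idxs holds integer positions and takes the scoring function as an argument; swap_and_score and neg_mse are hypothetical names, and neg_mse is only a stand-in for the mutual information score the method actually reports.

```python
def neg_mse(y_true, y_pred):
    """Stand-in score (higher is better): negative mean squared error."""
    return -sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def swap_and_score(pool_yhat_best, pool_yhat_cur, idxs, pool_y, metric=neg_mse):
    """Swap predictions at positions idxs, then score both swapped vectors."""
    swapped_best = list(pool_yhat_best)  # copy so the inputs stay untouched
    swapped_cur = list(pool_yhat_cur)
    for i in idxs:
        swapped_best[i], swapped_cur[i] = swapped_cur[i], swapped_best[i]
    return metric(pool_y, swapped_best), metric(pool_y, swapped_cur)


# Swap only the first prediction between the two configurations.
x, y = swap_and_score([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [0], [1.0, 2.0, 3.0])
```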
- permutations_nodes(node: int, c: int) Tuple[ndarray, ndarray][source]
Performs permutations for a single node across all permutations.
- Parameters:
node (int) – Node index.
c (int) – Configuration index.
- Returns:
swap_best_metric (np.ndarray) – Array of mutual information scores for the best configuration after swaps.
swap_cur_metric (np.ndarray) – Array of mutual information scores for the current configuration after swaps.
- calculate_pvalues(c: int)[source]
Calculates p-values to compare the current configuration with the best one.
- Parameters:
c (int) – Configuration index.
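Example (illustrative sketch, not the library's implementation): a common way to turn swapped scores into a p-value is a one-sided permutation test on the score difference between the best and the current configuration. The exact statistic used by calculate_pvalues is not stated here, so the formula below is an assumption and permutation_pvalue is a hypothetical name.

```python
def permutation_pvalue(observed_diff, swap_best_metric, swap_cur_metric):
    """One-sided permutation p-value for 'best scores higher than current'.

    observed_diff: score(best) - score(current) on the unswapped predictions.
    swap_*_metric: scores of each configuration after every random swap.
    """
    perm_diffs = [b - c for b, c in zip(swap_best_metric, swap_cur_metric)]
    # Count permutations at least as extreme as the observed difference.
    extreme = sum(1 for d in perm_diffs if d >= observed_diff)
    # The +1 terms count the observed arrangement itself, so p is never 0.
    return (extreme + 1) / (len(perm_diffs) + 1)


p = permutation_pvalue(0.5, [0.1, 0.6, 0.2, 0.9], [0.0, 0.0, 0.3, 0.1])
```

Under this reading, a large p-value means the current configuration is statistically indistinguishable from the best one.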
- run() Tuple[Dict[str, Any], ndarray, Any][source]
Executes the OCT procedure.
- Returns:
opt_config (dict) – The optimal configuration found.
matrix_mec_graph (np.ndarray) – The MEC graph matrix of the optimal configuration.
matrix_graph (np.ndarray) – The graph matrix of the optimal configuration.
library_results (Any) – Results from the causal discovery library.
- run_new() Tuple[Dict[str, Any], ndarray, Any][source]
Continues the OCT procedure with new configurations.
- Returns:
opt_config (dict) – The optimal configuration found.
matrix_mec_graph (np.ndarray) – The MEC graph matrix of the optimal configuration.
library_results (Any) – Results from the causal discovery library.
- find_best_config(algorithms: List[str]) Tuple[Dict[str, Any], ndarray, Any][source]
Finds the best configuration among specified algorithms.
- Parameters:
algorithms (list) – List of algorithm names to consider.
- Returns:
best_config (dict) – The best configuration among the specified algorithms.
matrix_mec_graph (np.ndarray) – The MEC graph matrix of the best configuration.
library_results (Any) – Results from the causal discovery library.
- Raises:
RuntimeError – If no configurations have been run for the specified algorithms.
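Example (illustrative sketch, not the library's implementation): the selection can be pictured as filtering the configurations already run by algorithm name and keeping the highest-scoring one. The "name" key, the parallel scores list, and the pick_best_config helper are all hypothetical; how OCT stores and ranks its configurations is not described here.

```python
def pick_best_config(algorithms, configs, scores):
    """Return the highest-scoring config whose algorithm name is requested.

    Assumes configs is a list of dicts with a "name" key and scores is a
    parallel list in which higher means better (both are assumptions).
    """
    wanted = set(algorithms)
    candidates = [i for i, cfg in enumerate(configs) if cfg["name"] in wanted]
    if not candidates:
        raise RuntimeError(
            "No configurations have been run for the specified algorithms."
        )
    best = max(candidates, key=lambda i: scores[i])
    return configs[best]


configs = [
    {"name": "pc", "alpha": 0.01},
    {"name": "ges"},
    {"name": "pc", "alpha": 0.05},
]
scores = [0.2, 0.9, 0.4]
best_pc = pick_best_config(["pc"], configs, scores)
```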
ETIA.CausalLearning.CDHPO.OCT.utils module
- is_dict_in_array(dictionary, array)[source]
Check if a dictionary is already in an array of dictionaries.
- Parameters:
dictionary (dict) – The dictionary to check.
array (list) – The array of dictionaries to search.
- Returns:
True if the dictionary is in the array, False otherwise.
- Return type:
bool
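Example (illustrative sketch, not the library's implementation): a membership check of this kind can rely on Python's dictionary equality, which compares keys and values and ignores insertion order. The function name dict_in_list and the sample keys are hypothetical.

```python
def dict_in_list(dictionary, array):
    """Return True if any dict in array equals dictionary (keys and values)."""
    return any(dictionary == item for item in array)


tried = [{"name": "pc", "alpha": 0.05}, {"name": "ges"}]
already_run = dict_in_list({"alpha": 0.05, "name": "pc"}, tried)
not_run_yet = dict_in_list({"name": "pc", "alpha": 0.01}, tried)
```

A check like this is useful for skipping configurations that have already been evaluated.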
- mutual_info_continuous(y, y_hat)[source]
Computes the mutual information between two continuous variables, assuming a Gaussian distribution.
- Parameters:
y (numpy array) – Vector of true values.
y_hat (numpy array) – Vector of predicted values.
- Returns:
mutual_info – Mutual information of y and y_hat.
- Return type:
float
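Example (illustrative sketch, not the library's implementation): for two jointly Gaussian variables with Pearson correlation rho, the mutual information has the closed form -0.5 * ln(1 - rho^2), measured in nats. The sketch below applies that formula with the standard library only; gaussian_mutual_info is a hypothetical name, and the clipping of rho^2 and the zero-variance handling are assumptions made to keep the logarithm finite.

```python
import math


def gaussian_mutual_info(y, y_hat):
    """Mutual information (nats) of two variables assumed jointly Gaussian."""
    n = len(y)
    mean_y = sum(y) / n
    mean_h = sum(y_hat) / n
    cov = sum((a - mean_y) * (b - mean_h) for a, b in zip(y, y_hat))
    var_y = sum((a - mean_y) ** 2 for a in y)
    var_h = sum((b - mean_h) ** 2 for b in y_hat)
    if var_y == 0 or var_h == 0:
        return 0.0  # a constant vector carries no information
    rho = cov / math.sqrt(var_y * var_h)
    # Clip so that perfectly correlated inputs do not produce log(0).
    rho_sq = min(rho * rho, 1.0 - 1e-12)
    return -0.5 * math.log(1.0 - rho_sq)


# Correlation is 0.8 here, so the result is -0.5 * ln(0.36).
mi = gaussian_mutual_info([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

Because the score depends only on the squared correlation, it is unchanged by shifting or rescaling the predictions.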