ETIA.CRV.causal_graph_utils package
Submodules
ETIA.CRV.causal_graph_utils.bidirected_path module
- bidirected_path(i, matrix)[source]
Recursive function that finds the nodes reachable from node i along any bidirected path.
Author: kbiza@csd.uoc.gr
- Parameters:
i (int) – the starting node (a single integer, not a list of integers)
matrix – matrix of size N*N, where N is the number of nodes in tetrad_graph
matrix(i, j) = 2 and matrix(j, i) = 3: i-->j
matrix(i, j) = 1 and matrix(j, i) = 1: io-oj
matrix(i, j) = 2 and matrix(j, i) = 2: i<->j
matrix(i, j) = 3 and matrix(j, i) = 3: i---j
matrix(i, j) = 2 and matrix(j, i) = 1: io->j
- Returns:
list_nodes (list) – the nodes that are reachable in any bidirected path starting from node i
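The recursion described above can be sketched as follows. This is an illustrative re-implementation, not the library's code: the name `reachable_by_bidirected_path` is hypothetical, nested lists stand in for the matrix (a numpy array indexed as `matrix[i][j]` behaves the same way), and excluding the start node from the result is an assumption.

```python
def reachable_by_bidirected_path(i, matrix):
    """Return the nodes reachable from i by following only i<->j edges."""
    seen = {i}

    def visit(node):
        for j in range(len(matrix)):
            # i<->j is encoded as an arrowhead (2) at both ends of the edge
            if j not in seen and matrix[node][j] == 2 and matrix[j][node] == 2:
                seen.add(j)
                visit(j)

    visit(i)
    # Assumption: the starting node itself is not reported
    return sorted(seen - {i})


# Example graph: 0<->1, 1<->2, 2-->3
example = [
    [0, 2, 0, 0],
    [2, 0, 2, 0],
    [0, 2, 0, 2],
    [0, 0, 3, 0],
]
print(reachable_by_bidirected_path(0, example))  # [1, 2]
```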
ETIA.CRV.causal_graph_utils.confidence_causal_findings module
- is_consistent_edge(m1_ij, m1_ji, m2_ij, m2_ji)[source]
Checks whether two edges are consistent.
- Parameters:
m1_ij (int) – notation of matrix1[i,j]
m1_ji (int) – notation of matrix1[j,i]
m2_ij (int) – notation of matrix2[i,j]
m2_ji (int) – notation of matrix2[j,i]
- Returns:
is_consistent (bool) – True or False
- compute_edge_weights(best_mec_matrix, bootstrapped_mec_matrices, all_edges=True, true_graph=None)[source]
Computes edge consistency and edge frequency for each edge.
- Parameters:
best_mec_matrix
bootstrapped_mec_matrices (list)
all_edges (bool) – if True, it checks all n(n-1)/2 possible edges and also evaluates missing edges; if False, it evaluates only the edges that appear in best_mec_matrix
true_graph
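The docstring does not define edge frequency. One plausible reading, shown here purely as an assumption, is the fraction of bootstrapped matrices in which a pair of nodes carries exactly the same edge marks as in the best matrix. The helper name `edge_frequency` is hypothetical and nested lists stand in for the matrices.

```python
def edge_frequency(best, bootstrapped, i, j):
    """Fraction of bootstrapped matrices whose (i, j) marks equal those in best.

    Assumption: "frequency" means an exact match of both edge marks.
    """
    target = (best[i][j], best[j][i])
    hits = sum(1 for m in bootstrapped if (m[i][j], m[j][i]) == target)
    return hits / len(bootstrapped)


best = [[0, 2], [3, 0]]                     # 0-->1
boots = [
    [[0, 2], [3, 0]],                       # 0-->1 (match)
    [[0, 2], [2, 0]],                       # 0<->1
    [[0, 2], [3, 0]],                       # 0-->1 (match)
    [[0, 0], [0, 0]],                       # no edge
]
print(edge_frequency(best, boots, 0, 1))    # 0.5
```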
- paths_metrics(best_mec_matrix, bootstrapped_mec_matrices, paths)[source]
Computes path consistency and path discovery for each path.
- Parameters:
best_mec_matrix (pandas Dataframe) – adjacency matrix of the graph (called opt_graph in the docstring)
bootstrapped_mec_matrices (list) – the bootstrapped graphs (called bootstrapped_graphs in the docstring)
paths (dictionary) – dictionary with lists of paths
- Returns:
path_consistency (dictionary) – consistency values based on the input paths dictionary
path_discovery (dictionary) – discovery values based on the input paths dictionary
ETIA.CRV.causal_graph_utils.cpdag_to_dag module
- cpdag_to_dag(cpdag_pd, verbose, n_lags=None)[source]
Converts a CPDAG to a DAG.
- Parameters:
cpdag_pd (pandas Dataframe) – the matrix of the CPDAG
verbose (bool)
n_lags (int) – the maximum number of previous time lags, in case of time-lagged graphs
- Returns:
dag_pd (pandas Dataframe) – the matrix of the DAG
ETIA.CRV.causal_graph_utils.create_sub_mag_pag module
- create_sub_mag_pag(dag_pd, selected_vars, n_lags=None)[source]
Given a DAG and a set of latent variables, we marginalize out the latent variables and create the corresponding MAG and PAG. For time-lagged causal DAGs, we enforce the stationarity assumption.
- Parameters:
dag_pd (pandas Dataframe)
selected_vars (list)
n_lags (None or int)
- Returns:
mag_noL_pd (pandas Dataframe) – the matrix of the MAG (the latent variables are removed)
pag_noL_pd (pandas Dataframe) – the matrix of the PAG (the latent variables are removed)
ETIA.CRV.causal_graph_utils.dag_to_cpdag module
- dag_to_cpdag(dag_pd, verbose, n_lags=None)[source]
Converts a DAG to a CPDAG.
- Parameters:
dag_pd (pandas Dataframe) – the matrix of the DAG
verbose (bool)
n_lags (int or None) – the maximum number of previous time lags; if int, dag_pd must be a time-lagged graph
- Returns:
cpdag_pd (pandas Dataframe) – the matrix of the CPDAG
ETIA.CRV.causal_graph_utils.dag_to_mag_removeL module
- dag_to_mag_removeL(dag_pd, is_latent)[source]
Converts a DAG into a MAG after marginalizing out latent variables.
Author: kbiza@csd.uoc.gr, based on MATLAB code by striant@csd.uoc.gr
- Parameters:
dag_pd (pandas Dataframe) – the DAG matrix
dag(i, j) = 2 and dag(j, i) = 3: i-->j
is_latent (numpy vector) – True if the variable will be marginalized out
- Returns:
mag_pd (pandas Dataframe) – the MAG matrix
mag(i, j) = 2 and mag(j, i) = 3: i-->j
mag(i, j) = 2 and mag(j, i) = 2: i<->j
mag(i, j) = 2 and mag(j, i) = 1: io->j
mag_removeL_pd (pandas Dataframe) – the MAG matrix after dropping the rows and columns that correspond to the latent variables
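The mark encoding used by these matrices can be read as "the entry at (i, j) is the mark at the j end of the edge": 1 is a circle, 2 an arrowhead, 3 a tail. The following sketch decodes a pair of entries into the arrow notation used on this page. The helper `describe_edge` is hypothetical, and treating 0 as "no edge" is an assumption, since the docstrings do not state it.

```python
def describe_edge(matrix, i, j):
    """Render the edge between i and j in arrow notation, or None if absent."""
    mark_at_i, mark_at_j = matrix[j][i], matrix[i][j]
    if mark_at_i == 0 and mark_at_j == 0:   # assumption: 0 means no edge
        return None
    left = {1: "o", 2: "<", 3: "-"}[mark_at_i]
    right = {1: "o", 2: ">", 3: "-"}[mark_at_j]
    return f"{i}{left}-{right}{j}"


# Example graph: 0-->1 and 0<->2
edges = [
    [0, 2, 2],
    [3, 0, 0],
    [2, 0, 0],
]
print(describe_edge(edges, 0, 1))  # 0-->1
print(describe_edge(edges, 0, 2))  # 0<->2
```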
ETIA.CRV.causal_graph_utils.data_functions module
- get_data_type(data_pd)[source]
Returns the type of each variable in the dataset (continuous or categorical).
- Parameters:
data_pd (pandas dataframe) – the dataset
- Returns:
data_type_info (pandas dataframe) – a dataframe with two columns, where each row corresponds to a variable:
'var_type' for the type of the variable
'n_domain' for the number of categories in case of a categorical variable
- apply_ordinal_encoding(data_pd, data_type_info)[source]
Applies ordinal encoding with sklearn.
- Parameters:
data_pd (pandas dataframe) – the dataset
data_type_info (pandas dataframe) – information for the type of each variable (continuous or categorical)
- Returns:
data_pd (pandas dataframe) – the transformed dataset
- timeseries_to_timelagged(data_pd, n_lags, window=False)[source]
Converts time-series data to time-lagged data.
- Parameters:
data_pd (pandas dataframe) – time-series dataset, e.g. with columns V1, V2
n_lags (int) – number of previous lags
window (bool) – True for non-overlapped windows
- Returns:
data_pd_tl (pandas dataframe) – time-lagged dataset, e.g. with columns V1, V2, V1:1, V2:1
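The conversion can be illustrated with a small dependency-free sketch. It is not the library's implementation: the function `to_time_lagged` is hypothetical, a dictionary of lists stands in for the pandas dataframe, and both the row alignment and the meaning of `window` (taking every (n_lags+1)-th row so that windows do not overlap) are assumptions.

```python
def to_time_lagged(columns, n_lags, window=False):
    """columns: dict mapping a variable name to its values ordered in time."""
    n_rows = len(next(iter(columns.values())))
    # Assumption: non-overlapped windows advance by n_lags + 1 time steps
    step = n_lags + 1 if window else 1
    lagged = {}
    for lag in range(n_lags + 1):
        for name, values in columns.items():
            new_name = name if lag == 0 else f"{name}:{lag}"
            # Row t holds the value observed `lag` steps before time t
            lagged[new_name] = [values[t - lag] for t in range(n_lags, n_rows, step)]
    return lagged


series = {"V1": [1, 2, 3, 4], "V2": [10, 20, 30, 40]}
print(to_time_lagged(series, n_lags=1))
# {'V1': [2, 3, 4], 'V2': [20, 30, 40], 'V1:1': [1, 2, 3], 'V2:1': [10, 20, 30]}
```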
- timelagged_to_timeseries(data_pd, n_lags)[source]
Converts time-lagged data to time-series data.
- Parameters:
data_pd (pandas dataframe) – time-lagged dataset, e.g. with columns V1, V2, V1:1, V2:1
n_lags (int) – number of previous lags
- Returns:
ts_data (pandas dataframe) – time-series dataset, e.g. with columns V1, V2
- transform_data(data_pd, data_type_info, transform_type)[source]
Data transformation with sklearn.
- Parameters:
data_pd (pandas dataframe) – the dataset
data_type_info (pandas dataframe)
transform_type (str) – one of {qgaussian, log, minmax, standardize}
- Returns:
transformed_data (pandas dataframe)
- names_from_lag(varnames_lag)[source]
- Parameters:
varnames_lag (list) – the variable names with information about time-lag, e.g. ['V1', 'V2', 'V1:1', 'V2:1']
- Returns:
varnames (list) – the variable names without lag info, e.g. ['V1', 'V2']
- lagnames_from_names(varnames, n_lags)[source]
- Parameters:
varnames (list) – the variable names without lag info, e.g. ['V1', 'V2']
n_lags (int) – the maximum number of previous time lags
- Returns:
varnames_lag (list) – the variable names with information about time-lag, e.g. ['V1', 'V2', 'V1:1', 'V2:1']
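The naming convention in the examples above (a `:k` suffix for lag k, no suffix for lag 0) suggests a simple round trip between the two name lists. The sketch below illustrates that convention only. The helpers `add_lags` and `strip_lags` are hypothetical names, and the ordering of the output (all variables at lag 0, then lag 1, and so on) is inferred from the documented examples.

```python
def add_lags(varnames, n_lags):
    """['V1', 'V2'], 1 -> ['V1', 'V2', 'V1:1', 'V2:1']"""
    lagged = list(varnames)
    for lag in range(1, n_lags + 1):
        lagged.extend(f"{name}:{lag}" for name in varnames)
    return lagged


def strip_lags(varnames_lag):
    """['V1', 'V2', 'V1:1', 'V2:1'] -> ['V1', 'V2'] (first-seen order kept)."""
    names = []
    for name in varnames_lag:
        base = name.split(":")[0]
        if base not in names:
            names.append(base)
    return names


print(add_lags(["V1", "V2"], 1))                  # ['V1', 'V2', 'V1:1', 'V2:1']
print(strip_lags(["V1", "V2", "V1:1", "V2:1"]))   # ['V1', 'V2']
```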
ETIA.CRV.causal_graph_utils.enforce_stationarity module
- enforce_stationarity_arrowheads(G, graph_pd, n_lags, verbose)[source]
Adds arrowheads on edges that point into later time lags, e.g. X_t-1 *--> X_t.
- Parameters:
G (numpy array) – the matrix of the time-lagged graph to change
graph_pd (pandas Dataframe) – the original matrix of the time-lagged graph
n_lags (int) – the maximum number of previous lags
verbose (bool)
- Returns:
G (numpy array) – the changed matrix of the time-lagged graph
- enforce_stationarity_tails_and_orientation(G, graph_pd, n_lags, verbose)[source]
Adds tails on the edges that start from the oldest time lag,
e.g. for n_lags=2, if X2_t-1 ---> X2_t and X2_t-2 o--> X2_t-1, we set X2_t-2 ---> X2_t-1.
It also enforces stationarity inside each time lag regarding the orientation of existing edges.
- Parameters:
G (numpy array) – the matrix of the graph
graph_pd (pandas Dataframe) (called mag_pd in the docstring)
n_lags (int) – the maximum number of previous lags
verbose (bool)
- Returns:
G (numpy array) – the matrix of the graph
- enforce_stationarity_add_edge(G, mag_pd, n_lags, verbose)[source]
Enforces the stationarity assumption on the time-lagged graph:
If A_t --> B_t then A_t-1 --> B_t-1 (adds an edge between nodes in the same time lag)
If A_t-1 --> B_t then A_t-2 --> B_t-1 (adds an edge between nodes across time lags)
- Parameters:
G (numpy array)
mag_pd (pandas Dataframe)
n_lags (int)
verbose (bool)
- Returns:
G (numpy array) – the matrix of the graph
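The two rules above amount to copying every edge one or more lags into the past whenever both shifted endpoints still exist. A minimal sketch follows, under stated assumptions: the function `add_stationary_edges` is hypothetical, nodes are ordered lag by lag (all variables at lag 0, then lag 1, and so on, matching names such as V1, V2, V1:1, V2:1), 0 means "no edge", and existing entries are never overwritten.

```python
def add_stationary_edges(G, n_lags):
    """Copy each edge to every older pair of time lags that fits in the graph."""
    n_nodes = len(G)
    n_vars = n_nodes // (n_lags + 1)
    out = [row[:] for row in G]
    for a in range(n_nodes):
        for b in range(n_nodes):
            if G[a][b] == 0:                # assumption: 0 means no edge
                continue
            lag_a, var_a = divmod(a, n_vars)
            lag_b, var_b = divmod(b, n_vars)
            shift = 1
            # Shift both endpoints into the past by the same number of lags
            while max(lag_a, lag_b) + shift <= n_lags:
                a2 = (lag_a + shift) * n_vars + var_a
                b2 = (lag_b + shift) * n_vars + var_b
                if out[a2][b2] == 0:
                    out[a2][b2] = G[a][b]
                shift += 1
    return out


# Nodes: A_t=0, B_t=1, A_t-1=2, B_t-1=3; only A_t --> B_t is present
G = [
    [0, 2, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(add_stationary_edges(G, n_lags=1)[2])  # [0, 0, 0, 2], i.e. A_t-1 --> B_t-1
```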
ETIA.CRV.causal_graph_utils.find_ancestors_nx module
- find_ancestors_nx(graph, node=None)[source]
A is an ancestor of B if graph(i,j)=2 and graph(j,i)=3 for every edge i-->j in the path from A to B.
Author: kbiza@csd.uoc.gr
- Parameters:
graph – matrix of the causal graph
graph(i, j) = 2 and graph(j, i) = 3: i-->j
graph(i, j) = 2 and graph(j, i) = 2: i<->j
graph(i, j) = 2 and graph(j, i) = 1: io->j
node (int) – the node whose ancestors are sought; if None, the ancestors of all nodes are returned
- Returns:
is_ancestor – if a node is given, a list with the indexes of its ancestors; if no node is given, a logical matrix (numpy array) with the ancestors of all nodes
Note: the node under study is not in the set of its ancestors
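The ancestor relation defined above is the transitive closure of the directed edges. The module name suggests that the library relies on networkx. The sketch below swaps in a plain Warshall closure on nested lists so that it runs without dependencies, and the names `ancestor_matrix` and `ancestors_of` are hypothetical.

```python
def ancestor_matrix(graph):
    """is_anc[a][b] is True when a directed path a --> ... --> b exists."""
    n = len(graph)
    # Start from direct parents: a-->b is encoded as graph[a][b]=2, graph[b][a]=3
    is_anc = [[graph[a][b] == 2 and graph[b][a] == 3 for b in range(n)]
              for a in range(n)]
    # Warshall transitive closure
    for k in range(n):
        for a in range(n):
            if is_anc[a][k]:
                for b in range(n):
                    if is_anc[k][b]:
                        is_anc[a][b] = True
    return is_anc


def ancestors_of(graph, node):
    """Indexes of the ancestors of node; the node itself is excluded."""
    is_anc = ancestor_matrix(graph)
    return [a for a in range(len(graph)) if a != node and is_anc[a][node]]


# Example graph: 0-->1-->2 and 3<->2
chain = [
    [0, 2, 0, 0],
    [3, 0, 2, 0],
    [0, 3, 0, 2],
    [0, 0, 2, 0],
]
print(ancestors_of(chain, 2))  # [0, 1]
```

The bidirected edge 3<->2 does not make node 3 an ancestor of node 2, because only i-->j edges count.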
ETIA.CRV.causal_graph_utils.get_unshielded_triples module
- get_unshielded_triples(G)[source]
Finds the unshielded triples of each node in a graph G.
Author: kbiza@csd.uoc.gr, based on MATLAB code by striant@csd.uoc.gr
- Parameters:
G (numpy array) – the matrix of the graph
- Returns:
unshielded_triples (dictionary of lists) – each key corresponds to a node and each value contains two lists with the matrix coordinates (x,y) of the unshielded triples
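An unshielded triple x - y - z has y adjacent to both x and z while x and z are not adjacent to each other. The following sketch enumerates them per middle node. It is illustrative only: the function name is hypothetical, 0 is assumed to mean "no edge", and the return value is simplified to a list of (x, z) pairs per node rather than the two coordinate lists described above.

```python
def unshielded_triples_sketch(G):
    """Map each node y to the pairs (x, z) forming an unshielded triple x - y - z."""
    n = len(G)
    triples = {}
    for y in range(n):
        neighbours = [x for x in range(n) if x != y and G[x][y] != 0]
        triples[y] = [
            (x, z)
            for idx, x in enumerate(neighbours)
            for z in neighbours[idx + 1:]
            if G[x][z] == 0 and G[z][x] == 0    # x and z are not adjacent
        ]
    return triples


# Example graph: 0-->1<--2 and 1-->3
collider = [
    [0, 2, 0, 0],
    [3, 0, 3, 2],
    [0, 2, 0, 0],
    [0, 3, 0, 0],
]
print(unshielded_triples_sketch(collider)[1])  # [(0, 2), (0, 3), (2, 3)]
```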
ETIA.CRV.causal_graph_utils.has_inducing_path_dag module
- has_inducing_path_dag(X, Y, dag, is_ancestor, is_latent, verbose=False)[source]
Checks whether nodes X and Y are connected in the DAG by an inducing path with respect to a set of latent variables L.
A path p is inducing relative to a set of nodes L if (Borboudakis et al. 2012):
- every non-endpoint vertex on p is either in L or a collider, AND
- every collider on p is an ancestor of an end-point vertex of the path
Author: kbiza@csd.uoc.gr, based on MATLAB code by striant@csd.uoc.gr
- Parameters:
X (int) – the node X
Y (int) – the node Y
dag (numpy array) – the matrix of the DAG, where dag(i, j) = 2 and dag(j, i) = 3 means i-->j
is_ancestor (numpy array) – boolean array; is_ancestor(i,j)=True if i is an ancestor of j in the DAG
is_latent (numpy vector) – boolean vector; is_latent[i]=True if i is a latent variable
verbose (bool) – print if True
- Returns:
has_ind_path (bool) – True if X and Y are connected in the DAG with an inducing path
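The two conditions in the definition can be checked directly for one candidate path. The sketch below does only that. It does not search for a path as the library function does, the name `is_inducing_path` is hypothetical, and it assumes that consecutive nodes in `path` are adjacent in the DAG.

```python
def is_inducing_path(path, dag, is_ancestor, is_latent):
    """Check the inducing-path conditions for a given list of nodes."""
    endpoints = (path[0], path[-1])
    for prev, v, nxt in zip(path, path[1:], path[2:]):
        # v is a collider when both neighbouring edges point into it
        collider = dag[prev][v] == 2 and dag[nxt][v] == 2
        if not collider and not is_latent[v]:
            return False
        if collider and not any(is_ancestor[v][e] for e in endpoints):
            return False
    return True


# Example graph: 0<--1-->2, with node 1 latent
fork = [
    [0, 3, 0],
    [2, 0, 2],
    [0, 3, 0],
]
fork_anc = [
    [False, False, False],
    [True, False, True],
    [False, False, False],
]
print(is_inducing_path([0, 1, 2], fork, fork_anc, [False, True, False]))   # True
print(is_inducing_path([0, 1, 2], fork, fork_anc, [False, False, False]))  # False
```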
ETIA.CRV.causal_graph_utils.is_collider module
ETIA.CRV.causal_graph_utils.is_dag module
ETIA.CRV.causal_graph_utils.mag_to_pag module
- FCI_rules_mag(G, mag, verbose)[source]
Applies the FCI rules on the given graph.
- Parameters:
G (numpy matrix) – the matrix of the graph
mag (numpy matrix) – the matrix of the MAG
verbose (bool)
- Returns:
G (numpy matrix) – the matrix of the graph
dnc (dictionary)
flagcount (int)
- mag_to_pag(mag_pd, verbose, n_lags=None)[source]
Converts a MAG to a PAG.
- Parameters:
mag_pd (pandas Dataframe) – the matrix of the MAG
verbose (bool)
n_lags (int) – the maximum number of previous time lags, in case of time-lagged graphs
- Returns:
pag_pd (pandas Dataframe) – the matrix of the PAG
ETIA.CRV.causal_graph_utils.markov_boundary module
- markov_boundary(target, matrix)[source]
Identifies the Markov boundary of the target node. Works for DAGs and MAGs.
Author: kbiza@csd.uoc.gr
- Parameters:
target (int) – index of the target node in the matrix (a single integer, not a list of integers)
matrix – an array of size N*N, where N is the number of nodes in tetrad_graph
matrix(i, j) = 2 and matrix(j, i) = 3: i-->j in DAGs and MAGs
matrix(i, j) = 2 and matrix(j, i) = 2: i<->j in MAGs
- Returns:
markov_boundary (list) – list of indexes for the Markov boundary of the target
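For a DAG, the Markov boundary of a node consists of its parents, its children, and the other parents of its children. The sketch below covers that DAG case only. It does not handle the bidirected edges of a MAG, which the library function accepts, and the name `markov_boundary_dag` is hypothetical.

```python
def markov_boundary_dag(target, matrix):
    """Parents, children and spouses of target in a DAG (DAG-only sketch)."""
    n = len(matrix)

    def is_parent(a, b):
        # a-->b is encoded as matrix[a][b] = 2 and matrix[b][a] = 3
        return matrix[a][b] == 2 and matrix[b][a] == 3

    parents = {a for a in range(n) if is_parent(a, target)}
    children = {c for c in range(n) if is_parent(target, c)}
    spouses = {s for c in children for s in range(n) if is_parent(s, c)}
    return sorted((parents | children | spouses) - {target})


# Example graph: 0-->2, 2-->3, 1-->3, 3-->4
dag5 = [
    [0, 0, 2, 0, 0],
    [0, 0, 0, 2, 0],
    [3, 0, 0, 2, 0],
    [0, 3, 3, 0, 2],
    [0, 0, 0, 3, 0],
]
print(markov_boundary_dag(2, dag5))  # [0, 1, 3]
```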
ETIA.CRV.causal_graph_utils.one_bidirected_path module
- one_bidirected_path_from_to(matrix, start, end, path_=[])[source]
Recursive function that searches for at least one bidirected path between the 'start' node and the 'end' node.
Author: kbiza@csd.uoc.gr
- Parameters:
matrix (numpy array) – matrix of size N*N, where N is the number of nodes in tetrad_graph
matrix(i, j) = 2 and matrix(j, i) = 3: i-->j
matrix(i, j) = 1 and matrix(j, i) = 1: io-oj
matrix(i, j) = 2 and matrix(j, i) = 2: i<->j
matrix(i, j) = 3 and matrix(j, i) = 3: i---j
matrix(i, j) = 2 and matrix(j, i) = 1: io->j
start (int) – the first node in the path
end (int) – the last node in the path
path_ (list) – only needed for the recursive call (the path under search)
- Returns:
path (list) – a list of nodes we visit from start node to end node in a bidirected path
ETIA.CRV.causal_graph_utils.one_directed_path module
- one_directed_path(matrix, start, end, path_=[])[source]
Recursive function that searches for at least one directed path from the 'start' node to the 'end' node.
Author: kbiza@csd.uoc.gr
- Parameters:
matrix (numpy array) – matrix of size N*N, where N is the number of nodes in tetrad_graph
matrix(i, j) = 2 and matrix(j, i) = 3: i-->j
matrix(i, j) = 1 and matrix(j, i) = 1: io-oj
matrix(i, j) = 2 and matrix(j, i) = 2: i<->j
matrix(i, j) = 3 and matrix(j, i) = 3: i---j
matrix(i, j) = 2 and matrix(j, i) = 1: io->j
start (int) – the first node in the path
end (int) – the last node in the path
path_ (list) – the path under search through the recursive calls
- Returns:
path (list) – a list of nodes we visit from start node to end node in a directed path
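A depth-first search of this kind can be sketched as follows. This is an illustration rather than the library's code: the name `find_directed_path` is hypothetical, returning an empty list when no path exists is an assumption, and `path=None` is used in place of a mutable default argument so that state never leaks between calls.

```python
def find_directed_path(matrix, start, end, path=None):
    """Return one directed path start --> ... --> end, or [] if none exists."""
    path = (path or []) + [start]
    if start == end:
        return path
    for nxt in range(len(matrix)):
        # Follow only start-->nxt edges and never revisit a node
        if nxt not in path and matrix[start][nxt] == 2 and matrix[nxt][start] == 3:
            found = find_directed_path(matrix, nxt, end, path)
            if found:
                return found
    return []


# Example graph: 0-->1-->2, 0<->3, 3-->2
mixed = [
    [0, 2, 0, 2],
    [3, 0, 2, 0],
    [0, 3, 0, 3],
    [2, 0, 2, 0],
]
print(find_directed_path(mixed, 0, 2))  # [0, 1, 2]
print(find_directed_path(mixed, 0, 3))  # [] because 0<->3 is not a directed edge
```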
ETIA.CRV.causal_graph_utils.one_path_anytype module
- one_path_anytype(matrix, start, end, path_=[])[source]
Recursive function that searches for at least one path of any type from the 'start' node to the 'end' node.
Author: kbiza@csd.uoc.gr
- Parameters:
matrix (numpy array) – matrix of size N*N, where N is the number of nodes in tetrad_graph
matrix(i, j) = 2 and matrix(j, i) = 3: i-->j
matrix(i, j) = 1 and matrix(j, i) = 1: io-oj
matrix(i, j) = 2 and matrix(j, i) = 2: i<->j
matrix(i, j) = 3 and matrix(j, i) = 3: i---j
matrix(i, j) = 2 and matrix(j, i) = 1: io->j
start (int) – the first node in the path
end (int) – the last node in the path
path_ (list) – the path under search through the recursive calls
- Returns:
path (list) – a list of nodes we visit from start node to end node in a path
ETIA.CRV.causal_graph_utils.one_potentially_directed_path module
- one_potentially_directed_path(matrix, start, end, path_=[])[source]
Recursive function that searches for at least one potentially directed path from the 'start' node to the 'end' node.
Author: kbiza@csd.uoc.gr
- Parameters:
matrix – matrix of size N*N, where N is the number of nodes in tetrad_graph
matrix(i, j) = 2 and matrix(j, i) = 3: i-->j
matrix(i, j) = 1 and matrix(j, i) = 1: io-oj
matrix(i, j) = 2 and matrix(j, i) = 2: i<->j
matrix(i, j) = 3 and matrix(j, i) = 3: i---j
matrix(i, j) = 2 and matrix(j, i) = 1: io->j
start (int) – the first node in the path
end (int) – the last node in the path
path_ (list) – the path under search through the recursive calls
- Returns:
path (list) – a list of nodes that appear in one potentially directed path from start node to end node; the path does not necessarily have the minimum length
Definition (Zhang PhD, 2007, page 108): for every 0<=i<=n-1, the edge between Vi and Vi+1 is not into Vi nor out of Vi+1. Intuitively, it is a path that could be oriented into a directed path by changing the circles on the path into appropriate tails or arrowheads.
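Zhang's condition translates into two mark tests per edge. With the encoding above, matrix(a, b) is the mark at the b end of the edge, so "into Vi" means an arrowhead (2) at Vi and "out of Vi+1" means a tail (3) at Vi+1. The sketch below checks a given path only and does not perform the search. The name `is_potentially_directed` is hypothetical and 0 is assumed to mean "no edge".

```python
def is_potentially_directed(matrix, path):
    """Check Zhang's condition on every consecutive pair (Vi, Vi+1) of path."""
    for a, b in zip(path, path[1:]):
        if matrix[a][b] == 0:   # assumption: 0 means the nodes are not adjacent
            return False
        if matrix[b][a] == 2:   # arrowhead at Vi: the edge is into Vi
            return False
        if matrix[a][b] == 3:   # tail at Vi+1: the edge is out of Vi+1
            return False
    return True


# Example graph: 0o-o1, 1o->2, 2-->3
pag = [
    [0, 1, 0, 0],
    [1, 0, 2, 0],
    [0, 1, 0, 2],
    [0, 0, 3, 0],
]
print(is_potentially_directed(pag, [0, 1, 2, 3]))  # True
print(is_potentially_directed(pag, [3, 2]))        # False
```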
ETIA.CRV.causal_graph_utils.orientation_rules module
- R4(pag, mag, flag, verbose)[source]
Start from some node X, for node Y.
Visit all possible nodes X*->V & V->Y.
For every neighbour that is bi-directed and a parent of Y, continue.
For every neighbour that is bi-directed and o-*Y, orient and, if parent, continue.
Total: n*n*(n+m)
For each node Y, find all orientable neighbours W.
For each node X, non-adjacent to Y, see if there is a path to some node in W.
Create graph as follows:
for X,Y: edges X*->V & V->Y --> X->V
edges A<->B & A->Y --> A->B
edges A<-*W & A->Y --> A->W
discriminating: if path from X to W
ETIA.CRV.causal_graph_utils.orientation_rules_cpdag module
ETIA.CRV.causal_graph_utils.pag_to_mag module
- pag_to_mag(pag_pd, verbose, n_lags=None)[source]
Converts a PAG to a MAG.
- Parameters:
pag_pd (pandas Dataframe) – the matrix of the PAG
verbose (bool)
n_lags (int) – the maximum number of previous time lags, in case of time-lagged graphs
- Returns:
mag_pd (pandas Dataframe) – the matrix of the MAG