gssl.graph package¶
Submodules¶
gssl.graph.gssl_affmat module¶
gssl_affmat.py¶
Module that handles the construction of affinity matrices.
-
class gssl.graph.gssl_affmat.AffMatGenerator(dist_func, mask_func, metric='euclidean', load_path=None, num_anchors=None, **arg)¶
Bases: object
Constructs a dense affinity matrix from some specification.
-
W_from_K(X, K)¶
-
__init__(dist_func, mask_func, metric='euclidean', load_path=None, num_anchors=None, **arg)¶ Constructs the Affinity Matrix Generator.
- Parameters
X (NDArray[float].shape[N,D]) – A matrix containing the vertex positions.
dist_func (str) –
specifies the distance function to be used. Supported values: {
gaussian: np.exp(-(d*d)/(2*sigma*sigma)), where d is the distance. Requires sigma in **kwargs.
LNP: Linear Neighborhood Propagation. Requires k in **kwargs.
NLNP: Normalized reciprocal of Linear Neighborhood Propagation. Requires k in **kwargs.
constant: Every weight is set to 1.
inv_norm: 1/d, where d is the distance.
}
mask_func (str) –
specifies the function used to determine the neighborhood. Supported Values: {
}
load_path (str) – Path from which to load a CSR kNN matrix with precalculated distances.
metric (str) – specifies the metric when computing the distance. Default is euclidean. See the documentation of scipy.spatial.distance.cdist for more details.
**arg – Remaining arguments.
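As an illustration of the gaussian option described above, the following is a minimal standalone sketch in NumPy. It does not call the package; the helper name gaussian_weights and the toy distance matrix are assumptions made for the example.

```python
import numpy as np

def gaussian_weights(dist, sigma):
    """Gaussian kernel from the dist_func description: exp(-d^2 / (2 sigma^2))."""
    return np.exp(-(dist * dist) / (2.0 * sigma * sigma))

# Pairwise distances between three hypothetical 1-D points at 0, 1 and 3.
dist = np.array([[0.0, 1.0, 3.0],
                 [1.0, 0.0, 2.0],
                 [3.0, 2.0, 0.0]])

# Larger distances map to smaller weights; zero distance maps to weight 1.
W = gaussian_weights(dist, sigma=1.0)
```

A smaller sigma makes the weights decay faster with distance, so fewer pairs keep a non-negligible affinity.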
-
generateAffMat(X, Y=None, labeledIndexes=None, hook=None)¶ Generates the Affinity Matrix.
- Returns
A dense affinity matrix.
- Return type
NDArray[float].shape[N,N]
-
get_or_calc_Mask(X)¶ Gets the previously computed mask for affinity matrix, or computes it.
-
handle_adaptive_sigma(K)¶
-
-
gssl.graph.gssl_affmat.LNP(X, K, symm=True)¶ Computes the edge weights through Linear Neighborhood Propagation.
- Parameters
X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
K (NDArray[float].shape[N,N]) – Dense Affinity mask, whose positive entries correspond to neighbors.
- Returns
A dense affinity matrix whose weights minimize the linear reconstruction of each instance.
- Return type
NDArray[float].shape[N,N]
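To make "weights that minimize the linear reconstruction of each instance" concrete, here is a rough sketch for a single instance. It uses the closed-form solution of the sum-to-one constrained least-squares problem, with a small ridge term; this is a simplification chosen for the example and is not necessarily how LNP() is implemented. In particular, the published LNP formulation also requires non-negative weights, which this closed form does not enforce. The name reconstruction_weights and the parameter reg are hypothetical.

```python
import numpy as np

def reconstruction_weights(x, neighbors, reg=1e-3):
    """Weights summing to 1 that approximately minimize
    ||x - sum_j w[j] * neighbors[j]||^2 (ridge-regularized closed form)."""
    diff = x[None, :] - neighbors                   # shape [k, D]
    gram = diff @ diff.T                            # local Gram matrix, shape [k, k]
    # The ridge term keeps the system solvable when the Gram matrix is singular.
    gram = gram + reg * np.eye(len(neighbors))
    w = np.linalg.solve(gram, np.ones(len(neighbors)))
    return w / w.sum()                              # enforce the sum-to-one constraint

# A point lying exactly midway between its two neighbors.
x = np.array([0.0, 0.0])
neighbors = np.array([[-1.0, 0.0], [1.0, 0.0]])
w = reconstruction_weights(x, neighbors)
```

For the midpoint case the two neighbors receive equal weight, and the weighted combination of the neighbors reproduces x.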
-
gssl.graph.gssl_affmat.NLNP(X, K, symm=True)¶ Computes the normalized reciprocals of the edge weights through Linear Neighborhood Propagation.
- Parameters
X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
K (NDArray[float].shape[N,N]) – Dense Affinity mask, whose positive entries correspond to neighbors.
- Returns
A dense affinity matrix whose weights are the normalized reciprocals of the ones given by Linear Neighborhood Propagation.
- Return type
NDArray[float].shape[N,N]
-
gssl.graph.gssl_affmat.epsilonMask(X, eps, metric='euclidean')¶ Calculates the distances only in the epsilon-neighborhood.
- Parameters
X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
eps (float) – A parameter such that K[i,j] = 0 if dist(X_i,X_j) >= eps.
- Returns
A dense matrix K of shape [N,N] whose nonzero [i,j] entries correspond to distances between neighbors X[i,:] and X[j,:].
- Return type
NDArray[int].shape[N,N]
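The rule K[i,j] = 0 if dist(X_i,X_j) >= eps can be sketched as follows. This is a standalone illustration restricted to the Euclidean metric, not the package's code; epsilon_mask_sketch is a hypothetical name.

```python
import numpy as np

def epsilon_mask_sketch(X, eps):
    """Keep Euclidean distances strictly below eps; zero out everything else."""
    diff = X[:, None, :] - X[None, :, :]        # pairwise differences, shape [N, N, D]
    dist = np.sqrt((diff ** 2).sum(axis=-1))    # pairwise distances, shape [N, N]
    return np.where(dist < eps, dist, 0.0)

# Three hypothetical 1-D points; only the first two are within eps of each other.
X = np.array([[0.0], [1.0], [3.0]])
K = epsilon_mask_sketch(X, eps=1.5)
```

Note that the diagonal is zero as well, because each point is at distance 0 from itself.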
-
gssl.graph.gssl_affmat.knnMask(X, k, mode='sym', metric='euclidean')¶ Calculates the distances only in the knn-neighborhood.
- Parameters
X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
k (int) – A parameter such that K[i,j] = 1 iff X_i is one of the k-nearest neighbors of X_j.
mode (str) –
type of KNN. Supported values: {
mut: K[i,j] = min(K[i,j],K[j,i]).
sym: K[i,j] = max(K[i,j],K[j,i]).
none: No symmetrization. WARNING: Many GSSL algorithms depend on a symmetric affinity matrix.
}
- Returns
A dense matrix K of shape [N,N] whose nonzero [i,j] entries correspond to distances between neighbors X[i,:] and X[j,:].
- Return type
NDArray[int].shape[N,N]
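The three symmetrization modes can be illustrated with a small standalone sketch. It builds a 0/1 indicator mask under the Euclidean metric rather than storing distances, and it is not the package's implementation; knn_mask_sketch is a hypothetical name.

```python
import numpy as np

def knn_mask_sketch(X, k, mode="sym"):
    """0/1 k-nearest-neighbor mask with optional symmetrization."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)               # a point is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]    # k nearest indices for each row
    K = np.zeros(dist.shape, dtype=int)
    rows = np.repeat(np.arange(len(X)), k)
    K[rows, nearest.ravel()] = 1                 # here K[i, j] = 1 if j is a neighbor of i
    if mode == "sym":
        return np.maximum(K, K.T)                # edge if either point picks the other
    if mode == "mut":
        return np.minimum(K, K.T)                # edge only if both pick each other
    if mode == "none":
        return K
    raise ValueError(f"unknown mode: {mode}")

X = np.array([[0.0], [1.0], [3.0]])
K_sym = knn_mask_sketch(X, k=1, mode="sym")
K_mut = knn_mask_sketch(X, k=1, mode="mut")
```

With k=1, the point at 3 picks the point at 1 as its neighbor but not the reverse, so that edge survives under sym and is dropped under mut.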
-
gssl.graph.gssl_affmat.sort_coo(m)¶
gssl.graph.gssl_utils module¶
gssl_utils.py¶
Module containing utilities for GSSL algorithms.
-
gssl.graph.gssl_utils.accuracy(Y_pred, Y_true)¶ Calculates percentage of correct predictions.
-
gssl.graph.gssl_utils.accuracy_unlabeled(Y_pred, Y_true, labeled_indexes)¶ Calculates percentage of correct predictions on unlabeled indexes only.
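A minimal sketch of the idea, assuming labeled_indexes is a boolean mask of length N (as labeledIndexes is elsewhere in this module) and that predictions and labels are 1-D integer arrays. The sketch returns a fraction in [0, 1]; whether the library scales the result to 0-100 is not stated here. The function name is hypothetical.

```python
import numpy as np

def accuracy_unlabeled_sketch(Y_pred, Y_true, labeled_indexes):
    """Fraction of correct predictions among the instances that are not labeled."""
    unlabeled = np.logical_not(labeled_indexes)
    return float(np.mean(Y_pred[unlabeled] == Y_true[unlabeled]))

Y_true = np.array([0, 1, 1, 0])
Y_pred = np.array([0, 1, 0, 0])
labeled = np.array([True, False, False, False])

# Only the last three instances are scored; two of them are predicted correctly.
acc = accuracy_unlabeled_sketch(Y_pred, Y_true, labeled)
```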
-
gssl.graph.gssl_utils.calc_Z(Y, labeledIndexes, D, estimatedFreq=None, weigh_by_degree=False)¶ Calculates matrix Z used for GTAM/LDST label propagation.
- Parameters
Y (NDArray[int].shape[N,C]) – confidence matrix
labeledIndexes (NDArray[bool].shape[N]) – determines which indices are to be considered as labeled.
D (NDArray[float].shape[N]) – array of diagonal entries of degree matrix.
estimatedFreq (NDArray[float].shape[C]) – Optional. The estimated class frequencies. If absent, it is assumed all classes occur equally often.
reciprocal (bool) – If True, use reciprocal of the degree instead. Default is False.
- Returns
Matrix Z, which normalizes Y by class frequencies and degree.
- Return type
NDArray[int].shape[N,C]
-
gssl.graph.gssl_utils.class_mass_normalization(F, Y, labeledIndexes, normalize_rows=True)¶
-
gssl.graph.gssl_utils.deg_matrix(W, pwr=1, flat=False, NA_replace_val=1.0)¶ Returns a diagonal matrix with the row-wise sums of a matrix W.
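The row-sum construction can be sketched as below. The reading of pwr as an elementwise power of the degrees and of flat as "return the 1-D degree vector instead of a diagonal matrix" are assumptions based on the parameter names; NA_replace_val is left out because its behavior is not described here. The function name is hypothetical.

```python
import numpy as np

def deg_matrix_sketch(W, pwr=1, flat=False):
    """Row-wise sums of W raised to pwr, as a diagonal matrix or a flat vector."""
    deg = W.sum(axis=1) ** pwr
    return deg if flat else np.diag(deg)

# Star graph on three vertices: vertex 0 is connected to vertices 1 and 2.
W = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
D = deg_matrix_sketch(W)
d_inv = deg_matrix_sketch(W, pwr=-1, flat=True)
```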
-
gssl.graph.gssl_utils.extract_lap_eigvec(L, m, D=None, remove_first_eig=True)¶ Extract m eigenvectors and eigenvalues of the Laplacian, in non-decreasing order.
- Parameters
L (NDArray[float].shape[N,N]) – Laplacian matrix
m (int) – number of eigenvectors to extract
D (NDArray[float].shape[N,N]) – extra matrix for generalized eigenvalue problem
- Returns
matrix of eigenvectors, and diagonal matrix of eigenvalues
- Return type
Pair[NDArray[float].shape[M,N],NDArray[float].shape[M,M]]
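A rough sketch of the extraction for a symmetric Laplacian, using numpy.linalg.eigh, which returns eigenvalues in ascending order. It ignores the generalized problem (the D argument) and stores eigenvectors as columns, so the shapes may be transposed relative to the documented return type. The function name is hypothetical.

```python
import numpy as np

def smallest_eigpairs(L, m, remove_first_eig=True):
    """First m eigenpairs of a symmetric matrix, in non-decreasing eigenvalue order."""
    eigvals, eigvecs = np.linalg.eigh(L)         # eigenvalues come back ascending
    start = 1 if remove_first_eig else 0         # optionally skip the smallest pair
    stop = start + m
    # Eigenvectors are the columns of the first result, shape [N, m].
    return eigvecs[:, start:stop], np.diag(eigvals[start:stop])

# Symmetric Laplacian of a path graph on three vertices; its eigenvalues are 0, 1, 3.
L = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])
vecs, vals = smallest_eigpairs(L, m=2)
```

Each returned column v satisfies L v = lambda v for the matching diagonal entry of the eigenvalue matrix.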
-
gssl.graph.gssl_utils.get_Isomap(X, n_neighbors=5)¶
-
gssl.graph.gssl_utils.get_PCA(X)¶
-
gssl.graph.gssl_utils.get_Standardized(X)¶
-
gssl.graph.gssl_utils.get_pred(Y)¶ Calculates predictions from a belief matrix.
- Parameters
Y (NDArray[float].shape[N,C]) – belief matrix.
- Returns
prediction vector, each entry numbered from 0 to C-1.
- Return type
NDArray[int].shape[N]
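Assuming the prediction for each instance is the class with the largest belief, the computation reduces to a row-wise argmax. The belief values below are made up for illustration.

```python
import numpy as np

# Hypothetical belief matrix for N=3 instances and C=2 classes.
Y = np.array([[0.1, 0.9],
              [0.7, 0.3],
              [0.5, 0.5]])

# Row-wise argmax; np.argmax resolves ties in favor of the lowest class index.
pred = np.argmax(Y, axis=1)
```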
-
gssl.graph.gssl_utils.init_matrix(Y, labeledIndexes)¶ Creates a matrix containing the confidence for each class.
- Parameters
Y (NDArray[int].shape[N]) – array of true labels.
labeledIndexes (NDArray[bool].shape[N]) – determines which indices are to be considered as labeled.
- Returns
A matrix init such that init[i,j] has the confidence that the i-th instance has the label corresponding to the j-th class.
- Return type
NDArray[float].shape[N,C]
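One plausible reading of this description is a one-hot encoding of the labeled rows, with unlabeled rows left at zero. The sketch below follows that reading and assumes labels are integers in 0..C-1; both points are assumptions, and the function name is hypothetical.

```python
import numpy as np

def init_matrix_sketch(Y, labeled_indexes):
    """One-hot confidence rows for labeled instances, all-zero rows otherwise."""
    num_classes = int(Y.max()) + 1
    init = np.zeros((len(Y), num_classes))
    rows = np.flatnonzero(labeled_indexes)       # positions of the labeled instances
    init[rows, Y[rows]] = 1.0
    return init

Y = np.array([0, 2, 1, 2])
labeled = np.array([True, True, False, False])
init = init_matrix_sketch(Y, labeled)
```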
-
gssl.graph.gssl_utils.init_matrix_argmax(Y)¶ Returns the argmax of each row of a matrix.
-
gssl.graph.gssl_utils.labels_indicator(labeledIndexes)¶ Returns a diagonal matrix J indicating whether each instance is labeled or not.
- Parameters
labeledIndexes (NDArray[bool].shape[N]) – determines which indices are to be considered as labeled.
-
gssl.graph.gssl_utils.lap_matrix(W, is_normalized)¶ Returns the graph Laplacian of some matrix W.
- Parameters
W (NDArray[float].shape[N,N]) – The given matrix.
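is_normalized (bool) – If True, returns the normalized Laplacian \(L = I - D^{-1/2} W D^{-1/2}\). Otherwise, returns the unnormalized Laplacian \(L = D - W\), where \(D\) is the diagonal degree matrix of \(W\).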
- Returns
The normalized or unnormalized graph Laplacian
- Return type
NDArray[float].shape[N,N]
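The normalized form \(L = I - D^{-1/2} W D^{-1/2}\) can be checked numerically with a short standalone sketch. It assumes W is symmetric with no isolated vertices (every row sum is positive); normalized_laplacian is a hypothetical name, not the package's function.

```python
import numpy as np

def normalized_laplacian(W):
    """I - D^{-1/2} W D^{-1/2}, where D holds the row sums of W."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Scaling rows and columns by d_inv_sqrt equals multiplying by D^{-1/2} on both sides.
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return np.eye(W.shape[0]) - S

# Star graph on three vertices: vertex 0 is connected to vertices 1 and 2.
W = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
L = normalized_laplacian(W)
eigvals = np.linalg.eigvalsh(L)
```

For a graph without self-loops the diagonal of L is all ones, and the eigenvalues lie in the interval [0, 2].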
-
gssl.graph.gssl_utils.scipy_to_np(X)¶
-
gssl.graph.gssl_utils.split_indices(Y, split_p=0.5, seed=None)¶ Returns a percentage p of indices, using stratification.
- Parameters
Y (NDArray.shape[N]) – the vector from which to split with stratification w.r.t. each number that appears.
split_p (float) – the percentage of stratified indexes to return
seed (float) – Optional. Used to reproduce results.
- Returns
vector with split_p of indexes after stratified sampling.
- Return type
NDArray.shape[N*split_p]
- Raises
ValueError – if split_p is an invalid percentage.
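Stratified sampling of indices can be sketched as follows: sample the same fraction from each distinct value of Y. The rounding rule, the sorted output, and the accepted range [0, 1] for split_p are choices made for this example and may differ from the library's behavior; the function name is hypothetical.

```python
import numpy as np

def split_indices_sketch(Y, split_p=0.5, seed=None):
    """Pick roughly split_p of the indices of each distinct value in Y."""
    if not 0.0 <= split_p <= 1.0:
        raise ValueError("split_p must lie in [0, 1]")
    rng = np.random.default_rng(seed)            # seeded generator for reproducibility
    chosen = []
    for label in np.unique(Y):
        idx = np.flatnonzero(Y == label)         # all positions holding this label
        n_take = int(round(split_p * len(idx)))
        chosen.append(rng.choice(idx, size=n_take, replace=False))
    return np.sort(np.concatenate(chosen))

Y = np.array([0, 0, 0, 0, 1, 1])
picked = split_indices_sketch(Y, split_p=0.5, seed=0)
```

With four instances of class 0 and two of class 1, half of each class is selected, so the class proportions of Y are preserved in the sample.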