gssl.graph package
Submodules
gssl.graph.gssl_affmat module
gssl_affmat.py
Module that handles the construction of affinity matrices.
-
class gssl.graph.gssl_affmat.AffMatGenerator(dist_func, mask_func, metric='euclidean', load_path=None, num_anchors=None, **arg)
Bases: object
Constructs a dense affinity matrix from some specification.
-
W_from_K(X, K)
-
__init__(dist_func, mask_func, metric='euclidean', load_path=None, num_anchors=None, **arg)
Constructs the Affinity Matrix Generator.
- Parameters
X (NDArray[float].shape[N,D]) – A matrix containing the vertex positions.
dist_func (str) – Specifies the distance function to be used. Supported values: {
gaussian: np.exp(-(d*d)/(2*sigma*sigma)), where d is the distance. Requires sigma in **kwargs.
LNP: Linear Neighborhood Propagation. Requires k in **kwargs.
NLNP: Normalized reciprocal of Linear Neighborhood Propagation. Requires k in **kwargs.
constant: Every weight is set to 1.
inv_norm: 1/d, where d is the distance.
}
mask_func (str) –
specifies the function used to determine the neighborhood. Supported Values: {
}
load_path (str) – Path from where to load CSR Knn matrix with precalculated distances.
metric (str) – specifies the metric when computing the distance. Default is euclidean. See the documentation of scipy.spatial.distance.cdist for more details.
**arg – Remaining arguments.
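The gaussian, constant and inv_norm formulas listed above can be illustrated with a small NumPy sketch. It is not the package's implementation: the helper names `pairwise_euclidean` and `weights_from_distances`, and the boolean `mask` argument, are hypothetical and only show how distances inside a neighborhood might be turned into weights.

```python
import numpy as np

def pairwise_euclidean(X):
    """Dense [N, N] matrix of Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def weights_from_distances(d, mask, dist_func="gaussian", sigma=1.0):
    """Hypothetical helper: map distances to weights on masked entries only."""
    if dist_func == "gaussian":
        w = np.exp(-(d * d) / (2 * sigma * sigma))
    elif dist_func == "inv_norm":
        # 1/d, leaving zero-distance entries (e.g. the diagonal) at 0.
        w = np.zeros_like(d)
        np.divide(1.0, d, out=w, where=d > 0)
    elif dist_func == "constant":
        w = np.ones_like(d)
    else:
        raise ValueError(f"unsupported dist_func: {dist_func}")
    # Entries outside the neighborhood mask get weight 0.
    return np.where(mask, w, 0.0)

X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
d = pairwise_euclidean(X)
mask = (d > 0) & (d < 6.0)  # assumed neighborhood: distinct points closer than 6
W = weights_from_distances(d, mask, "gaussian", sigma=5.0)
```

In this toy input the first and last points are 10 apart, so they fall outside the assumed neighborhood and receive weight 0.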
-
generateAffMat(X, Y=None, labeledIndexes=None, hook=None)
Generates the Affinity Matrix.
- Returns
A dense affinity matrix.
- Return type
NDArray[float].shape[N,N]
-
get_or_calc_Mask(X)
Gets the previously computed mask for the affinity matrix, or computes it.
-
handle_adaptive_sigma(K)
-
-
gssl.graph.gssl_affmat.LNP(X, K, symm=True)
Computes the edge weights through Linear Neighborhood Propagation.
- Parameters
X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
K (NDArray[float].shape[N,N]) – Dense Affinity mask, whose positive entries correspond to neighbors.
- Returns
A dense affinity matrix whose weights minimize the linear reconstruction of each instance.
- Return type
NDArray[float].shape[N,N]
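The idea of weights that minimize the linear reconstruction of each instance from its neighbors can be sketched as follows. This is a simplified stand-in, not the package's LNP routine: it enforces only the sum-to-one constraint (as in locally linear embedding), so weights may come out negative, and it does no symmetrization. The name `lle_style_weights` and the `reg` parameter are assumptions made for illustration.

```python
import numpy as np

def lle_style_weights(X, K, reg=1e-3):
    """Reconstruction weights over the neighbors marked in the mask K.

    Simplification: only the sum-to-one constraint is enforced, so this is
    not a full non-negative LNP solver.
    """
    N = X.shape[0]
    W = np.zeros((N, N))
    for i in range(N):
        nbrs = np.flatnonzero(K[i] > 0)
        if nbrs.size == 0:
            continue  # isolated vertex: leave its row at zero
        Z = X[nbrs] - X[i]   # neighbors centered on x_i
        G = Z @ Z.T          # local Gram matrix
        # Regularize the diagonal so that the linear system is solvable.
        G = G + np.eye(nbrs.size) * (reg * np.trace(G) + 1e-12)
        w = np.linalg.solve(G, np.ones(nbrs.size))
        W[i, nbrs] = w / w.sum()  # rescale so the weights sum to 1
    return W

# Three collinear points: the middle one is the average of its two neighbors.
X = np.array([[0.0], [1.0], [2.0]])
K = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
W = lle_style_weights(X, K)
```

For the middle point, both neighbors end up with equal weight, since it lies exactly halfway between them.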
-
gssl.graph.gssl_affmat.NLNP(X, K, symm=True)
Computes the normalized reciprocals of the edge weights through Linear Neighborhood Propagation.
- Parameters
X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
K (NDArray[float].shape[N,N]) – Dense Affinity mask, whose positive entries correspond to neighbors.
- Returns
A dense affinity matrix whose weights are the normalized reciprocals of the ones given by Linear Neighborhood Propagation.
- Return type
NDArray[float].shape[N,N]
-
gssl.graph.gssl_affmat.epsilonMask(X, eps, metric='euclidean')
Calculates the distances only in the epsilon-neighborhood.
- Parameters
X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
eps (float) – A parameter such that K[i,j] = 0 if dist(X_i,X_j) >= eps.
- Returns
A dense matrix K of shape [N,N] whose nonzero [i,j] entries correspond to distances between neighbors X[i,:] and X[j,:].
- Return type
NDArray[int].shape[N,N]
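A minimal sketch of the epsilon-neighborhood rule described above, where K[i,j] = 0 whenever dist(X_i, X_j) >= eps. It assumes the Euclidean metric and uses a hypothetical function name; the package's epsilonMask may differ in details such as the metric handling and the returned dtype.

```python
import numpy as np

def epsilon_mask_sketch(X, eps):
    """Keep pairwise Euclidean distances below eps; set all others to 0."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Entries with d >= eps are zeroed; the diagonal is 0 because d[i, i] = 0.
    return np.where(d < eps, d, 0.0)

X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
K = epsilon_mask_sketch(X, eps=2.0)
```

Only the first two points are within distance 2 of each other, so K has a single symmetric pair of nonzero entries.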
-
gssl.graph.gssl_affmat.knnMask(X, k, mode='sym', metric='euclidean')
Calculates the distances only in the knn-neighborhood.
- Parameters
X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
k (int) – A parameter such that K[i,j] = 1 iff X_i is one of the k-nearest neighbors of X_j.
mode (str) – Type of KNN. Supported values: {
mut: K[i,j] = min(K[i,j],K[j,i]).
sym: K[i,j] = max(K[i,j],K[j,i]).
none: No symmetrization. WARNING: Many GSSL algorithms depend on a symmetric affinity matrix.
}
- Returns
A dense matrix K of shape [N,N] whose nonzero [i,j] entries correspond to distances between neighbors X[i,:] and X[j,:].
- Return type
NDArray[int].shape[N,N]
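The three modes can be compared with a short NumPy sketch. It is an illustration under assumptions, not the package's knnMask: the function name is hypothetical, the metric is fixed to Euclidean, and the entry K[i,j] is taken here to mean that X_j is among the k nearest neighbors of X_i, which may be the transpose of the package's convention.

```python
import numpy as np

def knn_mask_sketch(X, k, mode="sym"):
    """Dense kNN mask whose nonzero entries hold neighbor distances."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    N = X.shape[0]
    d_no_self = d.copy()
    np.fill_diagonal(d_no_self, np.inf)  # a point is not its own neighbor
    nn = np.argsort(d_no_self, axis=1)[:, :k]
    rows = np.repeat(np.arange(N), k)
    cols = nn.ravel()
    K = np.zeros((N, N))
    K[rows, cols] = d[rows, cols]
    if mode == "sym":
        K = np.maximum(K, K.T)  # keep an edge if either side selected it
    elif mode == "mut":
        K = np.minimum(K, K.T)  # keep an edge only if both sides selected it
    elif mode != "none":
        raise ValueError(f"unsupported mode: {mode}")
    return K

X = np.array([[0.0], [1.0], [3.0], [7.0]])
K_none = knn_mask_sketch(X, k=1, mode="none")
K_sym = knn_mask_sketch(X, k=1, mode="sym")
K_mut = knn_mask_sketch(X, k=1, mode="mut")
```

With k=1 on this input, only the first two points select each other, so the mut mask keeps a single edge while the sym mask keeps every selected edge in both directions.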
-
gssl.graph.gssl_affmat.sort_coo(m)
gssl.graph.gssl_utils module
gssl_utils.py
Module containing utilities for GSSL algorithms.
-
gssl.graph.gssl_utils.accuracy(Y_pred, Y_true)
Calculates percentage of correct predictions.
-
gssl.graph.gssl_utils.accuracy_unlabeled(Y_pred, Y_true, labeled_indexes)
Calculates percentage of correct predictions on unlabeled indexes only.
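The difference between the two accuracy measures can be sketched as below. The function names are hypothetical, and the sketch returns a fraction in [0, 1]; whether the package scales the result to 0-100 is not stated here.

```python
import numpy as np

def accuracy_sketch(Y_pred, Y_true):
    """Fraction of positions where the prediction matches the true label."""
    return float(np.mean(np.asarray(Y_pred) == np.asarray(Y_true)))

def accuracy_unlabeled_sketch(Y_pred, Y_true, labeled_indexes):
    """Same measure, restricted to instances that were not labeled."""
    unlabeled = ~np.asarray(labeled_indexes, dtype=bool)
    return accuracy_sketch(np.asarray(Y_pred)[unlabeled],
                           np.asarray(Y_true)[unlabeled])

Y_true = np.array([0, 1, 1, 0])
Y_pred = np.array([0, 1, 0, 1])
labeled = np.array([True, True, False, False])

overall = accuracy_sketch(Y_pred, Y_true)
unlabeled_only = accuracy_unlabeled_sketch(Y_pred, Y_true, labeled)
```

Here the two correct predictions are exactly the labeled instances, so the overall score is inflated relative to the score on unlabeled instances.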
-
gssl.graph.gssl_utils.calc_Z(Y, labeledIndexes, D, estimatedFreq=None, weigh_by_degree=False)
Calculates matrix Z used for GTAM/LDST label propagation.
- Parameters
Y (NDArray[int].shape[N,C]) – Confidence matrix.
labeledIndexes (NDArray[bool].shape[N]) – Determines which indices are to be considered as labeled.
D (NDArray[float].shape[N]) – Array of diagonal entries of the degree matrix.
estimatedFreq (NDArray[float].shape[C]) – Optional. The estimated class frequencies. If absent, it is assumed all classes occur equally often.
reciprocal (bool) – If True, use the reciprocal of the degree instead. Default is False.
- Returns
Matrix Z, which normalizes Y by class frequencies and degree.
- Return type
NDArray[int].shape[N,C]
-
gssl.graph.gssl_utils.class_mass_normalization(F, Y, labeledIndexes, normalize_rows=True)
-
gssl.graph.gssl_utils.deg_matrix(W, pwr=1, flat=False, NA_replace_val=1.0)
Returns a diagonal matrix with the row-wise sums of a matrix W.
-
gssl.graph.gssl_utils.extract_lap_eigvec(L, m, D=None, remove_first_eig=True)
Extract m eigenvectors and eigenvalues of the Laplacian, in non-decreasing order.
- Parameters
L (NDArray[float].shape[N,N]) – Laplacian matrix.
m (int) – Number of eigenvectors to extract.
D (NDArray[float].shape[N,N]) – Extra matrix for the generalized eigenvalue problem.
- Returns
matrix of eigenvectors, and diagonal matrix of eigenvalues
- Return type
Pair[NDArray[float].shape[M,N],NDArray[float].shape[M,M]]
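Extracting the m smallest eigenpairs of a symmetric Laplacian can be sketched with `numpy.linalg.eigh`, which returns eigenvalues in ascending order. This is an assumption-laden illustration rather than the package's code: the function name is hypothetical, the generalized problem with D is not handled, and eigenvectors are returned as columns (shape [N, m]), which may differ from the layout documented above.

```python
import numpy as np

def smallest_eigpairs_sketch(L, m, remove_first_eig=True):
    """Return (eigenvectors as columns, diagonal matrix of eigenvalues)."""
    vals, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    # Optionally skip the first eigenpair (eigenvalue 0 for a connected graph).
    start = 1 if remove_first_eig else 0
    vals = vals[start:start + m]
    vecs = vecs[:, start:start + m]
    return vecs, np.diag(vals)

# Combinatorial Laplacian (degree minus adjacency) of a 3-node path graph.
L = np.array([[1.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
V, E = smallest_eigpairs_sketch(L, m=2)
```

For this path graph the Laplacian eigenvalues are 0, 1 and 3, so dropping the first leaves 1 and 3 on the diagonal of E.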
-
gssl.graph.gssl_utils.get_Isomap(X, n_neighbors=5)
-
gssl.graph.gssl_utils.get_PCA(X)
-
gssl.graph.gssl_utils.get_Standardized(X)
-
gssl.graph.gssl_utils.get_pred(Y)
Calculates predictions from a belief matrix.
- Parameters
Y (NDArray[float].shape[N,C]) – belief matrix.
- Returns
prediction vector, each entry numbered from 0 to C-1.
- Return type
NDArray[int].shape[N]
-
gssl.graph.gssl_utils.init_matrix(Y, labeledIndexes)
Creates a matrix containing the confidence for each class.
- Parameters
Y (NDArray[int].shape[N]) – Array of true labels.
labeledIndexes (NDArray[bool].shape[N]) – determines which indices are to be considered as labeled.
- Returns
A matrix init such that init[i,j] has the confidence that the i-th instance has the label corresponding to the j-th class.
- Return type
NDArray[float].shape[N,C]
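One common way to build such a confidence matrix is one-hot encoding of the labeled rows, with unlabeled rows left at zero. The sketch below assumes that convention and assumes labels numbered 0 to C-1; the function name is hypothetical and the package's init_matrix may initialize unlabeled rows differently. Taking the row-wise argmax, as in the prediction step described for belief matrices, recovers the original labels on the labeled rows.

```python
import numpy as np

def init_matrix_sketch(Y, labeledIndexes):
    """One-hot confidence rows for labeled instances, zero rows otherwise."""
    Y = np.asarray(Y)
    idx = np.flatnonzero(np.asarray(labeledIndexes, dtype=bool))
    C = int(Y.max()) + 1  # assumes labels are numbered 0..C-1
    init = np.zeros((Y.shape[0], C))
    init[idx, Y[idx]] = 1.0
    return init

Y = np.array([0, 2, 1, 2])
labeled = np.array([True, False, True, False])
init = init_matrix_sketch(Y, labeled)

# Row-wise argmax turns confidences back into class indices.
pred_on_labeled = np.argmax(init[labeled], axis=1)
```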
-
gssl.graph.gssl_utils.init_matrix_argmax(Y)
Returns the argmax of each row of a matrix.
-
gssl.graph.gssl_utils.labels_indicator(labeledIndexes)
Returns a diagonal matrix J indicating whether each instance is labeled or not.
- Parameters
labeledIndexes (NDArray[bool].shape[N])
-
gssl.graph.gssl_utils.lap_matrix(W, is_normalized)
Returns the graph Laplacian of some matrix W.
- Parameters
W (NDArray[float].shape[N,N]) – The given matrix.
is_normalized (bool) – If True, returns \(L = I - D^{-1/2} W D^{-1/2}\). Otherwise, returns \(L = D^{-1}W\).
- Returns
The normalized or unnormalized graph Laplacian
- Return type
NDArray[float].shape[N,N]
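The normalized case, \(L = I - D^{-1/2} W D^{-1/2}\), can be sketched in NumPy as below. The function name is hypothetical, only the normalized formula is covered, and treating zero-degree vertices by setting their scaling factor to 0 is an assumption of this sketch rather than documented behavior.

```python
import numpy as np

def normalized_laplacian_sketch(W):
    """Compute I - D^{-1/2} W D^{-1/2} for a dense affinity matrix W."""
    d = W.sum(axis=1)  # vertex degrees (row sums)
    d_inv_sqrt = np.zeros_like(d, dtype=float)
    # 1/sqrt(d) where the degree is positive; isolated vertices stay at 0.
    np.divide(1.0, np.sqrt(d), out=d_inv_sqrt, where=d > 0)
    # Scale row i by d_i^{-1/2} and column j by d_j^{-1/2}.
    scaled = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return np.eye(W.shape[0]) - scaled

# Affinity matrix of a 3-node path graph with unit weights.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L = normalized_laplacian_sketch(W)
```

For a symmetric W the result is symmetric with a unit diagonal, and its smallest eigenvalue is 0.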
-
gssl.graph.gssl_utils.scipy_to_np(X)
-
gssl.graph.gssl_utils.split_indices(Y, split_p=0.5, seed=None)
Returns a percentage p of indices, using stratification.
- Parameters
Y (NDArray.shape[N]) – the vector from which to split with stratification w.r.t. each number that appears.
split_p (float) – the percentage of stratified indexes to return
seed (float) – Optional. Used to reproduce results.
- Returns
vector with split_p of indexes after stratified sampling.
- Return type
NDArray.shape[N*split_p]
- Raises
ValueError – if split_p is an invalid percentage.
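Stratified index sampling as described above can be sketched by drawing the same fraction from each distinct label. The function name, the use of `numpy.random.default_rng` with an integer seed, and the rounding of per-class counts are assumptions of this sketch, not the package's documented behavior.

```python
import numpy as np

def split_indices_sketch(Y, split_p=0.5, seed=None):
    """Sample a fraction split_p of the indices separately for each label."""
    if not 0.0 <= split_p <= 1.0:
        raise ValueError("split_p must be between 0 and 1")
    Y = np.asarray(Y)
    rng = np.random.default_rng(seed)
    chosen = []
    for label in np.unique(Y):
        idx = np.flatnonzero(Y == label)
        n_take = int(round(split_p * idx.size))  # per-class sample size
        chosen.append(rng.choice(idx, size=n_take, replace=False))
    return np.sort(np.concatenate(chosen))

Y = np.array([0, 0, 0, 0, 1, 1])
idx = split_indices_sketch(Y, split_p=0.5, seed=0)
```

With four instances of class 0 and two of class 1, half of each class is drawn, so the class proportions of the sample match those of Y.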