gssl.graph package

Submodules

gssl.graph.gssl_affmat module

gssl_affmat.py

Module that handles the construction of affinity matrices.

class gssl.graph.gssl_affmat.AffMatGenerator(dist_func, mask_func, metric='euclidean', load_path=None, num_anchors=None, **arg)

Bases: object

Constructs a dense affinity matrix from some specification.

W_from_K(X, K)

__init__(dist_func, mask_func, metric='euclidean', load_path=None, num_anchors=None, **arg)

Constructs the Affinity Matrix Generator.

Parameters

X (NDArray[float].shape[N,D]) – A matrix containing the vertex positions.
dist_func (str) –
specifies the distance function to be used. Supported values: {
- gaussian: np.exp(-(d*d)/(2*sigma*sigma)), where d is the distance. Requires sigma in **arg.
- LNP: Linear Neighborhood Propagation. Requires k in **arg.
- NLNP: Normalized reciprocal of Linear Neighborhood Propagation. Requires k in **arg.
- constant: Every weight is set to 1.
- inv_norm: 1/d, where d is the distance.
}
mask_func (str) –
specifies the function used to determine the neighborhood. Supported values: {
- epsilon: Epsilon-neighborhood. Requires eps in **arg.
- knn: K-nearest neighbors. Requires k in **arg.
- load: Loads the CSR matrix specified by load_path.
}
load_path (str) – Path from which to load a CSR kNN matrix with precalculated distances.
metric (str) – specifies the metric used when computing distances. Default is euclidean. See the documentation of scipy.spatial.distance.cdist for more details.
**arg – Remaining keyword arguments, such as sigma, k, or eps, as required by dist_func and mask_func.
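The gaussian option above can be illustrated with a minimal NumPy sketch. The helper name gaussian_weights is hypothetical and not part of gssl; the sketch assumes the Euclidean metric and zeroes the diagonal, which the library may or may not do.

```python
import numpy as np

def gaussian_weights(X, sigma):
    """Hypothetical helper: dense Gaussian affinities from Euclidean distances."""
    # Pairwise squared Euclidean distances via broadcasting, shape (N, N).
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff * diff, axis=-1)
    # Same formula as in the parameter list: exp(-(d*d) / (2*sigma*sigma)).
    W = np.exp(-sq_dist / (2.0 * sigma * sigma))
    np.fill_diagonal(W, 0.0)  # assumption: no self-loops
    return W

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
W = gaussian_weights(X, sigma=1.0)
```

A smaller sigma makes the weights decay faster with distance, so the resulting graph behaves more like a sparse neighborhood graph.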

generateAffMat(X, Y=None, labeledIndexes=None, hook=None)

Generates the Affinity Matrix.

Returns: A dense affinity matrix.
Return type: NDArray[float].shape[N,N]

get_or_calc_Mask(X): Gets the previously computed mask for the affinity matrix, or computes it.

handle_adaptive_sigma(K)

gssl.graph.gssl_affmat.LNP(X, K, symm=True)

Computes the edge weights through Linear Neighborhood Propagation.

Parameters

X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
K (NDArray[float].shape[N,N]) – Dense Affinity mask, whose positive entries correspond to neighbors.

Returns

A dense affinity matrix whose weights minimize the linear reconstruction of each instance.

Return type

NDArray[float].shape[N,N]
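As a rough illustration of linear reconstruction weights, the sketch below uses the closed-form solution known from locally linear embedding: each row is constrained only to sum to one, so weights can be negative. This is a swapped-in simplification and not the library's LNP routine, which may enforce non-negativity and apply the symm option. The helper name reconstruction_weights and the reg parameter are hypothetical.

```python
import numpy as np

def reconstruction_weights(X, K, reg=1e-3):
    """Hypothetical sketch: sum-to-one reconstruction weights over the neighbors marked in K."""
    N = X.shape[0]
    W = np.zeros((N, N))
    for i in range(N):
        nbrs = np.flatnonzero(K[i] > 0)
        nbrs = nbrs[nbrs != i]              # never reconstruct a point from itself
        if nbrs.size == 0:
            continue
        Z = X[nbrs] - X[i]                  # neighbors centered on x_i
        G = Z @ Z.T                         # local Gram matrix
        # Regularize so that G is invertible even with collinear neighbors.
        G = G + reg * (np.trace(G) + 1e-12) * np.eye(nbrs.size)
        w = np.linalg.solve(G, np.ones(nbrs.size))
        W[i, nbrs] = w / w.sum()            # enforce the sum-to-one constraint
    return W

X = np.array([[0.0], [1.0], [2.0]])
K = np.ones((3, 3)) - np.eye(3)             # every other point is a neighbor
W = reconstruction_weights(X, K)
```

In this example the middle point sits halfway between its two neighbors, so each of them receives a weight of 0.5.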

gssl.graph.gssl_affmat.NLNP(X, K, symm=True)

Computes the normalized reciprocals of the edge weights through Linear Neighborhood Propagation.

Parameters

X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
K (NDArray[float].shape[N,N]) – Dense Affinity mask, whose positive entries correspond to neighbors.

Returns

A dense affinity matrix whose weights are the normalized reciprocals of the ones given by Linear Neighborhood Propagation.

Return type

NDArray[float].shape[N,N]

gssl.graph.gssl_affmat.epsilonMask(X, eps, metric='euclidean')

Calculates the distances only in the epsilon-neighborhood.

Parameters

X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
eps (float) – A parameter such that K[i,j] = 0 if dist(X_i,X_j) >= eps.

Returns

A dense matrix K of shape [N,N] whose nonzero [i,j] entries correspond to distances between neighbors X[i,:] and X[j,:].

Return type

NDArray[int].shape[N,N]
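A minimal sketch of an epsilon-neighborhood mask follows, assuming the Euclidean metric. The helper name epsilon_mask is hypothetical. It follows the rule stated for eps: entries at distance eps or more are set to zero.

```python
import numpy as np

def epsilon_mask(X, eps):
    """Hypothetical helper: keep pairwise distances strictly below eps, zero elsewhere."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt(np.sum(diff * diff, axis=-1))
    # K[i, j] = 0 whenever dist(X_i, X_j) >= eps; the diagonal is 0 because dist(X_i, X_i) = 0.
    return np.where(dist < eps, dist, 0.0)

X = np.array([[0.0], [1.0], [3.0]])
K = epsilon_mask(X, eps=1.5)
```

Here only the first two points lie within eps of each other, so the third point ends up with no neighbors.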

gssl.graph.gssl_affmat.knnMask(X, k, mode='sym', metric='euclidean')

Calculates the distances only in the knn-neighborhood.

Parameters

X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.
k (int) – A parameter such that K[i,j] = 1 iff X_i is one of the k-nearest neighbors of X_j.
mode (str) –
type of KNN. Supported values: {
- mut: K[i,j] = min(K[i,j],K[j,i]).
- sym: K[i,j] = max(K[i,j],K[j,i]).
- none: No symmetrization. WARNING: Many GSSL algorithms depend on a symmetric affinity matrix.
}

Returns

A dense matrix K of shape [N,N] whose nonzero [i,j] entries correspond to distances between neighbors X[i,:] and X[j,:].

Return type

NDArray[int].shape[N,N]
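The three symmetrization modes can be sketched as follows. The helper name knn_mask is hypothetical and it assumes the Euclidean metric. It stores distances on the selected edges and then applies the max or min rule from the mode description.

```python
import numpy as np

def knn_mask(X, k, mode="sym"):
    """Hypothetical helper: distances to the k nearest neighbors, optionally symmetrized."""
    N = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt(np.sum(diff * diff, axis=-1))
    d = dist.copy()
    np.fill_diagonal(d, np.inf)                # a point is never its own neighbor
    nbrs = np.argsort(d, axis=1)[:, :k]        # indices of the k nearest points per row
    rows = np.repeat(np.arange(N), k)
    cols = nbrs.ravel()
    K = np.zeros_like(dist)
    K[rows, cols] = dist[rows, cols]
    if mode == "sym":
        return np.maximum(K, K.T)              # edge kept if either endpoint selects it
    if mode == "mut":
        return np.minimum(K, K.T)              # edge kept only if both endpoints select it
    if mode == "none":
        return K
    raise ValueError(f"unknown mode: {mode}")

X = np.array([[0.0], [1.0], [3.0]])
K_sym = knn_mask(X, k=1, mode="sym")
K_mut = knn_mask(X, k=1, mode="mut")
```

With k=1, the third point selects the second as its neighbor but not the other way around, so that edge survives in sym mode and disappears in mut mode.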

gssl.graph.gssl_affmat.sort_coo(m)

gssl.graph.gssl_utils module

gssl_utils.py

Module containing utilities for GSSL algorithms.

gssl.graph.gssl_utils.accuracy(Y_pred, Y_true): Calculates the percentage of correct predictions.

gssl.graph.gssl_utils.accuracy_unlabeled(Y_pred, Y_true, labeled_indexes): Calculates the percentage of correct predictions on unlabeled indexes only.

gssl.graph.gssl_utils.calc_Z(Y, labeledIndexes, D, estimatedFreq=None, weigh_by_degree=False)

Calculates matrix Z used for GTAM/LDST label propagation.

Parameters

Y (NDArray[int].shape[N,C]) – confidence matrix.
labeledIndexes (NDArray[bool].shape[N]) – determines which indices are to be considered as labeled.
D (NDArray[float].shape[N]) – array of diagonal entries of the degree matrix.
estimatedFreq (NDArray[float].shape[C]) – Optional. The estimated class frequencies. If absent, it is assumed all classes occur equally often.
reciprocal (bool) – If True, use reciprocal of the degree instead. Default is False.

Returns

Matrix Z, which normalizes Y by class frequencies and degree.

Return type

NDArray[int].shape[N,C]

gssl.graph.gssl_utils.class_mass_normalization(F, Y, labeledIndexes, normalize_rows=True)

gssl.graph.gssl_utils.deg_matrix(W, pwr=1, flat=False, NA_replace_val=1.0): Returns a diagonal matrix with the row-wise sums of a matrix W.

gssl.graph.gssl_utils.extract_lap_eigvec(L, m, D=None, remove_first_eig=True)

Extracts m eigenvectors and eigenvalues of the Laplacian, in non-decreasing order.

Parameters

L (NDArray[float].shape[N,N]) – Laplacian matrix.
m (int) – number of eigenvectors to extract.
D (NDArray[float].shape[N,N]) – extra matrix for the generalized eigenvalue problem.

Returns

matrix of eigenvectors, and diagonal matrix of eigenvalues

Return type

Pair[NDArray[float].shape[M,N], NDArray[float].shape[M,M]]
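For a symmetric Laplacian, the smallest eigenpairs can be obtained with np.linalg.eigh, which returns eigenvalues in ascending order. The sketch below covers the standard (non-generalized) problem only and ignores the D and remove_first_eig options. The helper name smallest_eigpairs is hypothetical, and it returns eigenvectors as columns, which may differ from the library's layout.

```python
import numpy as np

def smallest_eigpairs(L, m):
    """Hypothetical helper: the m smallest eigenpairs of a symmetric matrix L."""
    vals, vecs = np.linalg.eigh(L)           # ascending eigenvalues, eigenvectors as columns
    return vecs[:, :m], np.diag(vals[:m])

# Unnormalized Laplacian D - W of a 3-vertex path graph.
L = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])
U, S = smallest_eigpairs(L, m=2)
```

The returned pair satisfies L @ U = U @ S up to floating-point error.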

gssl.graph.gssl_utils.get_Isomap(X, n_neighbors=5)

gssl.graph.gssl_utils.get_PCA(X)

gssl.graph.gssl_utils.get_Standardized(X)

gssl.graph.gssl_utils.get_pred(Y)

Calculates predictions from a belief matrix.

Parameters: Y (NDArray[float].shape[N,C]) – belief matrix.
Returns: prediction vector, each entry numbered from 0 to C-1.
Return type: NDArray[int].shape[N]
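Turning a belief matrix into predictions amounts to a row-wise argmax. This is a sketch of the idea, not necessarily how get_pred handles ties or all-zero rows.

```python
import numpy as np

# Belief matrix for N=3 instances and C=2 classes.
Y = np.array([[0.2, 0.8],
              [0.9, 0.1],
              [0.5, 0.5]])
pred = np.argmax(Y, axis=1)   # np.argmax resolves ties in favor of the lowest class index
```

Here pred is [1, 0, 0]: the tied third row resolves to class 0.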

gssl.graph.gssl_utils.init_matrix(Y, labeledIndexes)

Creates a matrix containing the confidence for each class.

Parameters

Y (NDArray[int].shape[N]) – array of true labels.
labeledIndexes (NDArray[bool].shape[N]) – determines which indices are to be considered as labeled.

Returns

A matrix init such that init[i,j] has the confidence that the i-th instance has the label corresponding to the j-th class.

Return type

NDArray[float].shape[N,C]
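One way to build such a confidence matrix is one-hot encoding restricted to the labeled rows. The sketch assumes labels are integers from 0 to C-1 and that unlabeled rows are left at zero; the library may treat unlabeled rows differently. The helper name init_matrix_sketch is hypothetical.

```python
import numpy as np

def init_matrix_sketch(Y, labeledIndexes):
    """Hypothetical helper: one-hot rows for labeled instances, zero rows otherwise."""
    Y = np.asarray(Y)
    idx = np.flatnonzero(np.asarray(labeledIndexes, dtype=bool))
    C = int(Y.max()) + 1                      # assumption: classes are numbered 0..C-1
    init = np.zeros((Y.shape[0], C))
    init[idx, Y[idx]] = 1.0                   # full confidence in the known label
    return init

Y = np.array([0, 1, 1, 0])
labeledIndexes = np.array([True, False, True, False])
init = init_matrix_sketch(Y, labeledIndexes)
```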

gssl.graph.gssl_utils.init_matrix_argmax(Y): Returns the argmax of each row of a matrix.

gssl.graph.gssl_utils.labels_indicator(labeledIndexes): Returns a diagonal matrix J indicating whether each instance is labeled or not.

Parameters: labeledIndexes (NDArray[bool].shape[N]) – determines which indices are to be considered as labeled.

gssl.graph.gssl_utils.lap_matrix(W, is_normalized)

Returns the graph Laplacian of some matrix W.

Parameters

W (NDArray[float].shape[N,N]) – The given matrix.
is_normalized (bool) – If True, returns the normalized Laplacian \(L = I - D^{-1/2} W D^{-1/2}\). Otherwise, returns the unnormalized Laplacian \(L = D - W\).

Returns

The normalized or unnormalized graph Laplacian

Return type

NDArray[float].shape[N,N]
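The two Laplacians can be sketched in NumPy as below. The sketch takes the unnormalized Laplacian to be the standard \(L = D - W\), which is an assumption about the library's behavior, and maps zero-degree vertices to a zero scaling factor. The helper name laplacian_sketch is hypothetical.

```python
import numpy as np

def laplacian_sketch(W, is_normalized):
    """Hypothetical helper: D - W, or I - D^{-1/2} W D^{-1/2} when is_normalized is True."""
    d = W.sum(axis=1)                                  # vertex degrees
    if not is_normalized:
        return np.diag(d) - W
    with np.errstate(divide="ignore"):
        d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    # Scale row i and column j of W by d_i^{-1/2} and d_j^{-1/2}.
    return np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

# Adjacency matrix of a 3-vertex path graph.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L_un = laplacian_sketch(W, is_normalized=False)
L_norm = laplacian_sketch(W, is_normalized=True)
```

Each row of the unnormalized Laplacian sums to zero, while the normalized one has a unit diagonal for every vertex with nonzero degree.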

gssl.graph.gssl_utils.scipy_to_np(X)

gssl.graph.gssl_utils.split_indices(Y, split_p=0.5, seed=None)

Returns a fraction split_p of the indices, using stratification.

Parameters

Y (NDArray.shape[N]) – the vector from which to split with stratification w.r.t. each number that appears.
split_p (float) – the percentage of stratified indexes to return
seed (float) – Optional. Used to reproduce results.

Returns

vector with split_p of indexes after stratified sampling.

Return type

NDArray.shape[N*split_p]

Raises

ValueError – if split_p is an invalid percentage.
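Stratified sampling of indices can be sketched by drawing the same fraction from each class. The helper name stratified_split is hypothetical. Rounding the per-class count and returning the indices sorted are assumptions, so the library's exact counts and ordering may differ.

```python
import numpy as np

def stratified_split(Y, split_p=0.5, seed=None):
    """Hypothetical helper: sample a fraction split_p of the indices of each class."""
    if not 0.0 <= split_p <= 1.0:
        raise ValueError("split_p must lie in [0, 1]")
    Y = np.asarray(Y)
    rng = np.random.default_rng(seed)
    chosen = []
    for label in np.unique(Y):
        idx = np.flatnonzero(Y == label)           # all indices of this class
        n_take = int(round(split_p * idx.size))    # assumption: round to the nearest count
        chosen.append(rng.choice(idx, size=n_take, replace=False))
    return np.sort(np.concatenate(chosen))

Y = np.array([0, 0, 0, 0, 1, 1])
picked = stratified_split(Y, split_p=0.5, seed=0)
```

With four instances of class 0 and two of class 1, half of each class is drawn: two indices from class 0 and one from class 1.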

Module contents