gssl.graph package

Submodules

gssl.graph.gssl_affmat module

gssl_affmat.py

Module that handles the construction of affinity matrices.

class gssl.graph.gssl_affmat.AffMatGenerator(dist_func, mask_func, metric='euclidean', load_path=None, num_anchors=None, **arg)

Bases: object

Constructs a dense affinity matrix from some specification.

W_from_K(X, K)
__init__(dist_func, mask_func, metric='euclidean', load_path=None, num_anchors=None, **arg)

Constructs the Affinity Matrix Generator.

Parameters
  • X (NDArray[float].shape[N,D]) – A matrix containing the vertex positions.

  • dist_func (str) –

    specifies the distance function to be used. Supported values: {

    • gaussian: np.exp(-(d*d)/(2*sigma*sigma)), where d is the distance. Requires sigma in **arg.

    • LNP: Linear neighborhood propagation. Requires k in **arg.

    • NLNP: Normalized reciprocal of Linear neighborhood propagation. Requires k in **arg.

    • constant: Every weight is set to 1.

    • inv_norm: 1/d, where d is the distance.

    }

  • mask_func (str) –

    specifies the function used to determine the neighborhood. Supported Values: {

    • epsilon: Epsilon-neighborhood. Requires eps in **arg.

    • knn: K-nearest neighbors. Requires k in **arg.

    • load: Loads the CSR matrix specified by load_path.

    }

  • load_path (str) – Path from which to load a CSR kNN matrix with precalculated distances.

  • metric (str) – specifies the metric when computing the distance. Default is euclidean. See the documentation of scipy.spatial.distance.cdist for more details.

  • **arg – Remaining arguments.
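As a rough illustration of the gaussian option, the sketch below applies the formula listed under dist_func to a small matrix of Euclidean distances using plain NumPy. It does not call AffMatGenerator. The helper name gaussian_weights, the sample points, and the choice to zero the diagonal are assumptions made for the example.

```python
import numpy as np

def gaussian_weights(dist, sigma):
    """Apply np.exp(-(d*d)/(2*sigma*sigma)) elementwise to a distance matrix."""
    return np.exp(-(dist * dist) / (2.0 * sigma * sigma))

# Three sample points in the plane (hypothetical data).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]])

# Pairwise Euclidean distances via broadcasting, shape [N,N].
dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

W = gaussian_weights(dist, sigma=1.0)
np.fill_diagonal(W, 0.0)  # assumption: no self-loops in the graph
```

Smaller values of sigma make the weights decay faster with distance, so far-apart points contribute almost nothing.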

generateAffMat(X, Y=None, labeledIndexes=None, hook=None)

Generates the Affinity Matrix.

Returns

A dense affinity matrix.

Return type

NDArray[float].shape[N,N]

get_or_calc_Mask(X)

Gets the previously computed mask for affinity matrix, or computes it.

handle_adaptive_sigma(K)
gssl.graph.gssl_affmat.LNP(X, K, symm=True)

Computes the edge weights through Linear Neighborhood Propagation.

Parameters
  • X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.

  • K (NDArray[float].shape[N,N]) – Dense Affinity mask, whose positive entries correspond to neighbors.

Returns

A dense affinity matrix whose weights minimize the linear reconstruction of each instance.

Return type

NDArray[float].shape[N,N]

gssl.graph.gssl_affmat.NLNP(X, K, symm=True)

Computes the normalized reciprocals of the edge weights through Linear Neighborhood Propagation.

Parameters
  • X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.

  • K (NDArray[float].shape[N,N]) – Dense Affinity mask, whose positive entries correspond to neighbors.

Returns

A dense affinity matrix whose weights are the normalized reciprocals of the ones given by Linear Neighborhood Propagation.

Return type

NDArray[float].shape[N,N]

gssl.graph.gssl_affmat.epsilonMask(X, eps, metric='euclidean')

Calculates the distances only in the epsilon-neighborhood.

Parameters
  • X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.

  • eps (float) – A parameter such that K[i,j] = 0 if dist(X_i,X_j) >= eps.

Returns

A dense matrix K of shape [N,N] whose nonzero [i,j] entries correspond to distances between neighbors X[i,:] and X[j,:].

Return type

NDArray[int].shape[N,N]
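The epsilon-neighborhood rule can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the library's code: the name epsilon_mask_sketch is hypothetical, only the Euclidean metric is handled, and entries are kept as floating-point distances.

```python
import numpy as np

def epsilon_mask_sketch(X, eps):
    """Keep dist(X_i, X_j) where it is below eps; set every other entry to 0."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.where(dist < eps, dist, 0.0)

# Hypothetical data: point 2 is close to point 0, point 1 is far from both.
X = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]])
K = epsilon_mask_sketch(X, eps=2.0)
```

Only the pair (0, 2) lies within eps here, so it is the only nonzero off-diagonal entry.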

gssl.graph.gssl_affmat.knnMask(X, k, mode='sym', metric='euclidean')

Calculates the distances only in the knn-neighborhood.

Parameters
  • X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.

  • k (int) – A parameter such that K[i,j] = 1 iff X_i is one of the k-nearest neighbors of X_j.

  • mode (str) –

    type of KNN. Supported values: {

    • mut: K[i,j] = min(K[i,j],K[j,i]).

    • sym: K[i,j] = max(K[i,j],K[j,i]).

    • none: No symmetrization. WARNING: Many GSSL algorithms depend on a symmetric affinity matrix.

    }

Returns

A dense matrix K of shape [N,N] whose nonzero [i,j] entries correspond to distances between neighbors X[i,:] and X[j,:].

Return type

NDArray[int].shape[N,N]
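The three symmetrization modes are easiest to compare on a tiny example. The sketch below is an assumption-laden illustration rather than the library's implementation: knn_mask_sketch is a hypothetical name, only Euclidean distance is used, and the mask is a 0/1 indicator instead of a matrix of distances.

```python
import numpy as np

def knn_mask_sketch(X, k, mode="sym"):
    """Binary kNN mask: K[i,j] = 1 iff X_i is among the k nearest neighbors of X_j."""
    N = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)             # a point is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]  # row j holds the neighbors of X_j
    owners = np.repeat(np.arange(N), k)        # the j matching each neighbor entry
    K = np.zeros((N, N))
    K[nearest.ravel(), owners] = 1.0
    if mode == "sym":
        return np.maximum(K, K.T)              # keep an edge if either side chose it
    if mode == "mut":
        return np.minimum(K, K.T)              # keep an edge only if both sides chose it
    if mode == "none":
        return K
    raise ValueError("mode must be 'sym', 'mut' or 'none'")

# Hypothetical points on a line at positions 0, 1 and 3.
X = np.array([[0.0], [1.0], [3.0]])
K_sym = knn_mask_sketch(X, k=1, mode="sym")
K_mut = knn_mask_sketch(X, k=1, mode="mut")
K_none = knn_mask_sketch(X, k=1, mode="none")
```

With k=1 the point at 3 picks the point at 1 as its neighbor, but not the other way round, so that edge survives in sym and disappears in mut.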

gssl.graph.gssl_affmat.sort_coo(m)

gssl.graph.gssl_utils module

gssl_utils.py

Module containing utilities for GSSL algorithms.

gssl.graph.gssl_utils.accuracy(Y_pred, Y_true)

Calculates percentage of correct predictions.

gssl.graph.gssl_utils.accuracy_unlabeled(Y_pred, Y_true, labeled_indexes)

Calculates percentage of correct predictions on unlabeled indexes only.

gssl.graph.gssl_utils.calc_Z(Y, labeledIndexes, D, estimatedFreq=None, weigh_by_degree=False)

Calculates matrix Z used for GTAM/LDST label propagation.

Parameters
  • Y (NDArray[int].shape[N,C]) – confidence matrix.

  • labeledIndexes (NDArray[bool].shape[N]) – determines which indices are to be considered as labeled.

  • D (NDArray[float].shape[N]) – array of diagonal entries of degree matrix.

  • estimatedFreq (NDArray[float].shape[C]) – Optional. The estimated class frequencies. If absent, it is assumed all classes occur equally often.

  • reciprocal (bool) – If True, use reciprocal of the degree instead. Default is False

Returns

Matrix Z, which normalizes Y by class frequencies and degree.

Return type

NDArray[int].shape[N,C]

gssl.graph.gssl_utils.class_mass_normalization(F, Y, labeledIndexes, normalize_rows=True)
gssl.graph.gssl_utils.deg_matrix(W, pwr=1, flat=False, NA_replace_val=1.0)

Returns a diagonal matrix with the row-wise sums of a matrix W.

gssl.graph.gssl_utils.extract_lap_eigvec(L, m, D=None, remove_first_eig=True)

Extract m eigenvectors and eigenvalues of the laplacian, in non-decreasing order.

Parameters
  • L (NDArray[float].shape[N,N]) – Laplacian matrix.

  • m (int) – number of eigenvectors to extract.

  • D (NDArray[float].shape[N,N]) – extra matrix for the generalized eigenvalue problem.

Returns

matrix of eigenvectors, and diagonal matrix of eigenvalues

Return type

Pair[NDArray[float].shape[M,N],NDArray[float].shape[M,M]]
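For a symmetric Laplacian, the extraction can be approximated with np.linalg.eigh, which returns eigenvalues in ascending order. The sketch below is a simplified stand-in: the name lap_eigvec_sketch is hypothetical, the generalized problem with D is left out, and remove_first_eig is assumed to drop the smallest eigenpair.

```python
import numpy as np

def lap_eigvec_sketch(L, m, remove_first_eig=True):
    """Return m eigenvectors as rows, shape [M,N], and a diagonal eigenvalue matrix."""
    vals, vecs = np.linalg.eigh(L)        # ascending eigenvalues, eigenvectors as columns
    start = 1 if remove_first_eig else 0  # assumption: skip the smallest eigenpair
    vals = vals[start:start + m]
    vecs = vecs[:, start:start + m]
    return vecs.T, np.diag(vals)

# Unnormalized Laplacian of a path graph on three vertices (hypothetical input).
L = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])
U, S = lap_eigvec_sketch(L, m=2)
```

The smallest eigenvalue of a connected graph's Laplacian is zero with a constant eigenvector, which is why it is commonly discarded.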

gssl.graph.gssl_utils.get_Isomap(X, n_neighbors=5)
gssl.graph.gssl_utils.get_PCA(X)
gssl.graph.gssl_utils.get_Standardized(X)
gssl.graph.gssl_utils.get_pred(Y)

Calculates predictions from a belief matrix.

Parameters

Y (NDArray[float].shape[N,C]) – belief matrix.

Returns

prediction vector, each entry numbered from 0 to C-1.

Return type

NDArray[int].shape[N]
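A minimal sketch of this step, assuming the prediction for each instance is simply the column index of its largest belief. The sample matrix is made up for the example.

```python
import numpy as np

# Hypothetical belief matrix for N=3 instances and C=3 classes.
Y = np.array([[0.1, 0.7, 0.2],
              [0.5, 0.3, 0.2],
              [0.0, 0.1, 0.9]])

# Index of the largest belief in each row, numbered from 0 to C-1.
pred = np.argmax(Y, axis=1)
```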

gssl.graph.gssl_utils.init_matrix(Y, labeledIndexes)

Creates a matrix containing the confidence for each class.

Parameters
  • Y (NDArray[int].shape[N]) – array of true labels.

  • labeledIndexes (NDArray[bool].shape[N]) – determines which indices are to be considered as labeled.

Returns

A matrix init such that init[i,j] has the confidence that the i-th instance has the label corresponding to the j-th class.

Return type

NDArray[float].shape[N,C]
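One plausible reading of this description is a one-hot encoding restricted to the labeled instances. The sketch below follows that reading. The name init_matrix_sketch, the assumption that labels are integers from 0 to C-1, and the all-zero rows for unlabeled instances are illustrative choices, not confirmed library behavior.

```python
import numpy as np

def init_matrix_sketch(Y, labeledIndexes):
    """One-hot rows for labeled instances; unlabeled rows stay at zero confidence."""
    Y = np.asarray(Y)
    C = int(Y.max()) + 1                   # assumption: labels are 0..C-1
    init = np.zeros((Y.shape[0], C))
    rows = np.flatnonzero(labeledIndexes)  # positions of the labeled instances
    init[rows, Y[rows]] = 1.0
    return init

# Hypothetical labels and labeled mask.
Y = np.array([0, 2, 1, 2])
labeledIndexes = np.array([True, False, True, False])
init = init_matrix_sketch(Y, labeledIndexes)
```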

gssl.graph.gssl_utils.init_matrix_argmax(Y)

Returns the argmax of each row of a matrix.

gssl.graph.gssl_utils.labels_indicator(labeledIndexes)

Returns a diagonal matrix J indicating whether each instance is labeled.

Parameters

labeledIndexes (NDArray[bool].shape[N]) – determines which indices are to be considered as labeled.

gssl.graph.gssl_utils.lap_matrix(W, is_normalized)

Returns the graph Laplacian of some matrix W.

Parameters
  • W (NDArray[float].shape[N,N]) – The given matrix.

  • is_normalized (bool) – If True, returns the normalized Laplacian \(L = I - D^{-1/2} W D^{-1/2}\). Otherwise, returns the unnormalized Laplacian \(L = D - W\), where D is the degree matrix of W.

Returns

The normalized or unnormalized graph Laplacian

Return type

NDArray[float].shape[N,N]
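The two variants can be sketched with NumPy as follows. This is an illustration, not the library's code: lap_matrix_sketch is a hypothetical name, the unnormalized case uses the standard definition D - W (an assumption about what the library computes), and every vertex is assumed to have nonzero degree.

```python
import numpy as np

def lap_matrix_sketch(W, is_normalized):
    """Graph Laplacian of a symmetric affinity matrix W."""
    d = W.sum(axis=1)                      # vertex degrees (row sums)
    if is_normalized:
        d_inv_sqrt = 1.0 / np.sqrt(d)      # assumes no isolated vertices
        # I - D^{-1/2} W D^{-1/2}, written with broadcasting
        return np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return np.diag(d) - W                  # standard unnormalized Laplacian D - W

# Path graph on three vertices (hypothetical input).
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L_unnorm = lap_matrix_sketch(W, is_normalized=False)
L_norm = lap_matrix_sketch(W, is_normalized=True)
```

Each row of the unnormalized Laplacian sums to zero, which is a quick sanity check on any implementation.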

gssl.graph.gssl_utils.scipy_to_np(X)
gssl.graph.gssl_utils.split_indices(Y, split_p=0.5, seed=None)

Returns a percentage p of indices, using stratification.

Parameters
  • Y (NDArray.shape[N]) – the vector from which to split with stratification w.r.t. each number that appears.

  • split_p (float) – the percentage of stratified indexes to return

  • seed (float) – Optional. Used to reproduce results.

Returns

vector with split_p of indexes after stratified sampling.

Return type

NDArray.shape[N*split_p]

Raises

ValueError – if split_p is an invalid percentage.
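Stratified sampling of indices might look like the sketch below, which draws the same fraction from each distinct value of Y. The name split_indices_sketch, the rounding rule, and the decision to return sorted integer indices are assumptions for illustration.

```python
import numpy as np

def split_indices_sketch(Y, split_p=0.5, seed=None):
    """Pick roughly split_p of the indices of each class, without replacement."""
    if not 0.0 <= split_p <= 1.0:
        raise ValueError("split_p must be between 0 and 1")
    rng = np.random.default_rng(seed)
    chosen = []
    for label in np.unique(Y):
        members = np.flatnonzero(Y == label)            # indices of this class
        n_take = int(round(split_p * members.size))     # assumption: round to nearest
        chosen.append(rng.choice(members, size=n_take, replace=False))
    return np.sort(np.concatenate(chosen))

# Hypothetical labels: four instances of class 0, six of class 1.
Y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
selected = split_indices_sketch(Y, split_p=0.5, seed=0)
```

Fixing seed makes the selection reproducible across runs, which matters when comparing algorithms on the same labeled subset.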

Module contents