gssl.filters package

Submodules

gssl.filters.LDST module

Created on April 1, 2019

@author: klaus

class gssl.filters.LDST.LDST(tuning_iter, mu=99.0, useEstimatedFreq=True, constantProp=False, useZ=True, weigh_by_degree=False)

Bases: gssl.filters.filter.GSSLFilter


LDST(*args, **kwargs)
__init__(tuning_iter, mu=99.0, useEstimatedFreq=True, constantProp=False, useZ=True, weigh_by_degree=False)

Constructor for the LDST filter.

Parameters
  • mu (float) – A parameter determining the importance of the fitting term. Default is 99.0.

  • tuning_iter (int) – The number of tuning iterations.

  • useEstimatedFreq (Union[bool,NDArray[C],None]) – If True, then use estimated class freq. to balance the propagation. If it is a float array, it uses that as the frequency. If None, assumes classes are equiprobable. Default is True.

  • constantProp (bool) – If True, whenever a label of a given class is removed, another label from the same class gets added. Default is False.

  • useZ (bool) – If True, then at each step update label matrix so that each class has total influence equal to the estimated frequency. Default is True.

  • weigh_by_degree (bool) – If True and useZ is also True, then vertices with higher degree will have more confident labels. Default is False.
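The three documented forms of useEstimatedFreq (True, a float array, or None) can be illustrated with a small helper. This is a minimal sketch, not the package's implementation: the helper name resolve_class_freq is hypothetical, the estimate is assumed to be the class proportions among the labeled rows of Y, and False is assumed to behave like None.

```python
import numpy as np

def resolve_class_freq(use_estimated_freq, Y, labeled_indexes):
    """Hypothetical helper: map the useEstimatedFreq option to a frequency vector."""
    num_classes = Y.shape[1]
    if use_estimated_freq is None:
        # None: classes are taken to be equiprobable.
        return np.full(num_classes, 1.0 / num_classes)
    if isinstance(use_estimated_freq, (bool, np.bool_)):
        if not use_estimated_freq:
            # Assumption: False is treated the same way as None.
            return np.full(num_classes, 1.0 / num_classes)
        # Assumption: the estimate is the class proportion among labeled rows.
        counts = Y[labeled_indexes].sum(axis=0)
        return counts / counts.sum()
    # Anything else is taken as a user-supplied frequency array.
    return np.asarray(use_estimated_freq, dtype=float)

# Three labeled instances (two of class 0, one of class 1), one unlabeled.
Y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
labeled = np.array([True, True, True, False])

freq_estimated = resolve_class_freq(True, Y, labeled)
freq_uniform = resolve_class_freq(None, Y, labeled)
freq_given = resolve_class_freq(np.array([0.75, 0.25]), Y, labeled)
```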

fit(X, Y, labeledIndexes, W=None, hook=None)

Filters the input data.

Parameters
  • X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.

  • Y (NDArray[float].shape[N,C]) – A (noisy) belief matrix

  • labeledIndexes (NDArray[bool].shape[N]) – Indices to be marked as labeled.

  • W (NDArray[float].shape[N,N]) – Optional. The affinity matrix encoding the weighted edges.

  • hook (GSSLHook) – Optional. A hook to execute extra operations (e.g. plots) during the algorithm

Returns

A corrected version of the belief matrix, and the updated labeledIndexes (NDArray[bool].shape[N]).

Return type

NDArray[float].shape[N,C]
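The shapes expected by fit can be illustrated by building the inputs with NumPy. This is only a sketch under stated assumptions: the toy sizes and the integer labels array are made up for illustration, and unlabeled rows of Y are assumed to be all zeros. The final comment shows how a filter would presumably be called given the documented signature; the tuple form of the return value is inferred from the Returns text and is not verified.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, C = 6, 2, 3  # toy sizes: instances, features, classes

# X: N instances of dimension D.
X = rng.normal(size=(N, D))

# Hypothetical integer class labels; only the first three are observed.
labels = np.array([0, 1, 2, 0, 1, 2])
labeledIndexes = np.array([True, True, True, False, False, False])

# Y: one-hot belief matrix. Assumption: unlabeled rows are left as zeros.
Y = np.zeros((N, C))
Y[labeledIndexes, labels[labeledIndexes]] = 1.0

# With the package installed, a call would presumably look like:
#   Y_f, lb_f = LDST(tuning_iter=2).fit(X, Y, labeledIndexes)
```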

gssl.filters.LGC_LVO module

Created on November 13, 2019

@author: klaus

class gssl.filters.LGC_LVO.LGC_LVO_Filter(tuning_iter, mu=99.0, useEstimatedFreq=True, constantProp=False, useZ=True, tuning_iter_as_pct=False, normalize_rows=True, early_stop=False, use_baseline=False, relabel=False)

Bases: gssl.filters.filter.GSSLFilter

LGCLVO_F is a leave-one-out filter based on local and global consistency. See [Afo20].

LGCLVO(*args, **kwargs)
__init__(tuning_iter, mu=99.0, useEstimatedFreq=True, constantProp=False, useZ=True, tuning_iter_as_pct=False, normalize_rows=True, early_stop=False, use_baseline=False, relabel=False)

Constructor for the LGCLVO filter.

Parameters
  • mu (float) – A parameter determining the importance of the fitting term. Default is 99.0.

  • tuning_iter (float) – The number of tuning iterations.

  • tuning_iter_as_pct (bool) – If True, then tuning_iter is to be interpreted as a percentage of the number of labels.

  • constantProp (bool) – If True, the number of labels detected for each class will be proportional to the estimated frequency. Default is False.

  • useEstimatedFreq (bool) – If True, then use estimated class freq. to balance the propagation. Otherwise, assume classes are equiprobable. Default is True.

  • useZ (bool) – If True, then normalize the label matrix at each step. Default is True.

  • normalize_rows (bool) – If True, then each row of the classification matrix and label matrix will sum up to one. Highly recommended. Default is True.

  • early_stop (bool) – If True, a label will not be considered for removal if it cannot be reached by other labels. Default is False.

  • use_baseline (bool) – If True, we will use a baseline which calculates the criteria once instead of updating at each iteration, sacrificing precision for performance. Default is False.

  • relabel (bool) – If False, the relevant label indices are removed. If True, the indices are kept and the returned label matrix is modified directly to change their labels. Default is False.
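Two of the options above, normalize_rows and relabel, can be sketched in a few lines of NumPy. The function names below are hypothetical and the code is not taken from the package. It assumes that all-zero rows are left untouched by the normalization, that a removed label has its row in Y zeroed, and that the replacement class for a relabeled instance is supplied by the caller.

```python
import numpy as np

def normalize_rows(M):
    """Scale each non-zero row so that it sums to one; zero rows stay zero."""
    M = np.asarray(M, dtype=float)
    row_sums = M.sum(axis=1, keepdims=True)
    # Avoid division by zero for rows without any belief mass.
    safe_sums = np.where(row_sums == 0, 1.0, row_sums)
    return M / safe_sums

def handle_flagged_label(Y, labeledIndexes, i, new_class, relabel=False):
    """Hypothetical illustration of relabel=False versus relabel=True."""
    Y = np.array(Y, dtype=float)                       # work on copies
    labeledIndexes = np.array(labeledIndexes, dtype=bool)
    Y[i, :] = 0.0
    if relabel:
        # Keep the index labeled, but change its label.
        Y[i, new_class] = 1.0
    else:
        # Remove the index from the labeled set.
        labeledIndexes[i] = False
    return Y, labeledIndexes

Y = np.array([[1.0, 0.0], [0.0, 1.0]])
lb = np.array([True, True])
Y_removed, lb_removed = handle_flagged_label(Y, lb, i=1, new_class=0, relabel=False)
Y_relabeled, lb_relabeled = handle_flagged_label(Y, lb, i=1, new_class=0, relabel=True)
```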

fit(X, Y, labeledIndexes, W=None, hook=None)

Filters the input data.

Parameters
  • X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.

  • Y (NDArray[float].shape[N,C]) – A (noisy) belief matrix

  • labeledIndexes (NDArray[bool].shape[N]) – Indices to be marked as labeled.

  • W (NDArray[float].shape[N,N]) – Optional. The affinity matrix encoding the weighted edges.

  • hook (GSSLHook) – Optional. A hook to execute extra operations (e.g. plots) during the algorithm

Returns

A corrected version of the belief matrix, and the updated labeledIndexes (NDArray[bool].shape[N]).

Return type

NDArray[float].shape[N,C]

gssl.filters.MRremoval module

Created on April 1, 2019

@author: klaus

class gssl.filters.MRremoval.MRRemover(p=0.2, tuning_iter=0, tuning_iter_as_pct=False)

Bases: gssl.filters.filter.GSSLFilter

__init__(p=0.2, tuning_iter=0, tuning_iter_as_pct=False)

Constructor for Manifold Regularization Filter.

fit(X, Y, labeledIndexes, W=None, hook=None)

Filters the input data.

Parameters
  • X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.

  • Y (NDArray[float].shape[N,C]) – A (noisy) belief matrix

  • labeledIndexes (NDArray[bool].shape[N]) – Indices to be marked as labeled.

  • W (NDArray[float].shape[N,N]) – Optional. The affinity matrix encoding the weighted edges.

  • hook (GSSLHook) – Optional. A hook to execute extra operations (e.g. plots) during the algorithm

Returns

A corrected version of the belief matrix, and the updated labeledIndexes (NDArray[bool].shape[N]).

Return type

NDArray[float].shape[N,C]

gssl.filters.filter module

class gssl.filters.filter.GSSLFilter

Bases: object

Skeleton class for GSSL Filters.

classmethod autohooks(fun)

Automatically calls the begin and end method of the hook. At the end, the filtered labels are passed as ‘Y’, and the new labeled indexes as ‘labeledIndexes’.
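The wrapping behaviour described above can be sketched as a decorator. This is an illustration only: the real autohooks is a classmethod and the GSSLHook interface is not documented here, so the RecordingHook class, the begin and end method names, and their keyword arguments are all assumptions. The wrapped fit is also assumed to return the pair (Y, labeledIndexes).

```python
import functools

class RecordingHook:
    """Hypothetical hook; the real GSSLHook interface may differ."""
    def __init__(self):
        self.events = []

    def begin(self, **kwargs):
        self.events.append(("begin", sorted(kwargs)))

    def end(self, **kwargs):
        self.events.append(("end", sorted(kwargs)))

def autohooks(fun):
    """Call hook.begin before fun and hook.end after it, if a hook is given."""
    @functools.wraps(fun)
    def wrapper(self, X, Y, labeledIndexes, W=None, hook=None):
        if hook is not None:
            hook.begin(X=X, Y=Y, labeledIndexes=labeledIndexes, W=W)
        Y_f, lb_f = fun(self, X, Y, labeledIndexes, W=W, hook=hook)
        if hook is not None:
            # Filtered labels go in as 'Y', new labeled indexes as 'labeledIndexes'.
            hook.end(Y=Y_f, labeledIndexes=lb_f)
        return Y_f, lb_f
    return wrapper

class IdentityFilter:
    """Toy filter that returns its inputs unchanged."""
    @autohooks
    def fit(self, X, Y, labeledIndexes, W=None, hook=None):
        return Y, labeledIndexes

hook = RecordingHook()
result = IdentityFilter().fit([[0.0]], [[1.0]], [True], hook=hook)
```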

fit(X, Y, labeledIndexes, W=None, hook=None)

Filters the input data.

Parameters
  • X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.

  • Y (NDArray[float].shape[N,C]) – A (noisy) belief matrix

  • labeledIndexes (NDArray[bool].shape[N]) – Indices to be marked as labeled.

  • W (NDArray[float].shape[N,N]) – Optional. The affinity matrix encoding the weighted edges.

  • hook (GSSLHook) – Optional. A hook to execute extra operations (e.g. plots) during the algorithm

Returns

A corrected version of the belief matrix, and the updated labeledIndexes (NDArray[bool].shape[N]).

Return type

NDArray[float].shape[N,C]

gssl.filters.filter_utils module

filter_utils.py

Module containing utilities for GSSL filters.

gssl.filters.filter_utils.get_confmat_FN(A)
gssl.filters.filter_utils.get_confmat_FP(A)
gssl.filters.filter_utils.get_confmat_TN(A)
gssl.filters.filter_utils.get_confmat_TP(A)
gssl.filters.filter_utils.get_confmat_acc(A)
gssl.filters.filter_utils.get_confmat_dict(A)
gssl.filters.filter_utils.get_confmat_f1_score(A)
gssl.filters.filter_utils.get_confmat_npv(A)
gssl.filters.filter_utils.get_confmat_precision(A)
gssl.filters.filter_utils.get_confmat_recall(A)
gssl.filters.filter_utils.get_confmat_specificity(A)
gssl.filters.filter_utils.get_unlabeling_confmat(Y_true, Y_n, Y_f, lb_n, lb_f)

Gets the confusion matrix related to the labels removed by the filter.

More specifically, a matrix M is returned:

M  =  [ TN: #(clean labels NOT removed by filter)    FN: #(noisy labels NOT removed by filter) ]
      [ FP: #(clean labels removed by filter)        TP: #(noisy labels removed by filter)     ]

Parameters
  • Y_true (NDArray[float].shape[N,C]) – A belief matrix encoding the true labels.

  • Y_n (NDArray[float].shape[N,C]) – A belief matrix encoding the noisy labels.

  • Y_f – A belief matrix encoding the filtered labels.

  • lb_n (NDArray[bool].shape[N]) – Indices of the noisy label matrix to be marked as labeled.

  • lb_f (NDArray[bool].shape[N]) – Indices of the filtered label matrix to be marked as labeled.

Returns

The matrix M.

Return type

NDArray[float].shape[2,2]
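The layout of M can be made concrete with a simplified re-implementation. This is a sketch under assumptions, not the package's code: a label counts as noisy when the argmax of Y_n differs from the argmax of Y_true, a label counts as removed when it is marked in lb_n but not in lb_f, and Y_f is ignored, so relabeled instances are not handled. The precision and recall helpers are hypothetical stand-ins that use the usual definitions.

```python
import numpy as np

def unlabeling_confmat(Y_true, Y_n, lb_n, lb_f):
    """Simplified sketch: returns [[TN, FN], [FP, TP]] for removed labels."""
    lb_n = np.asarray(lb_n, dtype=bool)
    lb_f = np.asarray(lb_f, dtype=bool)
    # Assumption: a label is noisy if its class differs from the true class.
    noisy = np.argmax(Y_n, axis=1) != np.argmax(Y_true, axis=1)
    removed = lb_n & ~lb_f   # labeled before filtering, unlabeled after
    kept = lb_n & lb_f       # labeled before and after filtering
    TN = np.sum(kept & ~noisy)
    FN = np.sum(kept & noisy)
    FP = np.sum(removed & ~noisy)
    TP = np.sum(removed & noisy)
    return np.array([[TN, FN], [FP, TP]], dtype=float)

def precision(M):
    # TP / (TP + FP): fraction of removed labels that were noisy.
    return M[1, 1] / (M[1, 1] + M[1, 0])

def recall(M):
    # TP / (TP + FN): fraction of noisy labels that were removed.
    return M[1, 1] / (M[1, 1] + M[0, 1])

true_classes = np.array([0, 0, 1, 1, 0, 1])
noisy_classes = np.array([0, 1, 1, 0, 0, 0])   # instances 1, 3 and 5 are noisy
Y_true = np.eye(2)[true_classes]
Y_n = np.eye(2)[noisy_classes]
lb_n = np.ones(6, dtype=bool)
lb_f = np.array([True, False, True, False, True, True])  # 1 and 3 removed

M = unlabeling_confmat(Y_true, Y_n, lb_n, lb_f)
```

In this toy case both removed labels were noisy and one noisy label was missed, so precision(M) is 1.0 and recall(M) is 2/3.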

gssl.filters.ldstRemoval module

Created on April 1, 2019

@author: klaus

class gssl.filters.ldstRemoval.LDSTRemover(tuning_iter, mu=99.0, useEstimatedFreq=True, constantProp=False, useZ=True, tuning_iter_as_pct=False, know_true_freq=False, weigh_by_degree=True, gradient_fix=True)

Bases: gssl.filters.filter.GSSLFilter


LDST(*args, **kwargs)
__init__(tuning_iter, mu=99.0, useEstimatedFreq=True, constantProp=False, useZ=True, tuning_iter_as_pct=False, know_true_freq=False, weigh_by_degree=True, gradient_fix=True)

Constructor for LDST-Removal Filter.

Parameters
  • mu (float) – A parameter determining the importance of the fitting term. Default is 99.0.

  • tuning_iter (Union[int,float]) – The number of tuning iterations.

  • tuning_iter_as_pct (bool) – If True, then tuning_iter is to be interpreted as a percentage of the number of labels.

  • useEstimatedFreq (Union[bool,NDArray[C],None]) – If True, then use estimated class freq. to balance the propagation. If it is a float array, it uses that as the frequency. If None, assumes classes are equiprobable. Default is True.

  • constantProp (bool) – If True, whenever a label of a given class is removed, another label from the same class gets added. Default is False.

  • useZ (bool) – If True, then at each step update label matrix so that each class has total influence equal to the estimated frequency. Default is True.

  • weigh_by_degree (bool) – If True and useZ is also True, then vertices with higher degree will have more confident labels. Default is True.

  • gradient_fix (bool) – If True, changes the criteria when selecting the Q matrix to discourage picking the same class. Default is True, and should be kept that way for better performance.

fit(X, Y, labeledIndexes, W=None, hook=None)

Filters the input data.

Parameters
  • X (NDArray[float].shape[N,D]) – Input matrix of N instances of dimension D.

  • Y (NDArray[float].shape[N,C]) – A (noisy) belief matrix

  • labeledIndexes (NDArray[bool].shape[N]) – Indices to be marked as labeled.

  • W (NDArray[float].shape[N,N]) – Optional. The affinity matrix encoding the weighted edges.

  • hook (GSSLHook) – Optional. A hook to execute extra operations (e.g. plots) during the algorithm

Returns

A corrected version of the belief matrix, and the updated labeledIndexes (NDArray[bool].shape[N]).

Return type

NDArray[float].shape[N,C]

Module contents