Relational Generalized Learning Vector Quantization

Implements relational generalized learning vector quantization (RGLVQ) as described in the paper

Hammer, B., Hofmann, D., Schleif, F., & Zhu, X. (2014). Learning vector quantization for (dis-)similarities. Neurocomputing, 131, 43-51. doi:10.1016/j.neucom.2013.05.054. URL: http://www.techfak.uni-bielefeld.de/~fschleif/pdf/nc_diss_2014.pdf

class proto_dist_ml.rglvq.RGLVQ(K, T=100, phi=None, phi_grad=None)

A relational generalized learning vector quantization model, which classifies data by assigning the label of the closest prototype. Prototypes are learned points, represented as convex combinations of the input data. Due to its purely distance-based definition, relational generalized learning vector quantization can work directly on a distance matrix, as long as the input distance matrix is Euclidean (which holds if and only if the corresponding similarity matrix obtained via double centering is positive semi-definite). If that is not the case, negative squared distances may occur, which can hurt the model during training.
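The Euclidean condition above can be tested directly: double-center the squared distance matrix and check whether the resulting similarity matrix is positive semi-definite. The helper below is an illustrative sketch, not part of the RGLVQ class.

```python
import numpy as np
from scipy.spatial.distance import squareform, pdist

# Hypothetical helper: check whether a distance matrix is Euclidean by
# double-centering the squared distances and testing the eigenvalues of
# the resulting similarity matrix for positive semi-definiteness.
def is_euclidean(D, tol=1e-10):
    D2 = D ** 2
    m = D2.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m   # centering matrix
    S = -0.5 * J @ D2 @ J                 # double-centered similarity matrix
    return np.min(np.linalg.eigvalsh(S)) >= -tol

# A distance matrix computed from actual points passes the check.
X = np.random.RandomState(0).randn(10, 3)
D = squareform(pdist(X))
print(is_euclidean(D))  # True
```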

K

The number of prototypes per class to be learned.

Type:int
T

The maximum number of L-BFGS optimization steps during training.

Type:int (optional, default=100)
phi

A squashing function to post-process each error term.

Type:function handle (optional, default=identity)
phi_grad

The gradient function corresponding to phi.

Type:function handle (optional, default=1)
_Alpha

A sparse K * num_labels x m matrix storing the convex coefficients that describe the prototypes. This is not set by the user but during fit().

Type:scipy.sparse.csr_matrix
_z

A K * num_labels dimensional vector storing the normalization constants -0.5*_Alpha[k, :] * D² * _Alpha[k, :].T for all k. This is not set by the user but during fit().

Type:array_like
_y

A K * num_labels dimensional vector storing the label for each prototype. This is not set by the user but during fit().

Type:array_like
_loss

The GLVQ loss after L-BFGS optimization. This is not set by the user but during fit().

Type:array_like
fit(D, y, is_squared=False)

Fits a relational generalized learning vector quantization model to the given distance matrix.

In more detail, the training optimizes the GLVQ loss function, which is given as

\[\sum_i \Phi\Big(\frac{d_i^+ - d_i^-}{d_i^+ + d_i^-}\Big)\]

where i is the data point index, d_i^+ is the squared distance of i to the closest prototype with the same label, and d_i^- is the squared distance of i to the closest prototype with another label. Note that i is correctly classified if and only if d_i^+ < d_i^-, such that the loss is a ‘soft’ version of the classification error.
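The loss terms can be computed directly. The snippet below uses hypothetical values for d_i^+ and d_i^- with the identity as squashing function Phi; each term lies in (-1, 1) and is negative exactly for the correctly classified points.

```python
import numpy as np

# Illustrative GLVQ loss terms (Phi = identity); d_plus and d_minus are
# hypothetical squared distances, not output of a fitted model.
d_plus  = np.array([1.0, 4.0, 0.5])   # distance to closest same-label prototype
d_minus = np.array([2.0, 1.0, 3.0])   # distance to closest other-label prototype
mu = (d_plus - d_minus) / (d_plus + d_minus)
loss = np.sum(mu)                      # GLVQ loss with Phi = identity
correct = d_plus < d_minus             # point i is correctly classified iff mu < 0
print(mu)                              # [-0.333...  0.6       -0.714...]
print(correct)                         # [ True False  True]
```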

We optimize this loss via L-BFGS as implemented in scipy. The learnable parameters are the convex coefficients _Alpha. We can optimize these because the squared distance between data point i and prototype k can be re-written as follows:

\[d_{i,k}^2 = \vec \alpha_k \cdot D_{:, i}^2 - 0.5 \cdot \vec \alpha_k \cdot D^2 \cdot \vec \alpha_k^T = \vec \alpha_k \cdot D_{:, i}^2 + z_k\]

where alpha_k is self._Alpha[k, :]. This yields the gradient

\[\nabla_{\vec \alpha_k} d_{i, k}^2 = D_{i, :}^2 - D^2 \cdot \vec \alpha_k\]

Note that each gradient computation is quadratic in the number of nonzero entries of _Alpha[k, :], which may be expensive.
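The relational re-writing of the distance can be verified numerically for Euclidean data: if a prototype is the convex combination w = alpha @ X of the data points, the direct squared distance agrees with the expression computed from the squared distance matrix alone. All names below are illustrative stand-ins, not the RGLVQ API.

```python
import numpy as np

# Numerical check of the relational distance identity: for convex
# coefficients alpha (non-negative, summing to 1) and prototype
# w = alpha @ X, we have ||x_i - w||^2 = alpha @ D2[:, i] + z with
# z = -0.5 * alpha @ D2 @ alpha, where D2 holds squared distances.
rng = np.random.RandomState(1)
X = rng.randn(8, 2)
D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)

alpha = rng.rand(8)
alpha /= alpha.sum()                   # convex coefficients
w = alpha @ X                          # prototype as convex combination
z = -0.5 * alpha @ D2 @ alpha          # normalization constant z_k

i = 3
direct = np.sum((X[i] - w) ** 2)       # direct squared distance
relational = alpha @ D2[:, i] + z      # same value from D2 alone
print(np.isclose(direct, relational))  # True
```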

Parameters:
  • D (array_like) – A m x m matrix of pairwise distances. Note that we have no preconditions for this matrix. It may even be asymmetric.
  • y (array_like) – A m dimensional label vector for the data points.
  • is_squared (bool (optional, default = False)) – If set to true, this method assumes D is already a matrix of squared distances. Otherwise, the distances get squared.
Returns:

self

Return type:

proto_dist_ml.rglvq.RGLVQ

predict(D, is_squared=False)

Predicts the labels for the data represented by the given test-to-training distance matrix. Each data point is assigned the label of its closest prototype.

Parameters:
  • D (array_like) – A n x m matrix of distances from the test to the training datapoints.
  • is_squared (bool (optional, default = False)) – If set to true, this method assumes D is already a matrix of squared distances. Otherwise, the distances get squared.
Returns:

y – an n-dimensional vector containing the predicted labels for each test data point.

Return type:

array_like
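The decision rule behind predict() can be sketched with the relational distance expression from fit(): squared test-to-prototype distances are D² @ _Alpha.T plus the normalization constants, and each test point takes the label of the prototype minimizing that distance. The names Alpha, z, and y_proto below are illustrative stand-ins for the fitted attributes _Alpha, _z, and _y, with a hand-built toy model rather than actual training output.

```python
import numpy as np

# Toy fitted model: three training points at 0, 2, and 3 on a line.
D2_train = np.array([[0., 4., 9.],
                     [4., 0., 1.],
                     [9., 1., 0.]])          # squared training distances (m x m)

Alpha = np.array([[1.0, 0.0, 0.0],
                  [0.0, 0.5, 0.5]])          # convex coefficients, one row per prototype
z = -0.5 * np.einsum('km,mn,kn->k', Alpha, D2_train, Alpha)
y_proto = np.array([0, 1])                   # one label per prototype

# Squared distances from two test points (at 1 and 3) to the training points.
D2_test = np.array([[1., 1., 4.],
                    [9., 1., 0.]])

d2 = D2_test @ Alpha.T + z                   # n x K test-to-prototype distances
y_pred = y_proto[np.argmin(d2, axis=1)]      # label of the closest prototype
print(y_pred)                                # [0 1]
```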