CPU/Numpy Backend

CPU/Numpy backend for GMM training and inference

Reference

class ggmm.cpu.GMM(n_components, n_dimensions, covariance_type='diag', min_covar=0.001, verbose=False)

Gaussian Mixture Model

Representation of a Gaussian mixture model probability distribution. This class allows for easy evaluation of, sampling from, and maximum-likelihood estimation of the parameters of a GMM distribution.

Initializes parameters such that every mixture component has zero mean and identity covariance.

Parameters:

n_components : int, required

Number of mixture components.

n_dimensions : int, required

Number of data dimensions.

covariance_type : string, optional

String describing the type of covariance parameters to use. Must be one of ‘spherical’, ‘tied’, ‘diag’, ‘full’. Defaults to ‘diag’.

min_covar : float, optional

Floor on the diagonal of the covariance matrix to prevent overfitting. Defaults to 1e-3.
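
The default state described above can be pictured with a short numpy sketch. This is an illustration of the documented initialization, not ggmm's actual code; uniform mixture weights are an assumption here (the docstring only guarantees zero means and identity covariances):

```python
import numpy as np

# Sketch of the documented default state for a 'diag' model:
# zero means, identity (diagonal) covariances, and -- assumed here --
# uniform mixture weights.
n_components, n_dimensions = 3, 2
weights = np.full(n_components, 1.0 / n_components)  # assumed uniform
means = np.zeros((n_components, n_dimensions))       # zero means
covars = np.ones((n_components, n_dimensions))       # diagonal of identity, stored row-wise
```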

Methods

aic(X)

Akaike information criterion for the current model fit and the proposed data

Parameters:X : array of shape (n_samples, n_dimensions)
Returns:aic : float (the lower the better)
bic(X)

Bayesian information criterion for the current model fit and the proposed data

Parameters:X : array of shape (n_samples, n_dimensions)
Returns:bic : float (the lower the better)
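
Both criteria penalize the total log likelihood by the number of free parameters. A hedged sketch of how they are commonly computed for a 'diag' GMM follows; the free-parameter count below (means + diagonal variances + weights, which sum to 1) is a standard convention, not taken from ggmm's source:

```python
import numpy as np

def n_free_params(n_components, n_dimensions):
    # Assumed parameter count for a 'diag' GMM: K*D means, K*D diagonal
    # variances, and K - 1 free mixture weights (weights sum to 1).
    return 2 * n_components * n_dimensions + n_components - 1

def aic(total_log_likelihood, n_components, n_dimensions):
    # AIC = -2 log L + 2 p
    return -2 * total_log_likelihood + 2 * n_free_params(n_components, n_dimensions)

def bic(total_log_likelihood, n_samples, n_components, n_dimensions):
    # BIC = -2 log L + p ln(n)
    return (-2 * total_log_likelihood
            + n_free_params(n_components, n_dimensions) * np.log(n_samples))
```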
compute_posteriors(X)

Predict posterior probability of data under each Gaussian in the model.

Parameters:

X : array-like, shape (n_samples, n_dimensions)

Returns:

posteriors : array-like, shape (n_samples, n_components)

Returns the probability of the sample for each Gaussian (state) in the model.
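
For diagonal covariances, the posteriors follow from Bayes' rule: weight each component's Gaussian density by its mixture weight, then normalize per sample. A self-contained numpy sketch (an illustration of the math, not ggmm's implementation):

```python
import numpy as np

def diag_log_densities(X, means, covars):
    # Log density of each sample under each diagonal Gaussian,
    # shape (n_samples, n_components).
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * np.sum(np.log(2 * np.pi * covars) + diff**2 / covars, axis=2)

def posteriors(X, weights, means, covars):
    # Bayes rule: p(k | x) is proportional to w_k * N(x; mu_k, Sigma_k);
    # subtract the row max before exponentiating for numerical stability.
    log_joint = np.log(weights) + diag_log_densities(X, means, covars)
    log_joint -= log_joint.max(axis=1, keepdims=True)
    p = np.exp(log_joint)
    return p / p.sum(axis=1, keepdims=True)
```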

fit(X, thresh=0.01, n_iter=100, n_init=1, update_params='wmc', init_params='', random_state=None, verbose=None)

Estimate model parameters with the expectation-maximization algorithm.

An initialization step is performed before entering the EM algorithm. If you want to avoid this step, pass the empty string ‘’ as the init_params keyword argument. Likewise, if you would like to perform only the initialization, set n_iter=0.

Parameters:

X : array_like, shape (n_samples, n_dimensions)

List of ‘n_samples’ data points. Each row corresponds to a single data point.

thresh : float, optional

Convergence threshold.

n_iter : int, optional

Number of EM iterations to perform.

n_init : int, optional

Number of initializations to perform. The best result is kept.

update_params : string, optional

Controls which parameters are updated in the training process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars. Defaults to ‘wmc’.

init_params : string, optional

Controls which parameters are updated in the initialization process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars. Defaults to ‘wmc’.

random_state : numpy.random.RandomState, optional

verbose : bool, optional

Whether to print EM iteration information during training
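
One EM iteration for a diagonal-covariance GMM can be sketched in numpy as follows. This is an illustrative sketch of the algorithm fit runs, not ggmm's actual implementation; it updates all of ‘w’, ‘m’, and ‘c’ (cf. update_params) and applies the min_covar floor:

```python
import numpy as np

def em_step(X, weights, means, covars, min_covar=1e-3):
    # E-step: responsibilities p(k | x_i) under the current parameters.
    diff = X[:, None, :] - means[None, :, :]
    log_p = np.log(weights) - 0.5 * np.sum(
        np.log(2 * np.pi * covars) + diff**2 / covars, axis=2)
    log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
    resp = np.exp(log_p)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights ('w'), means ('m'), and covars ('c').
    nk = resp.sum(axis=0)                       # effective counts per component
    weights = nk / len(X)
    means = resp.T @ X / nk[:, None]
    covars = resp.T @ (X**2) / nk[:, None] - means**2
    covars = np.maximum(covars, min_covar)      # floor, cf. min_covar
    return weights, means, covars
```

Running this update to convergence (or n_iter times), from n_init different starting points, mirrors what fit does.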

get_covars()

Return current covariance matrices as numpy array

Returns:

covars : np.ndarray, shape (n_components, n_dimensions)

(for now only diagonal covariance matrices are supported)

get_means()

Return current means as numpy array

Returns:means : np.ndarray, shape (n_components, n_dimensions)
get_weights()

Return current weight vector as numpy array

Returns:weights : np.ndarray, shape (n_components,)
predict(X)

Predict label for data.

Parameters:X : array-like, shape (n_samples, n_dimensions)
Returns:C : array, shape (n_samples,) (component label for each sample)
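
Prediction amounts to a hard assignment: each sample gets the component with the highest weighted log density (the argmax of the posterior; the shared normalizer can be dropped). A sketch, not ggmm's actual code:

```python
import numpy as np

def predict(X, weights, means, covars):
    # Weighted log density of each sample under each diagonal Gaussian;
    # the argmax over components is the predicted label.
    diff = X[:, None, :] - means[None, :, :]
    log_joint = np.log(weights) - 0.5 * np.sum(
        np.log(2 * np.pi * covars) + diff**2 / covars, axis=2)
    return log_joint.argmax(axis=1)
```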
sample(n_samples=1, random_state=None)

Generate random samples from the model.

Parameters:

n_samples : int, optional

Number of samples to generate. Defaults to 1.

Returns:

X : array_like, shape (n_samples, n_dimensions)

List of samples
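
Sampling from a mixture is two-stage ancestral sampling: draw a component index from the mixture weights, then draw a point from that component's Gaussian. A diagonal-covariance sketch (illustrative, not ggmm's implementation):

```python
import numpy as np

def sample(weights, means, covars, n_samples=1, random_state=None):
    rng = np.random.default_rng(random_state)
    # Stage 1: pick a component for each sample according to the weights.
    ks = rng.choice(len(weights), size=n_samples, p=weights)
    # Stage 2: draw from the chosen component's diagonal Gaussian.
    noise = rng.standard_normal((n_samples, means.shape[1]))
    return means[ks] + noise * np.sqrt(covars[ks])
```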

score(X)

Compute the log probability under the model.

Parameters:

X : array_like, shape (n_samples, n_dimensions)

List of ‘n_samples’ data points. Each row corresponds to a single data point.

Returns:

logprob : array_like, shape (n_samples,)

Log probabilities of each data point in X
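
The per-sample log probability is log sum_k w_k N(x; mu_k, Sigma_k), typically computed with a log-sum-exp trick to avoid underflow. A diagonal-covariance sketch of the computation (not ggmm's actual code):

```python
import numpy as np

def score(X, weights, means, covars):
    # Weighted log densities, shape (n_samples, n_components).
    diff = X[:, None, :] - means[None, :, :]
    log_joint = np.log(weights) - 0.5 * np.sum(
        np.log(2 * np.pi * covars) + diff**2 / covars, axis=2)
    # log-sum-exp over components for numerical stability.
    m = log_joint.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True))).ravel()
```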

score_samples(X)

Return the per-sample log likelihood of the data under the model.

Compute the log probability of X under the model and return the posterior probability of each mixture component for each element of X.

Parameters:

X : numpy.ndarray, shape (n_samples, n_dimensions)

Array of n_samples data points. Each row corresponds to a single data point.

Returns:

logprob : array_like, shape (n_samples,)

Log probabilities of each data point in X.

posteriors : array_like, shape (n_samples, n_components)

Posterior probabilities of each mixture component for each sample

set_covars(covars)

Set covariance matrices with numpy array

Parameters:

covars : numpy.ndarray, shape (n_components, n_dimensions)

(for now only diagonal covariance matrices are supported)

set_means(means)

Set mean vectors with numpy array.

Parameters:means : numpy.ndarray, shape (n_components, n_dimensions)
set_weights(weights)

Set weight vector with numpy array.

Parameters:weights : numpy.ndarray, shape (n_components,)