GPU/CUDAMat Backend

GPU/CUDAMat backend for GMM training and inference

Example Usage

Training a GMM:

import ggmm.gpu as ggmm

# some_module is a placeholder: load X as a numpy array of shape (N, D)
X = some_module.load_training_data()

# N - training examples
# D - data dimension
# K - number of GMM components
N, D = X.shape
K = 128

ggmm.init()
gmm = ggmm.GMM(K, D)

thresh = 1e-3  # convergence threshold
n_iter = 20  # maximum number of EM iterations
init_params = 'wmc'  # initialize weights, means, and covariances

# train GMM
gmm.fit(X, thresh, n_iter, init_params=init_params)

# retrieve parameters from trained GMM
weights = gmm.get_weights()
means = gmm.get_means()
covars = gmm.get_covars()

# compute posteriors of data
posteriors = gmm.compute_posteriors(X)

Reference

ggmm.gpu.init(max_ones=262144)

Initialize GPU resources.

Parameters:

max_ones : int, optional

Allocate enough memory to support sums over up to ‘max_ones’ elements. Defaults to 262144.

ggmm.gpu.shutdown()

Free GPU resources.

class ggmm.gpu.GMM(n_components, n_dimensions, covariance_type='diag', min_covar=0.001, verbose=False)

Gaussian Mixture Model

Representation of a Gaussian mixture model probability distribution. This class allows for easy evaluation of, sampling from, and maximum-likelihood estimation of the parameters of a GMM distribution.

Initializes parameters such that every mixture component has zero mean and identity covariance.

Parameters:

n_components : int, required

Number of mixture components.

n_dimensions : int, required

Number of data dimensions.

covariance_type : string, optional

String describing the type of covariance parameters to use. For now, only ‘diag’ is supported. Defaults to ‘diag’.

min_covar : float, optional

Floor on the diagonal of the covariance matrix to prevent overfitting. Defaults to 1e-3.
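
The effect of such a floor can be illustrated on the CPU with NumPy. This is only a sketch of the general technique (clamping each diagonal variance from below); the helper name floor_diag_covars is hypothetical, and nothing here is confirmed to match how ggmm applies min_covar internally.

```python
import numpy as np


def floor_diag_covars(covars, min_covar=1e-3):
    """Clamp every diagonal variance so it is at least min_covar.

    Hypothetical helper for illustration; not part of ggmm.
    """
    covars = np.asarray(covars, dtype=float)
    return np.maximum(covars, min_covar)


# Two components in three dimensions; two variances have collapsed.
covars = np.array([[0.5, 1e-9, 2.0],
                   [1.0, 0.25, 0.0]])
floored = floor_diag_covars(covars)
```

Without a floor, a component that locks onto a single training point can drive a variance toward zero, which makes the likelihood of that point grow without bound.
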

Methods

compute_posteriors(X)

Predict posterior probability of data under each Gaussian in the model.

Parameters:

X : array-like, shape (n_samples, n_dimensions)

Returns:

posteriors : array-like, shape (n_samples, n_components)

Posterior probability of each Gaussian component for each sample.

fit(X, thresh=0.01, n_iter=100, n_init=1, update_params='wmc', init_params='', random_state=None, verbose=None)

Estimate model parameters with the expectation-maximization algorithm.

An initialization step is performed before entering the EM algorithm. To skip this step, set the keyword argument init_params to the empty string ‘’ when calling fit. Likewise, to perform only the initialization, set n_iter=0.

Parameters:

X : array_like, shape (n_samples, n_dimensions)

List of ‘n_samples’ data points. Each row corresponds to a single data point.

thresh : float, optional

Convergence threshold.

n_iter : int, optional

Maximum number of EM iterations to perform.

n_init : int, optional

Number of initializations to perform. The best result is kept.

update_params : string, optional

Controls which parameters are updated in the training process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars. Defaults to ‘wmc’.

init_params : string, optional

Controls which parameters are initialized before EM begins. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars. Defaults to ‘’ (no initialization), as shown in the signature.

random_state : numpy.random.RandomState, optional

verbose : bool, optional

Whether to print EM iteration information during training
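
To make the roles of thresh, n_iter, and update_params concrete, here is a minimal CPU sketch of EM for a diagonal-covariance GMM in NumPy. It is not ggmm's implementation: the function names are hypothetical, and the stopping rule (stop when the mean log-likelihood changes by less than thresh) is an assumption, since the reference above does not say what quantity thresh is compared against.

```python
import numpy as np


def log_weighted_gauss(X, weights, means, covars):
    """log(w_k) + log N(x_n | mean_k, diag(covar_k)), shape (n_samples, n_components)."""
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (X.shape[1] * np.log(2.0 * np.pi)
                        + np.log(covars).sum(axis=1)[None, :]
                        + (diff ** 2 / covars[None, :, :]).sum(axis=2))
    return log_gauss + np.log(weights)[None, :]


def fit_diag_gmm(X, weights, means, covars, thresh=1e-2, n_iter=100,
                 update_params='wmc', min_covar=1e-3):
    """Hypothetical CPU sketch of EM; returns parameters and log-likelihood history."""
    prev = None
    history = []
    for _ in range(n_iter):
        # E-step: posteriors and mean log-likelihood under current parameters.
        weighted = log_weighted_gauss(X, weights, means, covars)
        row_max = weighted.max(axis=1, keepdims=True)
        logprob = row_max[:, 0] + np.log(np.exp(weighted - row_max).sum(axis=1))
        post = np.exp(weighted - logprob[:, None])
        current = logprob.mean()
        history.append(current)
        # Assumed stopping rule: change in mean log-likelihood below thresh.
        if prev is not None and abs(current - prev) < thresh:
            break
        prev = current
        # M-step: update only the parameter groups named in update_params.
        counts = post.sum(axis=0) + 1e-10
        avg = post.T.dot(X) / counts[:, None]
        if 'w' in update_params:
            weights = counts / counts.sum()
        if 'm' in update_params:
            means = avg
        if 'c' in update_params:
            second_moment = post.T.dot(X ** 2) / counts[:, None]
            # E[(x - mean)^2], valid whether or not the means were updated.
            covars = second_moment - 2.0 * means * avg + means ** 2
            covars = np.maximum(covars, min_covar)
    return weights, means, covars, history


# Synthetic data: two well-separated clusters in two dimensions.
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(200, 2) - 3.0, rng.randn(200, 2) + 3.0])
K, D = 2, X.shape[1]
init_weights = np.full(K, 1.0 / K)
init_means = X[rng.choice(len(X), K, replace=False)]
init_covars = np.ones((K, D))
weights, means, covars, history = fit_diag_gmm(
    X, init_weights, init_means, init_covars, thresh=1e-3, n_iter=20)
```

Passing update_params='m' to this sketch would leave the weights and covariances at their initial values and re-estimate only the means, which mirrors the behaviour described for the update_params argument.
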

get_covars()

Return current covariances as numpy array

Returns:

covars : np.ndarray, shape (n_components, n_dimensions)

(for now only diagonal covariance matrices are supported)

get_means()

Return current means as numpy array

Returns:

means : np.ndarray, shape (n_components, n_dimensions)

get_weights()

Return current weight vector as numpy array

Returns:

weights : np.ndarray, shape (n_components,)

predict(X)

Predict label for data.

Parameters:

X : array-like, shape (n_samples, n_dimensions)

Returns:

C : array, shape (n_samples,)

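
The reference does not define "label" beyond its shape. A common convention for mixture models, assumed here, is that the label of a sample is the index of the component with the highest posterior. The sketch below shows that convention on a hypothetical posterior matrix; it does not call ggmm.

```python
import numpy as np

# Hypothetical posteriors for 4 samples and 3 components (each row sums to 1).
posteriors = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.1, 0.8],
                       [0.3, 0.4, 0.3],
                       [0.5, 0.25, 0.25]])

# Assumed convention: label = index of the most probable component.
labels = posteriors.argmax(axis=1)  # -> array([0, 2, 1, 0])
```
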
score(X)

Compute the log probability under the model.

Parameters:

X : array_like, shape (n_samples, n_dimensions)

List of ‘n_samples’ data points. Each row corresponds to a single data point.

Returns:

logprob_Nx1 : array_like, shape (n_samples,)

Log probabilities of each data point in X

score_samples(X, temp_gpu_mem=None)

Return the per-sample log-likelihood of the data under the model.

Compute the log probability of X under the model and return the posterior probability of each mixture component for each element of X.

Parameters:

X : numpy.ndarray, shape (n_samples, n_dimensions)

Array of n_samples data points. Each row corresponds to a single data point.

Returns:

logprob_Nx1 : array_like, shape (n_samples,)

Log probabilities of each data point in X.

posteriors : array_like, shape (n_samples, n_components)

Posterior probability of each mixture component for each sample
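
The two return values are related: the log probability of a sample is the log of the weighted sum of component densities, and each posterior is one component's share of that sum. The NumPy sketch below computes both for a diagonal-covariance GMM on the CPU. The function score_samples_diag is a hypothetical stand-in for illustration and is not ggmm's GPU code; it takes the parameters explicitly instead of reading them from a GMM object.

```python
import numpy as np


def score_samples_diag(X, weights, means, covars):
    """Return (logprob, posteriors) for a diagonal-covariance GMM.

    Hypothetical CPU reference, not part of ggmm.
    """
    X = np.asarray(X, dtype=float)
    n_dimensions = X.shape[1]
    # Per-component log Gaussian density, shape (n_samples, n_components).
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (
        n_dimensions * np.log(2.0 * np.pi)
        + np.sum(np.log(covars), axis=1)[None, :]
        + np.sum(diff ** 2 / covars[None, :, :], axis=2)
    )
    weighted = log_gauss + np.log(weights)[None, :]
    # Log-sum-exp over components, stabilised by subtracting the row maximum.
    row_max = weighted.max(axis=1, keepdims=True)
    logprob = row_max[:, 0] + np.log(np.exp(weighted - row_max).sum(axis=1))
    # Posterior = weighted component density divided by the mixture density.
    posteriors = np.exp(weighted - logprob[:, None])
    return logprob, posteriors


rng = np.random.RandomState(0)
X = rng.randn(5, 3)
weights = np.array([0.3, 0.7])
means = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
covars = np.ones((2, 3))
logprob, posteriors = score_samples_diag(X, weights, means, covars)
```

Working in the log domain and subtracting the row maximum before exponentiating avoids underflow when a sample lies far from every component.
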

set_covars(covars)

Set covariance matrices with numpy array

Parameters:

covars : numpy.ndarray, shape (n_components, n_dimensions)

(for now only diagonal covariance matrices are supported)

set_means(means)

Set mean vectors with numpy array.

Parameters:

means : numpy.ndarray, shape (n_components, n_dimensions)

set_weights(weights)

Set weight vector with numpy array.

Parameters:

weights : numpy.ndarray, shape (n_components,)
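
As a quick illustration of the array shapes the three setters expect, the sketch below builds a parameter set in NumPy. The sizes K and D are hypothetical, and uniform weights are an assumption made for the example; zero means and unit variances follow the constructor's documented default of zero mean and identity covariance.

```python
import numpy as np

K, D = 4, 3  # hypothetical number of components and data dimensions

weights = np.full(K, 1.0 / K)  # shape (n_components,), sums to 1
means = np.zeros((K, D))       # shape (n_components, n_dimensions)
covars = np.ones((K, D))       # diagonal of each covariance matrix

# With ggmm and a GPU available, these arrays would be passed to
# gmm.set_weights(weights), gmm.set_means(means), and gmm.set_covars(covars).
```
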