GPU/CUDAMat Backend¶
GPU/CUDAMat backend for GMM training and inference
Example Usage¶
Training a GMM:
import ggmm.gpu as ggmm
X = some_module.load_training_data()
# N - number of training examples
# D - data dimension
# K - number of GMM components
N, D = X.shape
K = 128
ggmm.init()
gmm = ggmm.GMM(K,D)
thresh = 1e-3 # convergence threshold
n_iter = 20 # maximum number of EM iterations
init_params = 'wmc' # initialize weights, means, and covariances
# train GMM
gmm.fit(X, thresh, n_iter, init_params=init_params)
# retrieve parameters from trained GMM
weights = gmm.get_weights()
means = gmm.get_means()
covars = gmm.get_covars()
# compute posteriors of data
posteriors = gmm.compute_posteriors(X)
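With the trained model in hand, held-out data can be scored and GPU resources released. A minimal sketch, assuming a hypothetical load_test_data() loader in the same spirit as load_training_data above:
import numpy as np
X_test = some_module.load_test_data()  # hypothetical loader, mirroring the example above
logprob = gmm.score(X_test)            # per-sample log-likelihoods, shape (n_samples,)
print('mean held-out log-likelihood: %f' % np.mean(logprob))
ggmm.shutdown()                        # free GPU resources when finished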
Reference¶
- ggmm.gpu.init(max_ones=262144)¶
Initialize GPU resources.
Parameters: max_ones : int, optional
Allocate enough GPU memory to support summations over vectors of up to max_ones elements.
- ggmm.gpu.shutdown()¶
Free GPU resources
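init() and shutdown() bracket all GPU work. A minimal sketch of the lifecycle, with an arbitrary max_ones raised for larger reductions:
import ggmm.gpu as ggmm

ggmm.init(max_ones=1024*1024)  # allow summations over up to ~1M elements (arbitrary choice)
try:
    pass  # ... create GMM objects, call fit/score here ...
finally:
    ggmm.shutdown()  # release GPU resources even if an error occurs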
- class ggmm.gpu.GMM(n_components, n_dimensions, covariance_type='diag', min_covar=0.001, verbose=False)¶
Gaussian Mixture Model
Representation of a Gaussian mixture model probability distribution. This class allows for easy evaluation of, sampling from, and maximum-likelihood estimation of the parameters of a GMM distribution.
Initializes parameters such that every mixture component has zero mean and identity covariance.
Parameters: n_components : int, required
Number of mixture components.
n_dimensions : int, required
Number of data dimensions.
covariance_type : string, optional
String describing the type of covariance parameters to use. For now, only ‘diag’ is supported. Defaults to ‘diag’.
min_covar : float, optional
Floor on the diagonal of the covariance matrix to prevent overfitting. Defaults to 1e-3.
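For example, a model with 64 components over 39-dimensional features (sizes chosen only for illustration) would be constructed as below; until fit() or the set_* methods are called, every component has zero mean and identity covariance:
import ggmm.gpu as ggmm

ggmm.init()
gmm = ggmm.GMM(64, 39, covariance_type='diag', min_covar=1e-3, verbose=True)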
Methods
- compute_posteriors(X)¶
Predict posterior probability of data under each Gaussian in the model.
Parameters: X : array-like, shape (n_samples, n_dimensions)
Returns: posteriors : array-like, shape (n_samples, n_components)
Posterior probability of each mixture component (state) for each sample; each row sums to one.
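Because each row of posteriors sums to one, a hard clustering falls out of a per-row argmax. A small sketch, with X loaded as in the usage example:
import numpy as np

posteriors = gmm.compute_posteriors(X)       # shape (n_samples, n_components)
hard_labels = np.argmax(posteriors, axis=1)  # most probable component per sample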
- fit(X, thresh=0.01, n_iter=100, n_init=1, update_params='wmc', init_params='', random_state=None, verbose=None)¶
Estimate model parameters with the expectation-maximization algorithm.
An initialization step is performed before entering the EM algorithm. If you want to avoid this step, set the keyword argument init_params to the empty string ‘’ when calling fit (this is the default). Likewise, if you would like just to do an initialization, set n_iter=0.
Parameters: X : array_like, shape (n_samples, n_dimensions)
List of ‘n_samples’ data points. Each row corresponds to a single data point.
thresh : float, optional
Convergence threshold.
n_iter : int, optional
Number of EM iterations to perform.
n_init : int, optional
Number of initializations to perform. The best result is kept.
update_params : string, optional
Controls which parameters are updated in the training process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars. Defaults to ‘wmc’.
init_params : string, optional
Controls which parameters are initialized in the initialization process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars. Defaults to ‘’ (the empty string), i.e. no initialization.
random_state : numpy.random.RandomState, optional
Random number generator used for parameter initialization.
verbose : bool, optional
Whether to print EM iteration information during training.
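As an illustration of update_params, the sketch below continues EM from the current parameters while holding the mixture weights fixed (X as in the usage example):
# refine means and covariances only; weights stay frozen, no re-initialization
gmm.fit(X, thresh=1e-4, n_iter=10, update_params='mc', init_params='')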
- get_covars()¶
Return current covariances as numpy array
Returns: covars : np.ndarray, shape (n_components, n_dimensions)
(for now only diagonal covariance matrices are supported)
- get_means()¶
Return current means as numpy array
Returns: means : np.ndarray, shape (n_components, n_dimensions)
- get_weights()¶
Return current weight vector as numpy array
Returns: weights : np.ndarray, shape (n_components,)
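The getters return host-side NumPy arrays, so standard checks apply; e.g. a quick sanity check of a trained model:
import numpy as np

weights = gmm.get_weights()              # shape (n_components,)
covars = gmm.get_covars()                # diagonal entries, shape (n_components, n_dimensions)
assert np.isclose(np.sum(weights), 1.0)  # mixture weights form a distribution
assert np.all(covars > 0)                # the covariance floor keeps diagonals positive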
- predict(X)¶
Predict the most likely mixture component for each data point.
Parameters: X : array-like, shape (n_samples, n_dimensions)
Returns: C : array, shape (n_samples,)
Index of the most likely mixture component for each sample.
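A short sketch, reusing X from the usage example, that also counts how many samples land in each component:
import numpy as np

labels = gmm.predict(X)                              # hard component labels
counts = np.bincount(np.asarray(labels, dtype=int))  # occupancy of each component
print(counts)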
- score(X)¶
Compute the log probability under the model.
Parameters: X : array_like, shape (n_samples, n_dimensions)
List of ‘n_samples’ data points. Each row corresponds to a single data point.
Returns: logprob_Nx1 : array_like, shape (n_samples,)
Log probabilities of each data point in X
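Per-sample log-probabilities make simple model selection straightforward. A sketch comparing two component counts on a held-out split (X_train and X_val are assumed to come from your own data pipeline, after ggmm.init()):
import numpy as np

for K in (64, 128):  # candidate component counts (arbitrary)
    gmm = ggmm.GMM(K, X_train.shape[1])
    gmm.fit(X_train, n_iter=20, init_params='wmc')
    print('K=%d: mean val log-likelihood %.3f' % (K, np.mean(gmm.score(X_val))))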
- score_samples(X, temp_gpu_mem=None)¶
Return the per-sample likelihood of the data under the model.
Compute the log probability of X under the model and return the posterior probability of each mixture component for each element of X.
Parameters: X : numpy.ndarray, shape (n_samples, n_dimensions)
Array of n_samples data points. Each row corresponds to a single data point.
Returns: logprob_Nx1 : array_like, shape (n_samples,)
Log probabilities of each data point in X.
posteriors : array_like, shape (n_samples, n_components)
Posterior probability of each mixture component for each sample
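score_samples computes both quantities in a single pass, avoiding separate calls to score and compute_posteriors. A minimal sketch, with X as in the usage example:
logprob, posteriors = gmm.score_samples(X)
# logprob:    shape (n_samples,), log-probability of each point under the mixture
# posteriors: shape (n_samples, n_components), responsibility of each component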
- set_covars(covars)¶
Set covariance matrices with numpy array
Parameters: covars : numpy.ndarray, shape (n_components, n_dimensions)
(for now only diagonal covariance matrices are supported)
- set_means(means)¶
Set mean vectors with numpy array.
Parameters: means : numpy.ndarray, shape (n_components, n_dimensions)
- set_weights(weights)¶
Set weight vector with numpy array.
Parameters: weights : numpy.ndarray, shape (n_components,)
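Together, the setters allow warm-starting EM from externally computed parameters. A sketch, assuming hypothetical .npy files whose shapes match the documentation above and X loaded as in the usage example:
import numpy as np
import ggmm.gpu as ggmm

K, D = 128, 64                           # must match the saved arrays (illustrative sizes)
ggmm.init()
gmm = ggmm.GMM(K, D)
gmm.set_weights(np.load('weights.npy'))  # shape (K,), hypothetical file
gmm.set_means(np.load('means.npy'))      # shape (K, D)
gmm.set_covars(np.load('covars.npy'))    # shape (K, D), diagonal entries
gmm.fit(X, n_iter=5, init_params='')     # continue EM without re-initializing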