Initialization

openTSNE.initialization.jitter(x, inplace=False, scale=0.01, random_state=None)

Add jitter with small standard deviation to avoid numerical problems when the points overlap exactly.

Parameters:

x (np.ndarray)
inplace (bool)
scale (float)
random_state (int or np.random.RandomState)

Returns:

A jittered version of x.

Return type:

np.ndarray
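
To make the idea concrete, here is a minimal NumPy-only sketch of jittering. The helper `jitter_sketch` is hypothetical and is not the library's implementation. In particular, it treats `scale` as an absolute standard deviation, which is an assumption, since this page does not say how `scale` relates to the spread of the data.

```python
import numpy as np


def jitter_sketch(x, inplace=False, scale=0.01, random_state=None):
    """Hypothetical stand-in: add zero-mean Gaussian noise with std `scale`."""
    # Accept either a seed (int or None) or an existing RandomState instance.
    if isinstance(random_state, np.random.RandomState):
        rng = random_state
    else:
        rng = np.random.RandomState(random_state)
    if not inplace:
        # Work on a float copy so the caller's array is left untouched.
        x = np.array(x, dtype=float, copy=True)
    x += rng.normal(0.0, scale, size=x.shape)
    return x


# Five points that overlap exactly at the origin.
points = np.zeros((5, 2))
jittered = jitter_sketch(points, random_state=0)
```

After the call, the rows of `jittered` no longer coincide, while `points` is unchanged because `inplace` defaults to False.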

openTSNE.initialization.pca(X, n_components=2, svd_solver='auto', random_state=None, verbose=False, add_jitter=True)

Initialize an embedding using the top principal components.

Parameters:

X (np.ndarray) – The data matrix.
n_components (int) – The dimension of the embedding space.
svd_solver (str) – See sklearn.decomposition.PCA documentation.
random_state (Union[int, RandomState]) – If the value is an int, random_state is the seed used by the random number generator. If the value is a RandomState instance, then it will be used as the random number generator. If the value is None, the random number generator is the RandomState instance used by np.random.
verbose (bool)
add_jitter (bool) – If True, jitter with small standard deviation is added to the initialization to prevent points overlapping exactly, which may lead to numerical issues during optimization.

Returns:

The initial embedding, with one row per sample and n_components columns.

Return type:

np.ndarray
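
As an illustration of what "top principal components" means, the following sketch projects centered data onto its leading principal axes with `np.linalg.svd`. The helper `pca_init_sketch` is hypothetical. The documented function refers to sklearn.decomposition.PCA for its solver, and any jitter or rescaling the library applies afterwards is not reproduced here.

```python
import numpy as np


def pca_init_sketch(X, n_components=2):
    """Hypothetical stand-in: project centered data onto its top principal axes."""
    X_centered = X - X.mean(axis=0)
    # Thin SVD; singular values come back sorted in descending order.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # U * S equals X_centered @ Vt.T, i.e. the principal component scores.
    return U[:, :n_components] * S[:n_components]


rng = np.random.RandomState(0)
# Toy data: 100 samples, 10 features with decreasing spread per feature.
X = rng.normal(size=(100, 10)) * np.arange(10, 0, -1)
init = pca_init_sketch(X)
```

The result has one row per sample, and its columns are ordered by decreasing variance.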

openTSNE.initialization.random(n_samples, n_components=2, random_state=None, verbose=False)

Initialize an embedding using samples from an isotropic Gaussian.

Parameters:

n_samples (Union[int, np.ndarray]) – The number of samples. Also accepts a data matrix.
n_components (int) – The dimension of the embedding space.
random_state (Union[int, RandomState]) – If the value is an int, random_state is the seed used by the random number generator. If the value is a RandomState instance, then it will be used as the random number generator. If the value is None, the random number generator is the RandomState instance used by np.random.
verbose (bool)

Returns:

The initial embedding, with one row per sample and n_components columns.

Return type:

np.ndarray
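
The sketch below shows one way such an initialization could look, including the convenience of passing a data matrix in place of a sample count. The helper `random_init_sketch` and its `std` parameter are hypothetical. This page does not state the standard deviation of the Gaussian, so the value 1e-4 is an assumption.

```python
import numpy as np


def random_init_sketch(n_samples, n_components=2, std=1e-4, random_state=None):
    """Hypothetical stand-in: draw an embedding from an isotropic Gaussian."""
    # If a data matrix is passed, use its number of rows as the sample count.
    if hasattr(n_samples, "shape"):
        n_samples = n_samples.shape[0]
    if isinstance(random_state, np.random.RandomState):
        rng = random_state
    else:
        rng = np.random.RandomState(random_state)
    # `std` is an assumed value, not taken from the documentation above.
    return rng.normal(0.0, std, size=(n_samples, n_components))


from_count = random_init_sketch(50, random_state=42)
from_matrix = random_init_sketch(np.zeros((50, 7)), random_state=42)
```

With the same integer seed, both calls produce the same 50-by-2 array, which is what makes a seeded initialization reproducible.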

openTSNE.initialization.rescale(x, inplace=False, target_std=0.0001)

Rescale an embedding so optimization will not have convergence issues.

Parameters:

x (np.ndarray)
inplace (bool)
target_std (float)

Returns:

A scaled-down version of x.

Return type:

np.ndarray
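
A minimal sketch of such a rescaling follows. The helper `rescale_sketch` is hypothetical. It normalizes by the standard deviation of the first embedding column, which is an assumption, because this page does not say which standard deviation `target_std` refers to.

```python
import numpy as np


def rescale_sketch(x, inplace=False, target_std=1e-4):
    """Hypothetical stand-in: shrink x so its first column has std `target_std`."""
    if not inplace:
        # Work on a float copy so the caller's array is left untouched.
        x = np.array(x, dtype=float, copy=True)
    # Assumption: the first column's spread sets the scale for every column.
    x *= target_std / np.std(x[:, 0])
    return x


rng = np.random.RandomState(0)
embedding = rng.normal(0.0, 5.0, size=(200, 2))
scaled = rescale_sketch(embedding)
```

Because every entry is multiplied by the same factor, the relative layout of the points is preserved and only the overall scale changes.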

openTSNE.initialization.spectral(A, n_components=2, tol=0.0001, max_iter=None, random_state=None, verbose=False, add_jitter=True)

Initialize an embedding using the spectral embedding of the KNN graph.

Specifically, we initialize data points by computing the diffusion map on the random walk transition matrix of the weighted graph given by the affinity matrix.

Parameters:

A (Union[sp.csr_matrix, sp.csc_matrix, ...]) – The graph adjacency matrix.
n_components (int) – The dimension of the embedding space.
tol (float) – See scipy.sparse.linalg.eigsh documentation.
max_iter (int) – See scipy.sparse.linalg.eigsh documentation.
random_state (Any) – If the value is an int, random_state is the seed used by the random number generator. If the value is a RandomState instance, then it will be used as the random number generator. If the value is None, the random number generator is the RandomState instance used by np.random.
add_jitter (bool) – If True, jitter with small standard deviation is added to the initialization to prevent points overlapping exactly, which may lead to numerical issues during optimization.
verbose (bool)

Returns:

The initial embedding, with one row per sample and n_components columns.

Return type:

np.ndarray
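
The diffusion-map construction described for `spectral` can be sketched on a small dense graph as follows. The helper `spectral_init_sketch` is hypothetical and is not the library's implementation. It uses a dense `np.linalg.eigh` instead of a sparse iterative solver, assumes a symmetric affinity matrix with strictly positive row sums, and scales the eigenvectors by their eigenvalues (a one-step diffusion map), which is also an assumption.

```python
import numpy as np


def spectral_init_sketch(A, n_components=2):
    """Hypothetical stand-in: diffusion-map coordinates of a dense affinity matrix."""
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    # Symmetric normalization S = D^{-1/2} A D^{-1/2}; it shares eigenvalues
    # with the random-walk transition matrix P = D^{-1} A.
    S = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    # Sort descending and skip the trivial top eigenvector (eigenvalue 1).
    order = np.argsort(eigvals)[::-1][1:n_components + 1]
    # Right eigenvectors of P are D^{-1/2} times the eigenvectors of S.
    psi = eigvecs[:, order] * d_inv_sqrt[:, None]
    return psi * eigvals[order]


rng = np.random.RandomState(0)
# Two Gaussian blobs, connected through a Gaussian-kernel affinity matrix.
points = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(3, 0.5, (20, 2))])
sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
A = np.exp(-sq_dists / 2)
embedding = spectral_init_sketch(A)
```

For large sparse KNN graphs a dense eigendecomposition is impractical. The `tol` and `max_iter` parameters above point to scipy.sparse.linalg.eigsh for that reason.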