openTSNE: Extensible, parallel implementations of t-SNE

openTSNE is a modular Python implementation of t-Distributed Stochastic Neighbor Embedding (t-SNE) [1], a popular dimensionality-reduction algorithm for visualizing high-dimensional data sets. openTSNE incorporates the latest improvements to the t-SNE algorithm, including the ability to add new data points to existing embeddings [2], massive speed improvements that enable t-SNE to scale to millions of data points [3] [4], and various tricks to improve the global alignment of the resulting visualizations [5].

[Figure: Macosko 2015 mouse retina t-SNE embedding]

A visualization of 44,808 single-cell transcriptomes obtained from the mouse retina [6], embedded using the multiscale kernel trick to better preserve the global alignment of the clusters.
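To give a rough idea of what a multiscale kernel does, the sketch below computes t-SNE input affinities for a single point at two perplexities and averages them. It is a simplified, standard-library-only illustration and not openTSNE's actual implementation. The function names (`conditional_affinities`, `multiscale_affinities`), the toy distances, and the choice of averaging are all assumptions made for this example.

```python
import math


def conditional_affinities(sq_dists, perplexity, tol=1e-5, max_iter=200):
    """Return p_{j|i} for one point, given squared distances to its neighbors.

    Bisects on the Gaussian precision beta = 1 / (2 * sigma^2) until the
    entropy of the distribution matches log(perplexity).
    """
    target_entropy = math.log(perplexity)
    beta, lo, hi = 1.0, 0.0, math.inf
    d_min = min(sq_dists)  # shift distances for numerical stability
    probs = []
    for _ in range(max_iter):
        weights = [math.exp(-beta * (d - d_min)) for d in sq_dists]
        total = sum(weights)
        probs = [w / total for w in weights]
        entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
        if abs(entropy - target_entropy) < tol:
            break
        if entropy > target_entropy:
            # Distribution is too flat: narrow the kernel (increase beta).
            lo = beta
            beta = beta * 2.0 if hi == math.inf else (beta + hi) / 2.0
        else:
            # Distribution is too peaked: widen the kernel (decrease beta).
            hi = beta
            beta = (beta + lo) / 2.0
    return probs


def multiscale_affinities(sq_dists, perplexities):
    """Average the affinities obtained at each perplexity (assumed scheme)."""
    rows = [conditional_affinities(sq_dists, p) for p in perplexities]
    return [sum(column) / len(rows) for column in zip(*rows)]


# Toy data (hypothetical): squared distances from one point to 8 neighbors,
# four of them nearby and four far away.
sq_dists = [0.1, 0.2, 0.3, 0.5, 4.0, 5.0, 6.0, 9.0]

narrow = conditional_affinities(sq_dists, perplexity=2)
multi = multiscale_affinities(sq_dists, perplexities=[2, 6])

print("perplexity 2 only:", [round(p, 4) for p in narrow])
print("multiscale (2, 6):", [round(p, 4) for p in multi])
```

A small perplexity concentrates nearly all of the probability mass on the closest neighbors, while mixing in a larger perplexity leaves some mass on distant points. That retained long-range affinity is what helps keep clusters arranged sensibly relative to one another.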

References

[1] Van der Maaten, Laurens, and Geoffrey Hinton. “Visualizing data using t-SNE”, Journal of Machine Learning Research (2008).
[2] Poličar, Pavlin G., Martin Stražar, and Blaž Zupan. “Embedding to Reference t-SNE Space Addresses Batch Effects in Single-Cell Classification”, Machine Learning (2021).
[3] Van der Maaten, Laurens. “Accelerating t-SNE using tree-based algorithms”, Journal of Machine Learning Research (2014).
[4] Linderman, George C., et al. “Fast interpolation-based t-SNE for improved visualization of single-cell RNA-seq data”, Nature Methods (2019).
[5] Kobak, Dmitry, and Philipp Berens. “The art of using t-SNE for single-cell transcriptomics”, Nature Communications (2019).
[6] Macosko, Evan Z., et al. “Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets”, Cell (2015).