Visualization of High-Dimensional Data

Google is experimenting with the visualization of high-dimensional data. The experiment helps show what is happening in machine learning by letting coders see and explore their high-dimensional data. The goal is to eventually make this an open-source tool within TensorFlow, so that any coder can use these visualization techniques to explore their data.

Built by Daniel Smilkov, Fernanda Viégas, Martin Wattenberg, and the Big Picture team at Google:

https://aiexperiments.withgoogle.com/visualizing-high-dimensional-space

This work is based on a method developed by Laurens van der Maaten & Geoffrey Hinton in 2008:
van der Maaten, L. & Hinton, G. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov), 2579-2605, https://www.jmlr.org/papers/v9/vandermaaten08a.html

t-Distributed Stochastic Neighbor Embedding (t-SNE, usually pronounced “tee-snee”) is a (prize-winning) nonlinear technique for dimensionality reduction that is particularly well suited to visualizing high-dimensional data sets by embedding them into two or three dimensions (R2 or R3). The technique can be implemented via Barnes-Hut approximations, which allows it to be applied to large real-world datasets (“big data”).
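To make the idea concrete, here is a minimal sketch in plain Python of the quantity t-SNE tries to minimize: the Kullback-Leibler divergence between pairwise similarities in the original space (Gaussian kernel) and in the embedding (Student-t kernel), as defined in the 2008 paper. It is an illustration only, not the reference implementation. The function names and the toy data are made up for this example, a single fixed sigma is assumed for every point (the actual method tunes one sigma per point from a perplexity setting), and there is no gradient descent and no Barnes-Hut approximation.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def high_dim_affinities(points, sigma=1.0):
    """Symmetrized Gaussian similarities p_ij in the original space.

    Assumption: one fixed sigma for all points. Real t-SNE searches for
    a separate sigma_i per point so that each row matches a perplexity.
    """
    n = len(points)
    cond = []
    for i in range(n):
        weights = [
            0.0 if j == i
            else math.exp(-sq_dist(points[i], points[j]) / (2.0 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(weights)
        cond.append([w / total for w in weights])  # conditional p_{j|i}
    # p_ij = (p_{j|i} + p_{i|j}) / (2n), so the whole matrix sums to 1
    return [[(cond[i][j] + cond[j][i]) / (2.0 * n) for j in range(n)]
            for i in range(n)]


def low_dim_affinities(embedding):
    """Student-t (one degree of freedom) similarities q_ij in the map."""
    n = len(embedding)
    kernel = [
        [0.0 if i == j else 1.0 / (1.0 + sq_dist(embedding[i], embedding[j]))
         for j in range(n)]
        for i in range(n)
    ]
    total = sum(sum(row) for row in kernel)
    return [[k / total for k in row] for row in kernel]


def kl_divergence(p, q):
    """KL(P || Q) summed over all pairs with non-zero p_ij."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0.0
    )


# Hypothetical toy data: two tight pairs of points in 4-D.
data = [(0, 0, 0, 0), (0.5, 0, 0, 0), (3, 3, 3, 3), (3.5, 3, 3, 3)]

# Two candidate 2-D maps: one keeps each pair together, one splits them.
good_map = [(0, 0), (0.1, 0), (5, 5), (5.1, 5)]
bad_map = [(0, 0), (5, 5), (0.1, 0), (5.1, 5)]

P = high_dim_affinities(data)
kl_good = kl_divergence(P, low_dim_affinities(good_map))
kl_bad = kl_divergence(P, low_dim_affinities(bad_map))

print("cost of neighbor-preserving map:", kl_good)
print("cost of neighbor-breaking map:  ", kl_bad)
```

The map that keeps neighbors together gets the lower cost. A full t-SNE implementation starts from a random map and moves the points by gradient descent to reduce this cost.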

For details please refer directly to:

https://lvdmaaten.github.io/tsne/

Compare this method to our own work on subspace clustering:

https://rd.springer.com/article/10.1007/s40708-016-0043-5
