Explainable structuring and discovery of relevant cases for exploration of high-dimensional data

J Falip, F Blanchard, M Herbin

March 2019

Structured elements.

Abstract

Data described by numerous features pose a challenge for domain experts, as such data are difficult to manipulate, explore and visualize. As the number of features increases, a phenomenon called the “curse of dimensionality” arises: sparsity increases and distance metrics become less relevant as most elements of the dataset become equidistant. The result is a loss of efficiency for traditional machine learning algorithms. Moreover, many state-of-the-art approaches act as black boxes from a user's point of view and are unable to provide explanations for their results. We propose an instance-based method to structure datasets around important elements called exemplars. The similarity measure used by our approach is less sensitive to high-dimensional spaces and provides results that are both explainable and interpretable, two important properties for decision-making tools such as recommender systems. The described algorithm relies on exemplar theory to provide a data exploration tool suited to the reasoning used by experts in various fields. We apply our method to synthetic as well as real-world datasets and compare the results to recommendations made using a nearest neighbor approach.
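
The distance concentration effect mentioned in the abstract can be illustrated with a small experiment. The sketch below is not taken from the paper and does not implement the proposed exemplar method. It only measures, under the assumption of points drawn uniformly from a unit hypercube, how the relative gap between the farthest and nearest neighbor of a query point shrinks as the number of dimensions grows. The function name `relative_contrast` and all parameter values are illustrative choices.

```python
import math
import random


def relative_contrast(n_points: int, n_dims: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for Euclidean distances between a
    random query point and n_points random points in the unit hypercube.

    Assumption: uniform synthetic data, used only to illustrate the effect.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))
    nearest, farthest = min(distances), max(distances)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # A smaller value means the nearest and farthest points are harder to
    # tell apart, so distance-based ranking carries less information.
    for dims in (2, 10, 100, 1000):
        print(f"{dims:>5} dimensions: contrast = {relative_contrast(500, dims):.3f}")
```

On such data the contrast is typically large in two dimensions and drops well below one in a thousand dimensions. This is the sense in which elements "become equidistant" and nearest neighbor comparisons lose discriminative power.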

Type

Conference paper

Publication

Exploratory Search and Interactive Data Analytics (ESIDA) - Intelligent User Interfaces (IUI)