Clusterfun for Error Analysis

Clusterfun's confusion matrix lets you visualize the performance of your model. It shows how your predictions are distributed across classes and how they compare to the ground truth.

To create a confusion matrix, call the confusion_matrix function with a dataframe that contains the following:

  • ground truth labels
  • predicted labels
  • media to display
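
To make the expected input concrete, here is a minimal sketch of those three fields as plain Python records, together with the per-cell counts that a confusion matrix is built from. The column names label, pred, and img_path, the file paths, and the helper confusion_counts are hypothetical placeholders for illustration, not names that Clusterfun requires; in practice the rows would live in a dataframe.

```python
from collections import Counter

# Hypothetical rows: one per sample, holding the three fields listed above.
# Column names and file paths are illustrative assumptions.
rows = [
    {"label": "cat", "pred": "cat", "img_path": "images/0001.png"},
    {"label": "cat", "pred": "dog", "img_path": "images/0002.png"},
    {"label": "dog", "pred": "dog", "img_path": "images/0003.png"},
    {"label": "dog", "pred": "dog", "img_path": "images/0004.png"},
    {"label": "ship", "pred": "truck", "img_path": "images/0005.png"},
]


def confusion_counts(rows, y_true="label", y_pred="pred"):
    """Count how many rows fall into each (ground truth, prediction) cell."""
    return Counter((row[y_true], row[y_pred]) for row in rows)


counts = confusion_counts(rows)
for (truth, predicted), n in sorted(counts.items()):
    print(f"true={truth:<5} predicted={predicted:<6} count={n}")
```

Cells where the two labels match are correct predictions; every other cell is a specific kind of mistake.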

As an example, below we create a confusion matrix for the CIFAR10 dataset.

Creating this confusion matrix will give you a visual representation of the distribution of your predictions. You can see the result in the plot below:

What do we see? We see clusters of correctly predicted data points and clusters of incorrectly predicted ones. By hovering over a cluster or selecting it (by clicking and dragging), we can directly inspect the underlying media (images or audio) for those data points. This lets us quickly identify patterns in the data that cause the model to make mistakes.
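
The same inspection can be approximated outside the interactive plot. The sketch below is an illustration only, not how Clusterfun works internally: it groups media paths by (ground truth, prediction) cell and lists the largest error cells first. The column names, file paths, and the helpers media_by_cell and largest_error_cells are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows; column names and file paths are illustrative assumptions.
rows = [
    {"label": "cat", "pred": "dog", "img_path": "images/0002.png"},
    {"label": "cat", "pred": "dog", "img_path": "images/0007.png"},
    {"label": "dog", "pred": "dog", "img_path": "images/0003.png"},
    {"label": "ship", "pred": "truck", "img_path": "images/0005.png"},
]


def media_by_cell(rows, y_true="label", y_pred="pred", media="img_path"):
    """Group media paths by their (ground truth, prediction) cell."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[y_true], row[y_pred])].append(row[media])
    return dict(cells)


def largest_error_cells(cells, top=3):
    """Return the off-diagonal cells with the most samples, largest first."""
    errors = [(cell, paths) for cell, paths in cells.items() if cell[0] != cell[1]]
    return sorted(errors, key=lambda item: len(item[1]), reverse=True)[:top]


cells = media_by_cell(rows)
for (truth, predicted), paths in largest_error_cells(cells):
    print(f"{truth} predicted as {predicted}: {len(paths)} sample(s)")
    for path in paths:
        print(f"  {path}")
```

Reviewing the media in the largest error cells first is one way to spot a recurring pattern, such as a pair of classes the model regularly confuses.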