Explore data with one line of code

[Interactive example plot; datasets: wiki-art, mnist, cifar10]
Clusterfun is an open-source Python plotting library for exploring images in your plots.
After installing clusterfun with pip install clusterfun, you can create the above plot locally, without any additional setup:
import pandas as pd
import clusterfun as clt

df = pd.read_csv("https://raw.githubusercontent.com/gietema/clusterfun-data/main/mnist.csv")
clt.scatter(df, x="x", y="y", media="img_path", color="label")

Why clusterfun?

  • Lightweight
    Unlike large data analysis platforms such as Scale or Encord, clusterfun lets you start reviewing data within a minute. All you need is a pandas DataFrame with one image location per row. Images can be stored locally or on AWS S3. No sign-up, integration, or data transfer is needed.
  • Flexible
    Clusterfun is a plotting library, so you decide what the x- and y-axes represent and are free to plot whatever you want. A plot also shows how close data points are to one another, which is harder to see in an image grid visualisation.
  • Interactive
    Unlike static plots, clusterfun lets you hover over, select, and zoom in on slices of data.
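
To make the "one image location per row" requirement concrete, here is a minimal sketch of preparing your own dataframe. It is an illustration under stated assumptions, not part of clusterfun: the helper names build_plot_frame and show, the placeholder file names, and the random feature vectors are all hypothetical. The x and y columns are the first two principal components of the features, but any two numeric columns would do. The only clusterfun call is the clt.scatter call from the MNIST example above.

```python
from pathlib import Path

import numpy as np
import pandas as pd


def build_plot_frame(image_paths, features, labels):
    """Return one row per image, with 2D coordinates derived from `features`.

    Hypothetical helper: x and y are the first two principal components,
    chosen only to illustrate that the axes are up to you.
    """
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # PCA via SVD: project the centered features onto the top two
    # right-singular vectors.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ vt[:2].T
    return pd.DataFrame(
        {
            "img_path": [str(p) for p in image_paths],
            "x": coords[:, 0],
            "y": coords[:, 1],
            "label": list(labels),
        }
    )


# Placeholder inputs (assumptions): made-up file names and random features.
rng = np.random.default_rng(0)
paths = [Path("images") / f"sample_{i}.jpg" for i in range(6)]
df = build_plot_frame(paths, rng.normal(size=(6, 8)), labels=[0, 1, 0, 1, 0, 1])


def show(frame):
    """Open the plot; clusterfun is imported lazily so it is only needed here."""
    import clusterfun as clt

    # Same call shape as the MNIST example: column names are passed as strings.
    return clt.scatter(frame, x="x", y="y", media="img_path", color="label")
```

Calling show(df) would open the plot, provided the paths in img_path point to real images. Swapping the PCA step for any other pair of numeric columns changes what the axes mean without changing the call.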