Explore data with one line of code

A simple interactive plotting library

Reproduce with:
import pandas as pd
import clusterfun as clt

df = pd.read_csv("https://raw.githubusercontent.com/gietema/clusterfun-data/main/mnist.csv")
clt.scatter(df, x="x", y="y", media="img_path", color="label")
That's it!

Why clusterfun?

  • Lightweight
    Unlike large data analysis platforms, clusterfun lets you review data within a minute. All you need is a pandas DataFrame with an image or audio location per row. Data can be hosted locally or on AWS S3. No sign-up, integration, or data transfer is needed.
  • Flexible
    Clusterfun is a plotting library, so you decide what the x- and y-axes represent and are free to plot whatever you want. A plot also shows how close one data point is to another, which is harder to see in an image grid visualisation.
  • Interactive
    Unlike static plots, clusterfun lets you hover over, select, and zoom in on slices of your data.
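
To make the "one media location per row" idea concrete, here is a minimal sketch of preparing your own DataFrame instead of downloading the MNIST CSV. The file paths, column names (`width`, `height`, `aspect_ratio`), and values are hypothetical placeholders, and the helper `plot_by_size` is only an illustration. The only clusterfun call used is `clt.scatter` with the `x`, `y`, `media`, and `color` arguments shown above; that it accepts arbitrary column names for the axes is an assumption based on the description of the library as a free-form plotting tool.

```python
import pandas as pd

# Hypothetical records: one row per image, with the image location and a label.
# Paths and values are placeholders, not real data.
records = [
    {"img_path": "images/cat_01.png", "width": 640, "height": 480, "label": "cat"},
    {"img_path": "images/cat_02.png", "width": 800, "height": 800, "label": "cat"},
    {"img_path": "images/dog_01.png", "width": 1024, "height": 512, "label": "dog"},
]

df = pd.DataFrame(records)

# You choose what the axes mean. Here a derived column is added so the
# plot can show image width against aspect ratio.
df["aspect_ratio"] = df["width"] / df["height"]


def plot_by_size(frame: pd.DataFrame) -> None:
    """Hypothetical helper: scatter images by width and aspect ratio."""
    # Imported here so the DataFrame preparation above runs without clusterfun.
    import clusterfun as clt

    # Same call shape as the quick-start example, with different columns.
    clt.scatter(frame, x="width", y="aspect_ratio", media="img_path", color="label")
```

Calling `plot_by_size(df)` should then open the interactive plot, provided the paths in `img_path` point at files that actually exist on your machine.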