pip install clusterfun
import pandas as pd
import clusterfun as clt
df = pd.read_csv("https://raw.githubusercontent.com/gietema/clusterfun-data/main/wiki-art.csv")
clt.scatter(df, x="x", y="y", media="img_path", color="painter")
seaborn or the plotly library. But in clusterfun, you can:

Clusterfun supports AWS S3 and local data storage and loading. The media column in the dataframe will be used to determine where to load the media from. S3 media should start with s3://.
Make sure to set an AWS_REGION environment variable to the region where your data is stored. Support for Google Cloud Storage is coming soon.
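As a rough illustration of the rule above, the sketch below classifies media paths by their prefix and warns when S3 paths are present but AWS_REGION is unset. The helpers `media_source` and `check_media_paths`, the bucket name, and the region are hypothetical and are not part of clusterfun; they only mirror the behaviour described here.

```python
import os


def media_source(path: str) -> str:
    """Classify a media path as 's3' or 'local' based on its prefix."""
    return "s3" if path.startswith("s3://") else "local"


def check_media_paths(paths, env=None):
    """Return a list of warnings for a collection of media paths.

    `env` defaults to the process environment; passing a dict makes
    the check easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    warnings = []
    has_s3 = any(media_source(p) == "s3" for p in paths)
    if has_s3 and not env.get("AWS_REGION"):
        warnings.append("S3 media found but AWS_REGION is not set")
    return warnings


# Hypothetical media column values: one S3 object and one local file.
paths = ["s3://my-bucket/paintings/001.jpg", "images/002.jpg"]
print([media_source(p) for p in paths])
print(check_media_paths(paths, env={}))
print(check_media_paths(paths, env={"AWS_REGION": "eu-west-1"}))
```

Running a check like this before plotting can catch a missing region setting early, rather than when the media fails to load.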
You can color different categories with the color parameter.
You can visualise bounding boxes on top of your images with the bounding_box parameter. For this to work, the dataframe used to plot the data needs a bounding box column. Each cell in that column needs to contain a dictionary or a list of dictionaries with bounding box values: xmin, ymin, xmax, ymax, label (optional), color (optional).
Example of a bounding box:
single_bounding_box = {
    "xmin": 12,
    "ymin": 10,
    "xmax": 100,
    "ymax": 110,
    "color": "green",
    "label": "ground truth",
}
color can be either a color name or a hex value.
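Since a cell may hold either one dictionary or a list of them, it can help to normalise and check the cells before plotting. The following is a minimal sketch under stated assumptions: `normalize_boxes` is a hypothetical helper, not a clusterfun function, and the check that xmin < xmax and ymin < ymax is a sanity check added here, not a documented requirement.

```python
REQUIRED_KEYS = ("xmin", "ymin", "xmax", "ymax")


def normalize_boxes(cell):
    """Return a cell's content as a list of bounding box dictionaries.

    Accepts a single dict or a list of dicts and raises ValueError
    when a required key is missing or the box has no positive area.
    """
    boxes = [cell] if isinstance(cell, dict) else list(cell)
    for box in boxes:
        missing = [key for key in REQUIRED_KEYS if key not in box]
        if missing:
            raise ValueError(f"bounding box is missing keys: {missing}")
        # Assumed sanity check: the box must have positive width and height.
        if box["xmin"] >= box["xmax"] or box["ymin"] >= box["ymax"]:
            raise ValueError(f"bounding box has no area: {box}")
    return boxes


ground_truth = {"xmin": 12, "ymin": 10, "xmax": 100, "ymax": 110,
                "color": "green", "label": "ground truth"}
# The color can also be given as a hex value; label is left out here.
prediction = {"xmin": 15, "ymin": 14, "xmax": 96, "ymax": 105,
              "color": "#ff0000"}

print(normalize_boxes(ground_truth))                     # one box, wrapped in a list
print(len(normalize_boxes([ground_truth, prediction])))  # two boxes
```

A column of such cells would then presumably be passed by name, for example clt.scatter(df, x="x", y="y", media="img_path", bounding_box="bounding_box"), assuming the bounding_box parameter takes the column name in the same way media and color do.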