Large Charts: Specific Charts#
UMAP Clustering (with Graphistry Graphs)#
Large amounts should use trained models to map full dataset (“fit + transform”)
After UMAP generated, size considerations follow regular Graphistry ones
Discuss with Graphistry staff:
Train & embed in databricks or graphistry, and just load in louie
GPU (remote) for 100K row training sets: Discuss - adding support for new Graphistry GPU endpoint
GPU (local) for 100K row training sets: Discuss - adding support for local GPU workers
CPU (local) for 10K row training sets: Discuss
X Bar, Y Bar#
Single-dimensional (single column) is fast
Two-dimensional (two columns) often too slow
A groupby aggregate runs per bar
Many bars * many sub-bars => explosion!
Graphistry Graphs#
For strong clients with good networks:
< 2M edges
< 400K nodes
Less for weaker clients
Discuss: VDI options
Limit number of attributes, especially strings: Blows up memory