Case study · Spatial understanding
A street block reconstructed from drone imagery and segmented per class. Pick a concept below and the matching splats light up.
The scene is large, so the first load takes a moment. Best viewed on desktop.
Drone frames go through structure-from-motion to solve for camera poses, then into a 3D Gaussian-splat fit that reconstructs the block as a few hundred thousand oriented, coloured Gaussians. A text-promptable segmentation model then masks each captured view for a fixed vocabulary of classes. A multi-view lifting step fuses those 2D masks back onto the underlying Gaussians, so each splat ends up tagged with the classes it belongs to.
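The lifting step can be sketched as a per-splat vote: project each splat into every view, and keep a class label when enough of its projections land inside that class's 2D mask. This is a minimal sketch under assumed data shapes (`projections`, `masks`, `min_votes` are hypothetical names), not the project's actual code.

```python
# Hedged sketch of multi-view mask lifting; all names and shapes are
# illustrative assumptions, not the real pipeline's API.
from collections import Counter

def lift_masks(projections, masks, num_splats, min_votes=2):
    """projections: {view_id: {splat_id: (u, v)}}, one pixel hit per view.
    masks: {view_id: {class_name: set of (u, v) pixels}} from the 2D segmenter.
    Returns {splat_id: set of class names} for splats seen in >= min_votes views."""
    votes = [Counter() for _ in range(num_splats)]
    for view_id, hits in projections.items():
        class_masks = masks.get(view_id, {})
        for splat_id, pixel in hits.items():
            for cls, pixels in class_masks.items():
                if pixel in pixels:
                    votes[splat_id][cls] += 1  # this view agrees on cls
    return {
        sid: {cls for cls, n in c.items() if n >= min_votes}
        for sid, c in enumerate(votes) if c
    }
```

A vote threshold like this is one simple way to suppress single-view segmentation errors; a real pipeline would weight views by visibility and splat coverage rather than a single pixel hit.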
What you see here is one pre-computed scene, not a live pipeline. The viewer is static: the splat file and the per-class index are shipped to the browser, and the interaction is just a lookup that dims everything outside the selected class.
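That per-class lookup is small enough to sketch: assuming the index maps class name to splat ids (the data layout here is an assumption, not the shipped viewer code), selecting a class just produces a per-splat opacity multiplier.

```python
# Illustrative sketch of the viewer's selection lookup; data layout and
# names are assumptions, not the actual shipped code.
def opacity_for_selection(class_index, num_splats, selected=None, dim=0.05):
    """class_index: {class_name: set of splat ids}.
    Returns one opacity multiplier per splat: 1.0 for splats in the
    selected class, `dim` for everything else."""
    if selected is None:
        return [1.0] * num_splats  # no selection: show the whole scene
    members = class_index.get(selected, set())
    return [1.0 if i in members else dim for i in range(num_splats)]
```

Because the index is precomputed, selection never touches the splat geometry; the browser only rescales per-splat opacity, which is why the interaction stays cheap even for a few hundred thousand splats.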
The source is not public yet; a longer write-up will follow.