Case study · Spatial understanding

Semantic search over gaussian splats

A street block reconstructed from drone imagery and segmented per class. Pick a concept below and the matching splats light up.

01 / Scene

The viewer

The scene is large, so the first load takes a moment. Best viewed on desktop.

Loading scene
Drag to orbit · WASD to move · QE up/down · Scroll to zoom
Concepts
02 / Method

How it was built

Drone frames go through structure-from-motion to solve camera poses, then into a 3D gaussian-splat fit that reconstructs the block as a few hundred thousand oriented, coloured gaussians. A text-promptable segmentation model then masks each captured view for a fixed vocabulary of classes. A multi-view lifting step fuses those 2D masks back onto the underlying gaussians, so each splat ends up tagged with the classes it belongs to.
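The multi-view lifting step can be sketched as a per-splat vote: project each gaussian's center into every segmented view, read the class under that pixel, and take the majority across views. This is a minimal illustration, not the actual pipeline; the function name, camera representation, and voting rule are assumptions.

```python
import numpy as np

def lift_masks_to_splats(centers, cameras, masks, n_classes):
    """Fuse per-view 2D class masks onto 3D gaussian centers by majority vote.

    centers : (N, 3) gaussian centers in world space
    cameras : list of (K, R, t, (H, W)) pinhole cameras
    masks   : list of (H, W) int arrays, -1 = unlabelled, else a class id
    Returns an (N,) array with one class id per splat (-1 if never observed).
    """
    votes = np.zeros((len(centers), n_classes), dtype=np.int32)
    for (K, R, t, (H, W)), mask in zip(cameras, masks):
        cam = centers @ R.T + t                      # world -> camera frame
        in_front = cam[:, 2] > 1e-6                  # drop points behind the camera
        uv = cam @ K.T
        uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-6, None)  # perspective divide
        u = uv[:, 0].round().astype(int)
        v = uv[:, 1].round().astype(int)
        ok = in_front & (u >= 0) & (u < W) & (v >= 0) & (v < H)
        cls = mask[v[ok], u[ok]]                     # class under each projection
        valid = cls >= 0                             # ignore unlabelled pixels
        idx = np.flatnonzero(ok)[valid]
        np.add.at(votes, (idx, cls[valid]), 1)       # accumulate one vote per view
    labels = votes.argmax(axis=1)
    labels[votes.sum(axis=1) == 0] = -1              # splats no mask ever covered
    return labels
```

A real implementation would also weight votes by visibility (a splat can be occluded in a view where its center still projects inside the image), but the majority-vote core is the same idea.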

What you see here is one pre-computed scene, not a live pipeline. The viewer is static: the splat file and the per-class index are shipped to the browser, and the interaction is just a lookup that dims everything outside the selected class.
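The selection itself reduces to an array lookup over the pre-computed index. The viewer runs in the browser, but the logic can be sketched in a few lines of Python; the function name, the dict-of-indices shape of the index, and the dimming factor are assumptions for illustration.

```python
import numpy as np

def apply_concept(opacities, class_index, concept, dim=0.05):
    """Dim every splat outside the selected concept.

    opacities   : (N,) base opacity per splat, as shipped in the splat file
    class_index : dict mapping concept name -> array of splat indices
    concept     : selected concept name, or None to restore the full scene
    Returns a new opacity array; the splat data itself is never modified.
    """
    if concept is None:
        return opacities.copy()
    out = opacities * dim          # everything starts dimmed
    sel = class_index[concept]
    out[sel] = opacities[sel]      # selected class keeps full opacity
    return out
```

Because the index is computed offline, selecting a concept costs one indexed write over the opacity buffer, which is why the interaction stays responsive even with hundreds of thousands of splats.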

The source is not public yet; a longer write-up will follow.