
Video-to-3D Reconstruction

Upload a video and reconstruct a 3D scene from it using a multi-agent ML pipeline.

Status: Scaffolded. The Cloud Function startVideoReconstruction creates job documents, but the ML orchestrator that would run the pipeline stages is not yet built.

Pipeline Stages

Stage     ML Model                     Purpose
segment   SAM2                         Segment video frames into objects
depth     Monocular depth estimation   Estimate per-pixel depth
classify  Object classification        Identify what each segment is
rig       Auto-rigging                 Add armatures to characters/vehicles
compose   Scene composer               Place objects in 3D space

Job Lifecycle

uploaded → segmenting → depth → classifying → rigging → composing → done
└→ failed
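
The lifecycle above can be read as a small state machine. The following is a minimal sketch of that reading, not the project's implementation. The helper names nextStatus and canTransition are hypothetical, and the sketch assumes that any non-terminal status may move to failed, which the diagram does not state explicitly.

```typescript
// Lifecycle statuses in pipeline order, as listed in the diagram.
const PIPELINE = [
  "uploaded",
  "segmenting",
  "depth",
  "classifying",
  "rigging",
  "composing",
  "done",
] as const;

type JobStatus = (typeof PIPELINE)[number] | "failed";

// Hypothetical helper: the status that follows `current` on success.
// Terminal statuses ("done", "failed") have no successor.
function nextStatus(current: JobStatus): JobStatus | null {
  if (current === "done" || current === "failed") return null;
  const i = PIPELINE.indexOf(current);
  return PIPELINE[i + 1] ?? null;
}

// Hypothetical helper: whether a status change is allowed.
// Assumption: every non-terminal status may fail.
function canTransition(from: JobStatus, to: JobStatus): boolean {
  if (from === "done" || from === "failed") return false;
  return to === "failed" || to === nextStatus(from);
}
```

A guard like canTransition could be applied before each status write so that a job never skips a stage or leaves a terminal status.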

Each stage updates the videoJobs/{jobId} Firestore document. The VideoPanel subscribes via onSnapshot and shows real-time progress bars.
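
As an illustration of how a subscriber might turn a job document into per-stage progress, here is a minimal sketch. It assumes the document carries a status field holding one of the lifecycle values. That field name, the stageProgress helper, and the render call in the comment are hypothetical and not taken from the codebase.

```typescript
type StageState = "pending" | "active" | "complete";

// Stage names from the Pipeline Stages table, paired with the lifecycle
// status during which each stage is running.
const STAGES = [
  { stage: "segment", status: "segmenting" },
  { stage: "depth", status: "depth" },
  { stage: "classify", status: "classifying" },
  { stage: "rig", status: "rigging" },
  { stage: "compose", status: "composing" },
] as const;

// Hypothetical helper: derive one progress state per stage from a job status.
function stageProgress(status: string): Record<string, StageState> {
  const active = STAGES.findIndex((s) => s.status === status);
  const result: Record<string, StageState> = {};
  STAGES.forEach((s, i) => {
    if (status === "done") {
      result[s.stage] = "complete";
    } else if (active === -1) {
      // "uploaded", "failed", or an unknown value: the status alone does not
      // say how far the job got, so every stage is reported as pending.
      result[s.stage] = "pending";
    } else {
      result[s.stage] = i < active ? "complete" : i === active ? "active" : "pending";
    }
  });
  return result;
}

// Wiring sketch with the modular firebase/firestore SDK, kept as a comment so
// the block runs without Firebase installed:
//
//   onSnapshot(doc(db, "videoJobs", jobId), (snap) => {
//     render(stageProgress(snap.data()?.status ?? "uploaded"));
//   });
```

Because a bare failed status does not record which stage failed, a real job document would likely need an extra field for that. The sketch leaves this open.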

UI

The VideoPanel (accessible via the bottom-left tabbed drawer) provides:

  • File input for video upload
  • Job list with per-stage progress indicators
  • "Reconstruct" button to kick off the pipeline
  • Result GLB download when complete