SAM3

SAM3 Video Tracking · powered by Hugging Face 🤗 Transformers

Segment and track objects across a video with SAM3 (Segment Anything Model 3). This demo runs the official Hugging Face Transformers implementation, which supports interactive, promptable video segmentation with point, box, and text prompts.

Quick start

Load a video: Upload your own or pick an example below.
Add text prompts: Select a frame and enter a text description of the objects to segment (e.g., "red car", "penguin"). To add several prompts at once, separate them with commas (e.g., "person, bed, lamp"); you can also add them one at a time. A text prompt matches every instance of the described object in the frame, so describe the category ("penguin") rather than a specific instance ("penguin on the left").
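
As an illustration of the comma-separated prompt format described above, here is a minimal sketch of how such input could be split into individual prompts. The helper name `parse_text_prompts` is hypothetical and is not taken from the demo's code. Whitespace cleanup and case-insensitive de-duplication are assumptions about sensible behavior, not documented features of the demo.

```python
def parse_text_prompts(raw: str) -> list[str]:
    """Split a comma-separated prompt string into clean, unique prompts.

    Hypothetical helper: the demo's real parsing logic may differ.
    """
    seen = set()
    prompts = []
    for part in raw.split(","):
        # Trim the ends and collapse internal runs of whitespace.
        prompt = " ".join(part.split())
        key = prompt.lower()  # assumption: "Person" and "person" are the same prompt
        if prompt and key not in seen:
            seen.add(key)
            prompts.append(prompt)
    return prompts


print(parse_text_prompts("person, bed,  lamp, Person"))
# → ['person', 'bed', 'lamp']
```

Each resulting string would then be treated as one category-level prompt, so "person" would cover every person in the frame.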

Working with results

Preview: Use the slider to navigate frames and see the current masks.
Propagate: Click "Propagate across video" to track all defined objects through the entire video.
Export: Render an MP4 at the original video's frame rate (FPS) for smooth playback.
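
To see why exporting at the original FPS matters, the sketch below computes per-frame timestamps and clip duration from a frame count and a frame rate. The function names and the example numbers are hypothetical and only illustrate the arithmetic. They are not part of the demo or of the Transformers API.

```python
def frame_timestamps(num_frames: int, fps: float) -> list[float]:
    """Return the presentation time (in seconds) of each frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return [i / fps for i in range(num_frames)]


def clip_duration(num_frames: int, fps: float) -> float:
    """Return the total playback length in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# Assumed example: a 90-frame clip recorded at 30 FPS.
print(clip_duration(90, 30.0))  # → 3.0 (matches the source video)
print(clip_duration(90, 24.0))  # → 3.75 (same frames at 24 FPS play back slower)
```

Writing the rendered frames at the source frame rate keeps the exported clip the same length as the input, so the tracked masks stay in step with the original motion.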
