StableSAM Docs

StableSAM is an API-first service for SAM 3.1 video and image segmentation, plus SAM 3 single-image human body reconstruction. Video and image segmentation use Meta SAM 3.1 with Object Multiplex for faster multi-object tracking; the 3D body route uses SAM 3 because no SAM 3.1 3D-body endpoint exists yet.

Workflow

Upload the source asset to StableUpload.
For video segmentation, inspect the local file and compute its frame count; the request declares it as declaredFrameCount, which also determines the price.
POST the StableUpload URL to the matching StableSAM route.
Poll GET /api/jobs/{jobId} with SIWX until the job completes.
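
Steps 2 and 4 of the workflow can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the service's reference client: it assumes ffprobe is installed locally, and the helper names (ffprobe_frame_count_cmd, count_frames, poll_job) are hypothetical. The docs do not describe the job response schema or the SIWX handshake, so the caller supplies both the fetch function and the completion test.

```python
import subprocess
import time
from typing import Callable


def ffprobe_frame_count_cmd(path: str) -> list[str]:
    """Build an ffprobe command that decodes the first video stream and prints its frame count."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-count_frames",
        "-show_entries", "stream=nb_read_frames",
        "-of", "csv=p=0",
        path,
    ]


def count_frames(path: str) -> int:
    """Run ffprobe (must be on PATH) and parse the decoded frame count."""
    result = subprocess.run(
        ffprobe_frame_count_cmd(path), capture_output=True, text=True, check=True
    )
    return int(result.stdout.split()[0].rstrip(","))


def poll_job(
    fetch_job: Callable[[], dict],
    is_finished: Callable[[dict], bool],
    interval_s: float = 2.0,
    max_attempts: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_job until is_finished(job) is true, waiting interval_s between attempts.

    fetch_job is expected to perform the SIWX-authenticated GET /api/jobs/{jobId};
    how that authentication works is outside this sketch.
    """
    for _ in range(max_attempts):
        job = fetch_job()
        if is_finished(job):
            return job
        sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_attempts} attempts")
```

The value returned by count_frames(path) is what a video request would send as declaredFrameCount. The interval and attempt limit are arbitrary defaults; tune them to your clip lengths.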

Prompting

Use object category labels such as person, car, head, or hair, not natural-language descriptions.
Each segmentation request returns one detection target. Do not use comma-separated prompts for multi-object segmentation.
Text prompts and point prompts are mutually exclusive. If text is present, point prompts are ignored.
Omit empty pointPrompts and boxPrompts arrays from request bodies.
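
The prompting rules above can be enforced client-side before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named build_segment_body; the item shape inside pointPrompts and boxPrompts is not documented here, so the helper treats those items as opaque values. The docs do not say how box prompts interact with text, so this sketch passes non-empty box prompts through unchanged.

```python
from typing import Any, Optional


def build_segment_body(
    base: dict[str, Any],
    prompt: Optional[str] = None,
    point_prompts: Optional[list] = None,
    box_prompts: Optional[list] = None,
) -> dict[str, Any]:
    """Return a request body that follows the documented prompting rules."""
    body = dict(base)  # copy so the caller's dict is not mutated

    if prompt is not None:
        label = prompt.strip()
        if not label:
            raise ValueError("prompt must not be empty")
        if "," in label:
            # One detection target per request: reject comma-separated lists.
            raise ValueError("use a single category label per request")
        body["prompt"] = label

    # Point prompts are ignored when text is present, so do not send them.
    if point_prompts and prompt is None:
        body["pointPrompts"] = list(point_prompts)

    # Empty arrays are omitted rather than sent as [].
    if box_prompts:
        body["boxPrompts"] = list(box_prompts)

    return body
```

For example, calling it with prompt="person" and an empty box_prompts list yields a body with a prompt key and neither array key.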

Limits

Video segmentation max frame count: 960
Video asset max size: 100 MB
Image asset max size: 25 MB
Image routes accept image/jpeg, image/png, and image/webp
Segmented video output is silent and does not preserve the source audio track
applyMask: true returns an RGB video with a black background, not a video with an alpha channel
Image segmentation with applyMask: true returns a PNG mask asset; use result.primaryMask.url from the job result
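
A pre-flight check against these limits can catch oversized assets before upload. This is a minimal sketch with hypothetical function names; it assumes 1 MB means 1,000,000 bytes, which the docs do not specify, so adjust the constant if the service counts in mebibytes.

```python
MAX_VIDEO_FRAMES = 960
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes
MAX_VIDEO_BYTES = 100 * BYTES_PER_MB
MAX_IMAGE_BYTES = 25 * BYTES_PER_MB
IMAGE_MIME_TYPES = frozenset({"image/jpeg", "image/png", "image/webp"})


def check_video(size_bytes: int, frame_count: int) -> list[str]:
    """Return a list of limit violations for a video asset (empty if none)."""
    problems = []
    if frame_count > MAX_VIDEO_FRAMES:
        problems.append(f"frame count {frame_count} exceeds {MAX_VIDEO_FRAMES}")
    if size_bytes > MAX_VIDEO_BYTES:
        problems.append(f"video size {size_bytes} bytes exceeds {MAX_VIDEO_BYTES}")
    return problems


def check_image(size_bytes: int, mime_type: str) -> list[str]:
    """Return a list of limit violations for an image asset (empty if none)."""
    problems = []
    if mime_type.lower() not in IMAGE_MIME_TYPES:
        problems.append(f"unsupported image type {mime_type}")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"image size {size_bytes} bytes exceeds {MAX_IMAGE_BYTES}")
    return problems
```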

Video compositing guidance

Use applyMask: true for preview or inspection when you just want the isolated subject on black.
Do not chroma-key the black background from applyMask: true for production compositing. Dark subject regions will be cut out too.
If you need a compositing-ready result, use applyMask: false to get the binary mask video, then combine that mask with the original video using a workflow such as ffmpeg alphamerge before overlaying.
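
To see why keying on black fails, consider a toy example. The sketch below uses a made-up four-pixel row and an arbitrary key threshold of 16; it only illustrates the reasoning above and is not how ffmpeg or StableSAM processes frames.

```python
Pixel = tuple[int, int, int]


def alpha_from_black_key(frame: list[Pixel], threshold: int = 16) -> list[int]:
    """Chroma-key style: any pixel at or below the threshold becomes transparent."""
    return [0 if max(p) <= threshold else 255 for p in frame]


def alpha_from_mask(mask: list[int]) -> list[int]:
    """Mask style: alpha comes from the binary mask, not from pixel color."""
    return [255 if m >= 128 else 0 for m in mask]


# One row: background, dark hair (subject), skin (subject), background.
original: list[Pixel] = [(90, 120, 200), (5, 5, 8), (210, 170, 150), (80, 110, 190)]
mask = [0, 255, 255, 0]

# What applyMask: true conceptually produces: subject pixels on black.
masked = [p if m else (0, 0, 0) for p, m in zip(original, mask)]

print(alpha_from_black_key(masked))  # [0, 0, 255, 0]   dark hair is lost
print(alpha_from_mask(mask))         # [0, 255, 255, 0] dark hair is kept
```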

Create a transparent subject video

ffmpeg -i input.mp4 -i mask.mp4 \
  -filter_complex "[1:v]format=gray[mask];[0:v][mask]alphamerge,format=yuva420p" \
  -an -c:v libvpx-vp9 -pix_fmt yuva420p subject-alpha.webm

Composite onto a background video

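# Force the libvpx decoder for the WebM input; FFmpeg's native VP9 decoder may drop the alpha plane.
ffmpeg -i background.mp4 -c:v libvpx-vp9 -i subject-alpha.webm \
  -filter_complex "[0:v][1:v]overlay=shortest=1:format=auto" \
  -c:v libx264 -pix_fmt yuv420p composited.mp4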

Pricing

POST /api/segment is priced dynamically from the declared frame count (declaredFrameCount). Fal's posted price is $0.005 per 16 frames; StableSAM applies a 2x markup, subject to a minimum charge.
POST /api/segment-image is fixed at $0.03
POST /api/reconstruct-body-3d is fixed at $0.06
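
The video price can be estimated from the figures above. This is a rough sketch, not the billing implementation: it assumes partial 16-frame blocks are rounded up, which the docs do not state, and it takes the minimum charge as a required argument because its value is not given here.

```python
from decimal import Decimal

FAL_PRICE_PER_BLOCK = Decimal("0.005")  # USD per 16 frames
FRAMES_PER_BLOCK = 16
MARKUP = Decimal("2")


def estimate_video_price(declared_frame_count: int, minimum_charge: Decimal) -> Decimal:
    """Estimate the POST /api/segment price in USD for a declared frame count."""
    if declared_frame_count <= 0:
        raise ValueError("declared_frame_count must be positive")
    # Assumption: a partial block is billed as a full block (ceiling division).
    blocks = -(-declared_frame_count // FRAMES_PER_BLOCK)
    marked_up = blocks * FAL_PRICE_PER_BLOCK * MARKUP
    return max(marked_up, minimum_charge)
```

Under these assumptions, the 320-frame example request works out to 20 blocks, $0.10 upstream, and $0.20 after markup, provided the minimum charge is lower than that.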

Endpoints

POST /api/segment — paid video segmentation route
POST /api/segment-image — paid image segmentation route
POST /api/reconstruct-body-3d — paid 3D body reconstruction route
GET /api/jobs — SIWX list route
GET /api/jobs/{jobId} — SIWX status route
DELETE /api/jobs/{jobId} — SIWX soft-delete route

Example requests

Video segmentation

{
  "type": "sam-3-video-segment",
  "videoUrl": "https://f.stableupload.dev/abc123/clip.mp4",
  "declaredFrameCount": 320,
  "prompt": "person",
  "applyMask": true,
  "videoOutputType": "mp4",
  "detectionThreshold": 0.5
}

Image segmentation

{
  "type": "sam-3-image-segment",
  "imageUrl": "https://f.stableupload.dev/abc123/photo.jpg",
  "prompt": "person",
  "applyMask": true
}

3D body reconstruction

{
  "type": "sam-3-body-3d",
  "imageUrl": "https://f.stableupload.dev/abc123/person.jpg"
}
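
The example bodies above can be paired with their routes programmatically. The following is a minimal sketch using only the Python standard library. The type-to-route mapping is inferred from the examples and the endpoint list, the base URL is a placeholder you must supply, and the payment and SIWX headers that the paid routes require are not covered in this section, so they are left out.

```python
import json
import urllib.request
from urllib.parse import quote

# Inferred from the example requests and the endpoint list.
ROUTES = {
    "sam-3-video-segment": "/api/segment",
    "sam-3-image-segment": "/api/segment-image",
    "sam-3-body-3d": "/api/reconstruct-body-3d",
}


def route_for(body: dict) -> str:
    """Pick the POST route that matches the body's "type" field."""
    try:
        return ROUTES[body["type"]]
    except KeyError:
        raise ValueError(f"unknown or missing request type: {body.get('type')!r}")


def job_status_path(job_id: str) -> str:
    """Path of the status route for a job."""
    return f"/api/jobs/{quote(job_id, safe='')}"


def build_post(base_url: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a job body.

    base_url is a placeholder for the service origin; authentication and
    payment headers are intentionally omitted from this sketch.
    """
    return urllib.request.Request(
        base_url.rstrip("/") + route_for(body),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```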