Neural Analog API endpoint

Stem Splitting API

Split an original, restored, mastered, or stem source into vocals, drums, bass, and other selected stem layouts.

POST /create-stems

Queue stem splitting for an audio asset or an existing stem.

Use this after the source has finished importing or processing. For a full track, pass an audio_id returned by /from-link or the upload flow, once GET /status/audio/{audio_id} reports completion. To split an existing stem further, pass stem_id instead.

The Stem Splitting API separates original, upscaled, mastered, or stem sources into vocals, drums, bass, instrumental, and other configured stem layouts for downstream download or mastering.

Source selection rules:

- `audio_id`: split the original imported/uploaded track.
- `upscaled_id`: split a restored version of that audio.
- `mastered_id`: split a mastered version of that audio.
- `stem_id`: split an existing stem; this takes priority over `audio_id`.
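The selection rules above can be mirrored on the client so that it is clear which source a request is expected to use. The helper below is a minimal sketch, not part of the API: pick_source is a hypothetical name, checking stem_id before upscaled_id and mastered_id is an assumption (the rules only state its priority over audio_id), and so is rejecting a request that names both a restored and a mastered version.

```python
from typing import Optional, Tuple


def pick_source(
    audio_id: Optional[str] = None,
    upscaled_id: Optional[str] = None,
    mastered_id: Optional[str] = None,
    stem_id: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (field_name, value) for the source the rules point to."""
    # Documented rule: stem_id takes priority over audio_id.
    # Assumption: it is also checked before upscaled_id and mastered_id.
    if stem_id:
        return ("stem_id", stem_id)
    # Assumption: naming both a restored and a mastered version is ambiguous.
    if upscaled_id and mastered_id:
        raise ValueError("pass upscaled_id or mastered_id, not both")
    if upscaled_id:
        return ("upscaled_id", upscaled_id)
    if mastered_id:
        return ("mastered_id", mastered_id)
    if audio_id:
        return ("audio_id", audio_id)
    raise ValueError("no source ID provided")
```

Resolving the source locally makes an ambiguous combination fail before a job is queued, rather than relying on backend precedence that this page does not fully describe.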

The response returns creation_id, which identifies the stem split job. Poll GET /status/stem_split/{creation_id} until it is complete. Completed stems can then be downloaded with /download/stem/{stem_id} or packaged with /download-archive.
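The polling step can be sketched as a small loop. This is an illustration under stated assumptions: wait_for_stem_split is a hypothetical helper, the status body is assumed to carry a status field, and "processing" is assumed to be the only in-progress value, since that is the only status shown on this page. The HTTP call is injected as fetch_status so the loop stays independent of the exact response format.

```python
import time
from typing import Any, Callable, Dict


def wait_for_stem_split(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_seconds: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until the job stops reporting "processing"."""
    for _ in range(max_attempts):
        payload = fetch_status()
        # Assumption: the body has a "status" field and "processing" means
        # the job is still running; any other value ends the loop.
        if payload.get("status") != "processing":
            return payload
        sleep(interval_seconds)
    raise TimeoutError("stem split did not finish within the polling budget")
```

In practice fetch_status would wrap a GET request to https://api.neuralanalog.com/status/stem_split/{creation_id}, presumably with the same X-API-Key header used for POST /create-stems. The returned payload should still be checked for a failure status before attempting any download.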

Split songs into vocals, drums, bass, instrumental, and other stem layouts from original, upscaled, or mastered audio.

Use POST /create-stems to queue stem splitting, then download individual stems or archive them together.

Parameters

x-api-key
optional header string | null
No description provided.

Request Body

preset
required string

Stem separation layout. Common values are 2_tracks, 4_tracks, 6_tracks, acapella, instrumental, duet, modern_bowed_strings, denoise, denoise_debleed, dereverb, decrowd, phantom_center, and custom for prompt-guided SAM separation.

"acapella", "instrumental", "2_tracks", "4_tracks", "6_tracks", "5_drums_tracks", "2_tracks_drums", "custom", "lead_back", "duet", "modern_bowed_strings", "mvsep_mega_53", "denoise", "denoise_debleed", "dereverb", "decrowd", "phantom_center"

Example: "4_tracks"

audio_id
optional string | null

Source audio asset to split. Provide either audio_id or stem_id.

Example: "6c62f8e7-02a3-48c0-a5b5-5de87ed9c31a"

stem_id
optional string | null

Existing stem to split further. When provided with audio_id, stem_id takes priority.

Example: "abf8a992-1c4e-4935-93f0-197116e77e49"

bit_depth
optional integer

Output WAV bit depth for generated stems.

16, 24

Default: 24

hq_streaming_format
optional string

No description provided.

"aac", "mp3", "flac"

Default: "aac"
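The two output fields above accept small fixed sets of values, so a client can check them before sending a request. The helper below is a minimal sketch: output_options is a hypothetical name, and the local validation is a convenience rather than documented API behavior.

```python
from typing import Any, Dict

# Allowed values and defaults copied from the bit_depth and
# hq_streaming_format fields above.
BIT_DEPTHS = (16, 24)
STREAMING_FORMATS = ("aac", "mp3", "flac")


def output_options(bit_depth: int = 24, hq_streaming_format: str = "aac") -> Dict[str, Any]:
    """Validate the output fields locally and return them as body entries."""
    if bit_depth not in BIT_DEPTHS:
        raise ValueError(f"bit_depth must be one of {BIT_DEPTHS}")
    if hq_streaming_format not in STREAMING_FORMATS:
        raise ValueError(f"hq_streaming_format must be one of {STREAMING_FORMATS}")
    return {"bit_depth": bit_depth, "hq_streaming_format": hq_streaming_format}
```

The returned dictionary can be merged into the request body, for example {"audio_id": "...", "preset": "4_tracks", **output_options(bit_depth=16)}.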

prompt
optional string | null

Only used when preset is custom. Other stem presets ignore this field. Prompt for SAM separation, for example 'lead vocal' or 'snare and kick'. Long prompts are trimmed before validation.

Example: "lead vocal"

sam_model
optional string

Only used when preset is custom. Other stem presets ignore this field. SAM Audio model size used for prompt-guided stem extraction.

"sam-audio-small", "sam-audio-base", "sam-audio-large"

Default: "sam-audio-base"

Example: "sam-audio-large"

stereo_mode
optional string

Only used when preset is custom. Other stem presets ignore this field. Stereo handling mode for separation. mono is fastest, mid_sides preserves center/side detail, and left_right processes channels separately.

"mono", "mid_sides", "left_right"

Default: "mono"

spans
optional array<object> | null

Only used when preset is custom. Other stem presets ignore this field. Optional prompt hint ranges for SAM Audio. Omit to split without span hints.

Example: [{"end":42,"start":12.5}]
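The custom-only fields (prompt, sam_model, stereo_mode, spans) fit together in a single request body. The sketch below assembles one with the documented defaults. custom_split_body is a hypothetical helper, and the non-empty prompt check is an assumption: the reference marks prompt as optional and does not say how a custom split behaves without one.

```python
from typing import Any, Dict, List, Optional

# Enum values copied from the sam_model and stereo_mode fields above.
SAM_MODELS = ("sam-audio-small", "sam-audio-base", "sam-audio-large")
STEREO_MODES = ("mono", "mid_sides", "left_right")


def custom_split_body(
    audio_id: str,
    prompt: str,
    sam_model: str = "sam-audio-base",
    stereo_mode: str = "mono",
    spans: Optional[List[Dict[str, float]]] = None,
) -> Dict[str, Any]:
    """Build a /create-stems body for preset "custom" (hypothetical helper)."""
    # Assumption: a custom split needs a non-empty prompt.
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if sam_model not in SAM_MODELS:
        raise ValueError(f"sam_model must be one of {SAM_MODELS}")
    if stereo_mode not in STEREO_MODES:
        raise ValueError(f"stereo_mode must be one of {STEREO_MODES}")
    body: Dict[str, Any] = {
        "preset": "custom",
        "audio_id": audio_id,
        "prompt": prompt,
        "sam_model": sam_model,
        "stereo_mode": stereo_mode,
    }
    # Leave spans out entirely to split without span hints.
    if spans is not None:
        body["spans"] = spans
    return body
```

The result can be passed as the json argument of a POST /create-stems request. Because other presets ignore these fields, keeping them in a dedicated builder avoids sending values that have no effect.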

selection
optional object | null

Optional source region to process. When provided, the backend trims the source to this region before running separation.

Example: {"end":42,"start":12.5}
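A selection object has the same start/end shape as the example above. The sketch below builds one and attaches it to a request body. region is a hypothetical helper; the unit of start and end is not stated on this page, and the ordering check (0 <= start < end) is an assumption about what the backend expects.

```python
from typing import Any, Dict


def region(start: float, end: float) -> Dict[str, float]:
    """Build a {"start", "end"} object shaped like the documented example."""
    # Assumption: both values are offsets from the beginning of the source,
    # start is non-negative, and end comes after start.
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return {"start": start, "end": end}


# Request body that limits a 4-stem split to one region of the source.
body: Dict[str, Any] = {
    "audio_id": "00000000-0000-0000-0000-000000000000",
    "preset": "4_tracks",
    "selection": region(12.5, 42),
}
```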

upscaled_id
optional string | null

Restored version to use as the stem separation source.

Example: "d66cf940-bf26-45bb-80f7-332f26b6859a"

mastered_id
optional string | null

Mastered version to use as the stem separation source.

Example: "f5db8e4b-2e74-4198-a8de-0c3a398620e9"

restoration_params
optional object | null

Only used when you want the stem workflow to queue restoration for generated stems. Ordinary stem splitting ignores this field. The nested restoration preset determines which nested restoration parameters are honored.

Example: {"preset":"universal_enhancer","strength":0.8}
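As an illustration of the nesting, the sketch below attaches restoration_params to an otherwise ordinary stem request. with_restoration is a hypothetical helper. The nested values are the ones from the example above, and which nested parameters a given restoration preset honors is not covered here.

```python
import copy
from typing import Any, Dict


def with_restoration(body: Dict[str, Any], preset: str, **params: Any) -> Dict[str, Any]:
    """Return a copy of body with a nested restoration_params object."""
    result = copy.deepcopy(body)  # Leave the caller's dictionary untouched.
    result["restoration_params"] = {"preset": preset, **params}
    return result


# Stem request that also asks for restoration of the generated stems.
body = with_restoration(
    {"audio_id": "00000000-0000-0000-0000-000000000000", "preset": "4_tracks"},
    "universal_enhancer",
    strength=0.8,
)
```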

Stem Splitting Models

Use these values in the preset field for POST /create-stems.

4_tracks: 4 stems

Bass, Drums, Vocals, Others

6_tracks: 6 stems

Bass, Drums, Vocals, Guitar, Piano, Others

mvsep_mega_53: 53 stems (MVSEP Mega)

Ultra-granular 53-stem model. Expect substantial bleed between stems and some duplicates. Silent stems are not returned. For higher quality, use the specialized models. Consumes 3x the processed audio minutes.

custom_acoustic_guitar: Acoustic Guitar

Keep only the acoustic guitar

modern_bowed_strings: Modern Bowed Strings

Strings, Other

2_tracks: 2 stems

Vocals (acapella), Instrumental (karaoke version)

lead_back: Lead / Backing Vocals

Lead vocals, backing vocals

duet: Duet Singers

Singer 1, Singer 2

custom_main_solo_singer: Main Singer

Keep the main vocals and isolate backing vocals and others

custom_music: Music Only (No Vocals)

Keep the instruments and remove singing, speech, and background noise

custom_singing: Singing Only

Keep the singing voice and remove instruments and ambience

custom_speech: Speech Only

Keep only speech and remove the background

2_tracks_drums: Isolate Drums

No drums, Drums. Specialized model that separates drums from the instrumental while preserving fullness in both.

custom_remove_drums: Isolate Any Percussion

Remove cinematic percussion, shakers, cymbals...

5_drums_tracks: 5 drum tracks

Split drums into Kick, Snare, Toms, Hi-hats, Cymbals

dereverb: Dereverb

Remove reverberation, delay, and echo

denoise: Denoise

Remove background noise from music or vocal tracks

denoise_debleed: Denoise and debleed

Remove background noise and source bleed

decrowd: Fast Decrowd

Quickly remove crowd noise from live recordings

custom_music_singing: Heavy Decrowd

Use SAM Audio Large to keep music and vocals while removing crowd, noise, talking, and audience sounds

phantom_center: Keep Only Center Mono

Extracts the "phantom center", the content that should be mono in a track. Use this for bass, kick drums, and podcast voice. Also removes phaser, chorus, or flanger effects from instrument stems. Useful for mixing.

custom_knocking: Knocking

Isolate knocking sounds (sound effect extraction)

Example

Python
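import os

import requests

# Queue a 4-stem split for an imported track (placeholder audio_id).
response = requests.post(
    "https://api.neuralanalog.com/create-stems",
    headers={"X-API-Key": os.environ["NEURALANALOG_API_KEY"]},
    json={
        "audio_id": "00000000-0000-0000-0000-000000000000",
        "preset": "4_tracks",
    },
    timeout=30,
)
response.raise_for_status()  # Raise on HTTP errors instead of printing an error body.
print(response.json())  # The body includes creation_id for polling the job status.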

Success Response

200 Successful Response
status
required string

Status after the stem job is queued.

Example: "processing"

message
required string

Human-readable queueing result.

Example: "Stem separation queued"

audio_id
required string

Source audio asset ID.

Example: "6c62f8e7-02a3-48c0-a5b5-5de87ed9c31a"

user_id
required string

User that owns the job.

Example: "2fe9c052-e34f-43b7-9ad5-5186d31cb7ec"

creation_id
required string

Stem split job ID.

Example: "18d67d83-c9f6-4d7f-bb72-09e40c62671e"

audio_stems
optional array<object>

No description provided.

upscaled_audios
optional array<object>

No description provided.
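The success response can be read into a typed structure so that a missing required field fails early. The dataclass below is a minimal sketch built only from the fields listed above. StemJob is a hypothetical name, and the two array fields are kept as untyped dictionaries because their item shape is not described here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

REQUIRED_FIELDS = ("status", "message", "audio_id", "user_id", "creation_id")


@dataclass
class StemJob:
    status: str
    message: str
    audio_id: str
    user_id: str
    creation_id: str
    # Optional arrays; item structure is undocumented, so keep raw dicts.
    audio_stems: List[Dict[str, Any]] = field(default_factory=list)
    upscaled_audios: List[Dict[str, Any]] = field(default_factory=list)

    @classmethod
    def from_json(cls, data: Dict[str, Any]) -> "StemJob":
        """Build a StemJob from a decoded 200 response body."""
        missing = [name for name in REQUIRED_FIELDS if name not in data]
        if missing:
            raise KeyError(f"missing required fields: {missing}")
        return cls(
            **{name: data[name] for name in REQUIRED_FIELDS},
            audio_stems=data.get("audio_stems") or [],
            upscaled_audios=data.get("upscaled_audios") or [],
        )
```

With the Python example above, StemJob.from_json(response.json()).creation_id yields the job ID to pass to GET /status/stem_split/{creation_id}.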