Bygheart Music — Songwriting Guide

Models & credits

Three DiT models live on the GPU at all times. Composer Lab piggybacks on Studio Pro until the 4th handler ships.

Label	Backend ID (alias)	Best for	Steps	Credits
Bygheart Melody Fast	`bygheart-melody-fast`	Quick iteration / drafting	8–16	1
Bygheart Studio Pro	`bygheart-studio-pro`	Recommended daily driver	8	2
Bygheart Studio Max	`bygheart-studio-max`	Highest detail, 4B LM thinking	50	4
Bygheart Composer Lab	`bygheart-composer-lab`	Edit / repaint workflows	50	4

Pro plan: 200 credits/month with a cap of 30 credits/day; up to 60 unused credits roll over. Ultimate: unlimited. Failed generations are refunded automatically on either plan.
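
As a rough budgeting aid, the credit costs in the model table and the Pro limits can be turned into simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes rolled-over credits are simply added on top of the next month's 200, which this guide does not spell out.

```python
# Credit cost per generation, copied from the model table.
CREDITS = {
    "bygheart-melody-fast": 1,
    "bygheart-studio-pro": 2,
    "bygheart-studio-max": 4,
    "bygheart-composer-lab": 4,
}

PRO_MONTHLY = 200      # credits per month on Pro
PRO_DAILY_CAP = 30     # credits per day on Pro
PRO_ROLLOVER_MAX = 60  # maximum unused credits carried over


def max_songs_per_day(model: str) -> int:
    """Generations of one model that fit under the Pro daily cap."""
    return PRO_DAILY_CAP // CREDITS[model]


def next_month_budget(unused: int) -> int:
    """Assumption: rollover is added on top of the monthly 200."""
    return PRO_MONTHLY + min(max(unused, 0), PRO_ROLLOVER_MAX)
```

Under these assumptions, Studio Pro allows 15 generations per day, and ending a month with 85 unused credits gives a budget of 260 for the next one.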

Modes

Simple

Short prompt only. Great for instrumentals (toggle Instrumental) and quick vibes. The LM writes lyrics for you if you leave them off.

Custom

Full prompt + your exact lyrics. The system preserves your lyrics verbatim unless you turn on Lyric-aware planning or Enhance prompt/lyrics.

Remix (cover)

Upload reference audio plus your prompt. The model re-imagines the song in the new style. Tune Cover Strength (how closely to follow the source) and Remix Strength (how much creative noise to add).

Edit (repaint)

Upload audio + new lyrics. Repaints sections of the source. Use Repaint mode (strict / balanced / loose) and Crossfade for clean section transitions.

Song Description (prompt)

The single most important control. Pack it with what you actually want to hear: genre, mood, vocal style, instruments, mix quality, hook size, energy. Avoid generic words.

Strong example

epic modern pop, dark dance pop, cinematic trap-pop, stadium anthem, huge hook, distorted bass, choir layers, glossy radio production, powerful female lead vocal, stacked harmonies, explosive chorus, dramatic bridge, punchy drums, wide synths, polished high-energy mix

Weak example (too generic)

pop song, female vocals

Genre stack — list 2-3 genres separated by commas. The model averages them.

Mood / energy — words like cinematic, melancholic, euphoric, gritty, intimate, anthemic.

Production texture — glossy, lo-fi, analog, grainy, overdriven, polished, sparse, dense.

Mix descriptors — punchy snare, sub-bass, wide stereo, sidechain pump, crunchy guitar, shimmer reverb.

Hook hints — huge hook, drop into half-time, vocal chop loop, stadium "whoa-oh" gang vocals.

Pro tip: write the description like you’re briefing a producer. Specifics beat adjectives.
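
If you assemble descriptions programmatically, the descriptor groups above (genre stack, mood, texture, mix, hooks) can be joined into one comma-separated string. This is a minimal sketch; `build_description` and its argument names are hypothetical helpers, not part of the Studio or the API.

```python
def build_description(genres, mood=(), vocal=(), texture=(), mix=(), hooks=()):
    """Join descriptor groups into one comma-separated Song Description."""
    parts = [*genres, *mood, *vocal, *texture, *mix, *hooks]
    seen, out = set(), []
    for part in parts:
        cleaned = part.strip()
        key = cleaned.lower()
        # Skip empty entries and case-insensitive duplicates.
        if key and key not in seen:
            seen.add(key)
            out.append(cleaned)
    return ", ".join(out)
```

For example, `build_description(["dark dance pop", "cinematic trap-pop"], mood=["anthemic"], mix=["punchy snare", "sub-bass"], hooks=["huge hook"])` returns `dark dance pop, cinematic trap-pop, anthemic, punchy snare, sub-bass, huge hook`.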

Negative styles

Comma-separated list of what to avoid. Set it in Advanced Options → Negative styles.

country, acoustic folk, screamo, metalcore, lo-fi bedroom pop, muddy mix, weak vocals, thin drums, pitch wobble, wrong language

A good default if you’re going for radio-quality pop: muddy mix, weak vocals, thin drums, gibberish.

Lyrics structure

Lyrics are the song’s timeline. Section tags drive arrangement. Keep production/mood mostly in the prompt; reserve in-line vibe hints for transition moments.

Section tags the model understands

[Intro] — instrumental opening

[Verse] / [Verse 1]

[Pre-Chorus]

[Chorus]

[Post-Chorus]

[Bridge]

[Final Chorus]

[Outro]

[Drop]

[Build] / [Breakdown]

[Hook]

[Refrain]

Add a short hint in the tag for a vibe shift: [Bridge - stripped, emotional] or [Final Chorus - stadium ending].

Recommended template

[Intro]

[Verse 1]

[Pre-Chorus]

[Chorus]

[Verse 2]

[Pre-Chorus]

[Chorus]

[Bridge]

[Final Chorus]

[Outro]
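
The template can also be generated from data when lyrics are built in code. The sketch below is an assumption about how you might do that: `build_lyrics` is a hypothetical helper that writes each section tag (with an optional hint) followed by its lines and separates sections with a blank line.

```python
def build_lyrics(sections):
    """sections: list of (tag, hint, lines) tuples. Returns tagged lyrics text."""
    blocks = []
    for tag, hint, lines in sections:
        # A hint turns [Bridge] into [Bridge - stripped, emotional].
        header = f"[{tag} - {hint}]" if hint else f"[{tag}]"
        blocks.append("\n".join([header, *lines]))
    # Blank line between sections.
    return "\n\n".join(blocks)
```

Calling it with `[("Verse 1", "", ["line a", "line b"]), ("Bridge", "stripped, emotional", ["line c"])]` produces a `[Verse 1]` block, a blank line, then a `[Bridge - stripped, emotional]` block.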

Vocal style

ACE-Step has no hard gender or style switch. The Studio’s Voice + Delivery dropdowns inject phrases into your prompt and lyric tags so the model is biased toward that style. They’re strong suggestions, not guarantees — back them up in the Song Description.

Voice

Setting	Caption phrase added to prompt
Auto	(none — model decides)
Female	female lead vocal
Male	male lead vocal
Female + Male duet	female and male duet vocals
Choir	full choir vocals
Instrumental	instrumental, no vocals (also clears lyrics)

Delivery

Setting	Caption phrase	Lyric tag injected
Sung pop	melodic sung pop delivery	—
Rap	rap hip-hop flow, spoken-word delivery	`spoken word`
R&B	soulful R&B delivery with vocal runs	—
Country twang	country vocal twang	—
Rock belt	rock belt vocal	—
Folk acoustic	folk acoustic vocal	—
Whispered	intimate whispered vocal	`whispered`
Spoken word	spoken word vocal	`spoken word`
Falsetto	falsetto vocal	`falsetto`
Raspy	raspy gritty vocal	`raspy vocal`
Powerful belting	powerful belting vocal	`powerful belting`
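
To make the injection concrete, here is a sketch of what the Voice and Delivery dropdowns might do to a prompt and lyrics. The phrases and tags are copied from the two tables (only a subset of Delivery rows is included); the function name, the comma-joining, and the placement of the lyric tag on its own first line are assumptions, since the guide does not describe the exact mechanism.

```python
VOICE_PHRASES = {
    "Auto": "",
    "Female": "female lead vocal",
    "Male": "male lead vocal",
    "Female + Male duet": "female and male duet vocals",
    "Choir": "full choir vocals",
    "Instrumental": "instrumental, no vocals",
}

# Delivery -> (caption phrase, lyric tag or None); subset of the Delivery table.
DELIVERY = {
    "Sung pop": ("melodic sung pop delivery", None),
    "Rap": ("rap hip-hop flow, spoken-word delivery", "spoken word"),
    "Whispered": ("intimate whispered vocal", "whispered"),
    "Falsetto": ("falsetto vocal", "falsetto"),
    "Powerful belting": ("powerful belting vocal", "powerful belting"),
}


def apply_vocal_style(prompt, lyrics, voice="Auto", delivery=None):
    """Return (prompt, lyrics) with the selected phrases and tag applied."""
    phrases = [prompt]
    if VOICE_PHRASES[voice]:
        phrases.append(VOICE_PHRASES[voice])
    tag = None
    if delivery is not None:
        caption, tag = DELIVERY[delivery]
        phrases.append(caption)
    if voice == "Instrumental":
        lyrics = ""  # the Voice table notes that this setting clears lyrics
    elif tag:
        lyrics = f"[{tag}]\n{lyrics}"  # assumption: tag goes on its own first line
    return ", ".join(phrases), lyrics
```

Because the result is still only prompt text, repeating the key phrase in your own Song Description remains worthwhile.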

In-lyric performance tags

Drop these inline in your lyrics or tag a section with them. They tell the model how to perform that line/section.

[whispered]

[falsetto]

[powerful belting]

[spoken word]

[harmonies]

[ad-lib]

[call and response]

[gang vocals]

[backing vocals]

[choir]

[shouted]

[breath]

Duration, BPM, Key, Time signature

Control	What it does	Recommendation
Duration	Total seconds (10–600)	Idea: 30–60s · Verse+Chorus: 90–150s · Full song with bridge: 210–300s · Auto = inferred from lyrics
BPM	Beats per minute	Auto unless you know the target. Pop anthem 110–140, ballad 70–90, drill 140 half-time, house 124–128.
Key	Musical key + mode	Auto for most cases. Minor keys for dramatic, major for uplift.
Time signature	Meter	Default 4/4. Use 3/4 or 6/8 for waltz / ballad feel.
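
When choosing a duration, it can help to check how many bars it gives you at a given tempo. The helper below is a hypothetical sketch and assumes the BPM counts the beats of the meter (quarter notes in 4/4).

```python
def bar_count(duration_s: float, bpm: float, beats_per_bar: int = 4) -> float:
    """Approximate number of bars in a track of the given length."""
    beats = duration_s * bpm / 60  # total beats played in duration_s seconds
    return beats / beats_per_bar
```

For instance, 210 seconds at 124 BPM in 4/4 is 434 beats, or 108.5 bars, which is room for a full verse/chorus/bridge layout; 90 seconds at 80 BPM is 30 bars.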

Expert controls (Advanced Options)

Defaults work fine. These are for fine-tuning when you know what you’re doing.

Control	Backend param	Range / default	Effect
Global caption	`global_caption`	free text	Optional overall description fed alongside the prompt. Use for high-level vibe.
Track name	`track_name`	free text	Internal label for the track. Doesn’t affect audio.
Inference steps override	`inference_steps`	2–200, default per-model	More steps = more detail, slower. Fast: 8–16. Pro: 8. Max/Lab: 50.
Flow shift	`shift`	0.5–10.0, default 3.0	Flow-matching schedule shift. Higher = more weight on early steps (composition). Lower = more weight on detail.
Guidance scale	`guidance_scale`	0.5–15, default 1.0	How strictly to follow the prompt. Higher = more on-prompt but stiffer; lower = more creative.
ADG (anti-bias)	`use_adg`	auto/on/off	Anti-bias diffusion guidance. Sometimes improves coherence on long songs.
CFG window start/end	`cfg_interval_start` / `cfg_interval_end`	0.0–1.0, defaults 0.0/1.0	Limits when guidance is applied during diffusion. Narrower window = looser overall.
Tiled VAE decode	`use_tiled_decode`	on (default) / off	Tiles the VAE during decode to lower VRAM use. Turning it off is slightly faster on large GPUs.
CoT caption	`use_cot_caption`	on (default) / off	Chain-of-thought caption expansion via the LM. Turn off if you want the prompt used verbatim.
CoT language detect	`use_cot_language`	on (default) / off	LM auto-detects the lyric language. Turn off if you’re forcing a specific language.
LM top_p	`lm_top_p`	0–1	Nucleus sampling for the lyrics-planning LM (Studio Max thinking).
LM top_k	`lm_top_k`	0–200	Top-k sampling for the LM.
LM repetition penalty	`lm_repetition_penalty`	0.5–3.0, default 1.0	Penalize the LM for repeating tokens. Bump to ~1.1 if lyrics get loopy.
Creativity slider	`lm_temperature` / `lm_cfg_scale`	0–100%	Maps to LM temperature (0.45–1.0) and inverse CFG scale. Higher slider = wilder lyrics.
Negative styles	`lm_negative_prompt`	free text	What to push the LM/DiT away from. Use commas.
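
If you set these controls from code, a small range check before submitting can catch typos. The sketch below copies the numeric ranges from the table; the function name is hypothetical, and the rule that the CFG window start must not exceed its end is a reasonable assumption rather than something this guide states.

```python
# (min, max) for numeric expert controls, copied from the table.
RANGES = {
    "inference_steps": (2, 200),
    "shift": (0.5, 10.0),
    "guidance_scale": (0.5, 15.0),
    "cfg_interval_start": (0.0, 1.0),
    "cfg_interval_end": (0.0, 1.0),
    "lm_top_p": (0.0, 1.0),
    "lm_top_k": (0, 200),
    "lm_repetition_penalty": (0.5, 3.0),
}


def check_expert_params(params: dict) -> list:
    """Return human-readable problems; an empty list means all values fit."""
    problems = []
    for name, value in params.items():
        if name in RANGES:
            lo, hi = RANGES[name]
            if not lo <= value <= hi:
                problems.append(f"{name}={value} outside {lo}-{hi}")
    # Assumption: the guidance window must not be inverted.
    start = params.get("cfg_interval_start", 0.0)
    end = params.get("cfg_interval_end", 1.0)
    if start > end:
        problems.append("cfg_interval_start is after cfg_interval_end")
    return problems
```

Parameters that are not in the table (free-text fields, booleans) pass through unchecked.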

Edit / Repaint mode

Upload a song and rewrite parts of it (lyrics or section). The repaint engine generates new audio for masked sections and crossfades it back into the source.

Control	Backend param	What it does
Repaint Strength	`repaint_strength`	How much to alter the source. 0 = barely change, 1 = total rewrite.
Repaint mode	`repaint_mode`	`strict` (preserve harmony), `balanced` (default), `loose` (more creative)
Crossfade (sec)	`repaint_wav_crossfade_sec`	Wav-domain crossfade length. 0.0 default; bump to 0.3–0.8 for smoother joins on rough cuts.
Repainting start/end	`repainting_start` / `repainting_end`	Time window in the source to repaint. An end of `-1` means through the end of the song, so the defaults (0.0 / -1) repaint the full song.
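
As an illustration, the repaint controls can be collected into the non-file fields of an Edit request. This is a sketch under assumptions: `repaint_fields` is a hypothetical helper, the window is assumed to be expressed in seconds, the defaults are the ones listed in the API reference, and the source audio (`src_audio`) still has to be attached as a multipart upload.

```python
def repaint_fields(start_s=0.0, end_s=-1.0, strength=0.5,
                   mode="balanced", crossfade_s=0.0):
    """Non-file fields for an Edit (repaint) request."""
    if mode not in ("strict", "balanced", "loose"):
        raise ValueError(f"unknown repaint_mode: {mode}")
    if not 0.0 <= strength <= 1.0:
        raise ValueError("repaint_strength must be between 0 and 1")
    # -1 is the documented sentinel for "no explicit end".
    if end_s != -1 and end_s <= start_s:
        raise ValueError("repainting_end must be after repainting_start (or -1)")
    return {
        "task_type": "repaint",
        "repaint_strength": strength,
        "repaint_mode": mode,
        "repaint_wav_crossfade_sec": crossfade_s,
        "repainting_start": start_s,
        "repainting_end": end_s,
    }
```

For example, `repaint_fields(45.0, 60.0, crossfade_s=0.5)` targets a 15-second window with a smoother join than the default crossfade of 0.0.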

Remix / Cover mode

Re-imagine a song in a new style. Upload a reference and write a Song Description for the target vibe.

Control	Backend param	What it does
Cover Strength	`audio_cover_strength`	How tightly to follow the source structure (melody, timing). Lower = looser remix.
Remix Strength	`cover_noise_strength`	Creative noise injected on top. Higher = wilder departure from the source.
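
One way to find a good Cover Strength is to generate a few variants that differ only in that value. The sketch below builds one payload per strength and pins the seed so the comparison is fair; `cover_sweep` is a hypothetical helper, the strength values are illustrative (the accepted range is not documented here), and the reference audio still has to be uploaded with each request.

```python
def cover_sweep(prompt, strengths=(1.0, 0.7, 0.4), noise=0.0, seed=1234):
    """One payload per Cover Strength value, all sharing a fixed seed."""
    return [
        {
            "task_type": "cover",
            "prompt": prompt,
            "audio_cover_strength": strength,
            "cover_noise_strength": noise,
            "seed": seed,
            "use_random_seed": False,  # required for the fixed seed to apply
        }
        for strength in strengths
    ]
```

Listening to the results side by side shows how far the remix drifts from the source as the strength drops.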

Full examples

A. Modern pop anthem (Custom mode, Studio Pro)

Song Description

epic modern pop, dark dance pop, cinematic trap-pop, stadium anthem, huge hook, distorted bass, choir layers, glossy radio production, powerful female lead vocal, stacked harmonies, explosive chorus, dramatic bridge, punchy drums, wide synths, polished high-energy mix

Lyrics

[Intro - dark synth pulse]

Midnight
Gold light
Black car
White lies

[Verse 1 - tight modern pop flow]

I got a new face in the rearview
Old me tried to call, I hit decline
Heart in my chest like a loaded weapon
Aiming at the version I left behind

[Pre-Chorus - rising]

Tell the angels I’m unavailable
Tell my demons they can wait in line
Got a thousand voices in my head tonight
But the only one I’m hearing is mine

[Chorus - huge, anthemic]

I go god mode after midnight
Hands up when the lightning hits
Diamonds in the dirt, baby that’s my life
I’m the storm, I’m the calm, I’m the lit fuse fire

[Bridge - stripped, emotional]

I used to beg for a little mercy
Now mercy looks like me
I used to fold like paper
Now I burn the whole damn page

[Final Chorus - massive stadium ending]

I go god mode after midnight
Whole world watching me ignite
Every scar a spotlight, every tear a strobe
I’m the storm, I’m the calm, I’m the lit fuse fire

[Outro - choir and vocal chop fade]

Midnight
Gold light
Black car
White lies

Settings

Mode: Custom · Model: Bygheart Studio Pro
Voice: Female · Delivery: Powerful belting
Duration: 210s · Tempo: 124 · Key: F Minor · Time: 4/4
Negative styles: country, acoustic folk, muddy mix, weak vocals

B. Lo-fi instrumental (Simple mode, Fast)

Prompt: lo-fi jazzhop, mellow piano, warm vinyl crackle, dusty drum break, sub bass, instrumental

Settings: Simple · Fast · Instrumental ON · Duration 90s · Tempo 80 · Key C Major

C. Spoken-word rap (Custom, Studio Max)

Prompt: hard-hitting drill, dark trap, knocking 808s, hypnotic flute, gritty male rap, spoken word delivery

Voice: Male · Delivery: Rap (auto-injects [spoken word])

Negative styles: singing, melodic vocals, soft pop

Troubleshooting

Symptom	Likely cause	Fix
Wrong language vocals	Vocal language mismatch	Set Language to your target. Add to negative: `non-English vocals, translated lyrics`.
Lyrics rewritten	Sample mode, Enhance, or Lyric-aware planning is on	Use Custom mode, turn off Enhance prompt/lyrics and Lyric-aware planning, and don’t use the Sample button.
Female voice when you wanted male (or vice versa)	ACE-Step doesn’t hard-switch; vocal style is a hint	Re-emphasize in Song Description (`raspy male lead vocal, low register`) and add to negative: `female vocals, high register`. Try multiple seeds.
Muddy or weak mix	Underspecified prompt	Add concrete production words: `punchy snare, sub-bass, wide stereo, polished mix`. Add to negative: `muddy mix, weak vocals`.
Static or 0-second audio	Backend hiccup or task expired	Try again. Check Library → Pending. Refresh the page if a task is stuck for more than 10 minutes.
Daily limit hit	30/day cap on Pro	Wait until midnight UTC, switch to Fast (1 credit), or upgrade to Ultimate.
Quota exceeded	Used your 200 + rollover	Wait for monthly reset (1st of next month) or upgrade.
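
If a script needs to wait out a limit, the two reset points in the table can be computed directly. This is a sketch with hypothetical function names; it assumes the monthly reset happens at 00:00 UTC on the 1st, whereas the guide only says "1st of next month".

```python
from datetime import datetime, timedelta, timezone


def next_daily_reset(now: datetime) -> datetime:
    """Next midnight UTC after `now` (which must be timezone-aware)."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)


def next_monthly_reset(now: datetime) -> datetime:
    """The 1st of the following month; 00:00 UTC is an assumption."""
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)
```

Subtracting `datetime.now(timezone.utc)` from either result gives the time left to wait.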

API reference

Users on Pro or higher can generate songs programmatically. Get a key at /api-keys.html. Quota is shared with the Studio. Failed generations are auto-refunded.

POST /bygheartsong/api/release_task

curl -X POST https://iamvision.tech/bygheartsong/api/release_task \
  -H "Authorization: Bearer bgh_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bygheart-studio-pro",
    "prompt": "epic modern pop, female belt vocal, glossy mix",
    "lyrics": "[Verse]\n...\n[Chorus]\n...",
    "audio_duration": 180,
    "audio_format": "mp3",
    "vocal_language": "en",
    "use_random_seed": true,
    "task_type": "text2music",
    "lm_negative_prompt": "muddy mix, weak vocals"
  }'
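
The same request can be prepared from Python with the standard library. This is a minimal sketch, not an official client: the function names are hypothetical, and it assumes the response body is JSON, since the response schema is not documented in this guide.

```python
import json
import urllib.request

API_BASE = "https://iamvision.tech/bygheartsong/api"


def build_release_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to release_task."""
    return urllib.request.Request(
        f"{API_BASE}/release_task",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Pass the same fields as the curl body, for example `send(build_release_request("bgh_YOUR_KEY", {"model": "bygheart-studio-pro", "prompt": "epic modern pop, female belt vocal, glossy mix", "task_type": "text2music"}))`.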

All accepted fields

Field	Type	Default	Notes
model	string	Fast (`bygheart-melody-fast`)	One of the `bygheart-*` aliases above.
prompt	string	—	Song Description.
lyrics	string	"" = LM-generated	Use `[Section]` tags + blank lines.
audio_duration	float (sec)	auto	10–600.
vocal_language	string	"en"	en/zh/ja/es/ko
audio_format	string	"mp3"	mp3/wav/flac
batch_size	int	1	1/2/4 — generates N variations.
seed	int	-1	-1 = random. Set + use_random_seed=false to reproduce.
use_random_seed	bool	true	Set to false together with a fixed `seed` to reproduce a result.
bpm	int|null	null	null = Auto.
key_scale	string	""	e.g. "C Major"
time_signature	string	""	e.g. "4/4"
thinking	bool	false	Enable LM lyric planning. Auto-on for Studio Max.
use_format	bool	false	Enhance prompt/lyrics via LM.
sample_mode	bool	false	Used by the Sample button.
task_type	string	"text2music"	"text2music" / "cover" (remix) / "repaint" (edit)
lm_model_path	string	"acestep-5Hz-lm-1.7B"	1.7B fast, 4B for Studio Max thinking.
lm_temperature	float	~0.78	LM creativity.
lm_cfg_scale	float	~2.4	LM guidance.
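lm_top_p	float	—	0–1. Nucleus sampling for the LM.
lm_top_k	int	—	0–200. Top-k sampling for the LM.
lm_repetition_penalty	float	1.0	0.5–3.0. Try ~1.1 if lyrics get loopy.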
lm_negative_prompt	string	—	Comma-separated negatives.
guidance_scale	float	1.0	DiT classifier-free guidance.
shift	float	3.0	Flow-matching shift.
cfg_interval_start, cfg_interval_end	float	0.0, 1.0	0.0–1.0. Window in which guidance is applied.
use_adg	bool	—	Anti-bias guidance.
use_tiled_decode	bool	true	VAE tiling.
use_cot_caption, use_cot_language	bool	true	LM CoT helpers.
global_caption	string	—	Optional overall description.
track_name	string	—	Internal label.
audio_cover_strength	float	1.0	Remix only.
cover_noise_strength	float	0.0	Remix only.
repaint_strength	float	0.5	Edit only.
repaint_mode	string	"balanced"	strict / balanced / loose
repaint_wav_crossfade_sec	float	0.0	Edit only.
repainting_start, repainting_end	float	0.0, -1	Edit only.
reference_audio	file	—	multipart upload (Remix mode).
src_audio	file	—	multipart upload (Remix / Edit modes).

POST /bygheartsong/api/query_result

curl -X POST https://iamvision.tech/bygheartsong/api/query_result \
  -H "Authorization: Bearer bgh_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "task_id_list": ["TASK_ID_FROM_RELEASE"] }'
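
Since generation is asynchronous, a client typically calls `query_result` repeatedly until the task is done. The helper below is a generic polling sketch: `poll_until`, its parameters, and the 5-second interval are all hypothetical, and the "finished" check is left to a caller-supplied function because the response schema is not documented in this guide. The 600-second default mirrors the 10-minute stuck-task guidance in Troubleshooting.

```python
import time


def poll_until(fetch, is_finished, interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Call fetch() until is_finished(result) is true or the timeout elapses.

    fetch: zero-argument callable that POSTs to query_result and returns
    the decoded body. is_finished: predicate supplied by the caller.
    """
    waited = 0.0
    while True:
        result = fetch()
        if is_finished(result):
            return result
        if waited >= timeout_s:
            raise TimeoutError(f"task not finished after {timeout_s:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Injecting `sleep` keeps the helper easy to test without real delays.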

GET /bygheartsong/api/v1/audio?path=...

curl -L -H "Authorization: Bearer bgh_YOUR_KEY" \
  "https://iamvision.tech/bygheartsong/api/v1/audio?path=/abs/path/from/result.mp3" \
  -o song.mp3
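
Result paths may contain spaces or other characters that need escaping in a query string. The sketch below builds the download URL with the standard library; `audio_url` is a hypothetical helper, and keeping `/` unescaped is a choice made to match the curl example above.

```python
from urllib.parse import quote, urlencode

AUDIO_ENDPOINT = "https://iamvision.tech/bygheartsong/api/v1/audio"


def audio_url(path: str) -> str:
    """Build the download URL, percent-encoding the path query parameter."""
    query = urlencode({"path": path}, safe="/", quote_via=quote)
    return f"{AUDIO_ENDPOINT}?{query}"
```

The request still needs the same `Authorization: Bearer` header as the other endpoints.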

Errors

Code	Body.error	Meaning
401	`unauthenticated`	Missing / invalid `bgh_` key.
403	`pro_tier_required`	Free tier blocked. Upgrade.
429	`quota_exceeded`	Monthly credits used up.
429	`daily_limit_exceeded`	Hit today’s 30-credit cap.
502	`quota_check_failed`	Supabase RPC unreachable. Retry.
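
A client can map these errors to a next step. The lookup below is a sketch: the status codes and `error` strings come from the table, while the action names and the `classify_error` function are hypothetical, and it assumes the error body is a JSON object with an `error` field, as the "Body.error" column suggests.

```python
# Suggested client reaction per documented error; action names are hypothetical.
ACTIONS = {
    (401, "unauthenticated"): "fix_key",
    (403, "pro_tier_required"): "upgrade",
    (429, "quota_exceeded"): "wait_for_monthly_reset",
    (429, "daily_limit_exceeded"): "wait_for_daily_reset",
    (502, "quota_check_failed"): "retry",
}


def classify_error(status: int, body: dict) -> str:
    """Map an HTTP status plus the body's `error` field to a next step."""
    return ACTIONS.get((status, body.get("error")), "unknown")
```

Keying on both values matters because the two 429 responses call for different waits.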