🎵 MagentaRT Research API

AI Music Generation API • Real-time streaming • Custom fine-tune support

Research Project

⚙️ Environment variables (optional, but helpful)

You can boot this Space directly into your own finetune by setting the variables below in Settings → Variables and secrets → Variables. If you don't set them, you can still select models at runtime using /model/select from the frontend/API.

Quick start: to make a finetune the default on boot, set MRT_CKPT_REPO, MRT_CKPT_STEP, and MRT_SIZE to the example values listed in the table below.

Those values correspond to the example finetune in this repo (checkpoint_1863001.tgz on top of the large base).

| Name | What it does | Example | When to set |
|---|---|---|---|
| MRT_CKPT_REPO | Hugging Face repo ID that hosts your finetune checkpoints/assets. | thepatch/magenta-ft | Set to make this finetune the default on boot. |
| MRT_CKPT_STEP | Checkpoint step number to load on boot. | 1863001 | Set if you want a specific checkpoint preselected. |
| MRT_SIZE | Base model family used by the finetune (e.g., large). | large | Set to match the base you finetuned from. |
| SPACE_MODE | Controls readiness behavior: serve (GPU, ready to generate) vs template (CPU template for duplication). If unset, the server auto-detects. | serve or template | Set for explicit behavior; otherwise it falls back to auto-detection. |

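To make the table concrete, here is a minimal sketch of how a process could read these four variables. The function name `boot_config` and the returned dictionary layout are hypothetical and are not taken from this Space's server code; the archive name follows the `checkpoint_<step>.tgz` pattern mentioned above.

```python
import os


def boot_config(env=None):
    """Collect the boot-time settings from the table above.

    Hypothetical helper: the real server may parse these differently.
    """
    env = os.environ if env is None else env
    step = env.get("MRT_CKPT_STEP")   # e.g. "1863001"
    mode = env.get("SPACE_MODE")      # "serve", "template", or unset
    if mode not in (None, "serve", "template"):
        raise ValueError(f"SPACE_MODE must be 'serve' or 'template', got {mode!r}")
    return {
        "ckpt_repo": env.get("MRT_CKPT_REPO"),   # e.g. "thepatch/magenta-ft"
        "ckpt_step": int(step) if step else None,
        "size": env.get("MRT_SIZE"),             # e.g. "large"
        # Unset means the server auto-detects; "auto" is just a label used here.
        "space_mode": mode or "auto",
        # Archive name implied by the step number.
        "archive": f"checkpoint_{step}.tgz" if step else None,
    }
```

Passing a plain dictionary instead of `os.environ` makes it easy to check a configuration before saving it in the Space settings.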
Alternative: select a model at runtime via API
curl -X POST https://<your-space>.hf.space/model/select \
  -H 'Content-Type: application/json' \
  -d '{
    "ckpt_repo": "thepatch/magenta-ft",
    "ckpt_step": 1863001,
    "size": "large",
    "prewarm": true
  }'

When you set "prewarm": true, the backend performs a bar-aligned warmup before returning, so the first jam starts hot.
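The same request can be built from Python with only the standard library. This is a sketch: the helper name `build_model_select_request` is hypothetical, and the base URL below is a placeholder for your own Space URL. The JSON fields mirror the curl example.

```python
import json
import urllib.request


def build_model_select_request(space_url, ckpt_repo, ckpt_step, size, prewarm=True):
    """Build (but do not send) a POST to /model/select."""
    payload = {
        "ckpt_repo": ckpt_repo,
        "ckpt_step": ckpt_step,
        "size": size,
        "prewarm": prewarm,
    }
    return urllib.request.Request(
        url=space_url.rstrip("/") + "/model/select",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host; replace with your own Space URL.
req = build_model_select_request(
    "https://example.hf.space", "thepatch/magenta-ft", 1863001, "large"
)
# To send it: urllib.request.urlopen(req, timeout=120)
```

Because prewarm makes the backend warm up before it returns, a generous client timeout is a reasonable precaution; the 120 seconds shown is an arbitrary choice, not a documented figure.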

Open Realtime Web Tester

📱 App Demo Video

iPhone app generating music in real-time

Overview

This API powers AI music generation with Google's MagentaRT and is designed for real-time audio streaming from finetunes hosted on Hugging Face. It is built for iOS app integration and supports WebSocket streaming.

Hardware Requirements: Real-time streaming works best on an L40S GPU (48GB VRAM). An L4 (24GB) comes close but does not reach real-time performance (if you know an optimization that closes the gap, please let me know).

Quick Start - WebSocket Streaming

Connect to wss://<your-space>/ws/jam for real-time audio generation:

Start Real-time Generation

{
  "type": "start",
  "mode": "rt",
  "binary_audio": false,
  "params": {
    "styles": "electronic, ambient",
    "style_weights": "1.0, 0.8",
    "temperature": 1.1,
    "topk": 40,
    "guidance_weight": 1.1,
    "pace": "realtime",
    "style_ramp_seconds": 8.0,
    "mean": 0.0,
    "centroid_weights": "0.0, 0.0, 0.0"
  }
}

Update Parameters Live

{
  "type": "update",
  "styles": "jazz, hiphop",
  "style_weights": "1.0, 0.8",
  "temperature": 1.2,
  "topk": 64,
  "guidance_weight": 1.0,
  "mean": 0.2,
  "centroid_weights": "0.1, 0.3, 0.0"
}

Stop Generation

{"type": "stop"}
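The three messages above can be produced by small helpers, sketched below. Two details are taken directly from the examples: styles and weights travel as comma-separated strings, and the start message nests its settings under "params" while the update message keeps them at the top level. The helper names and the check that styles and weights have equal length are assumptions, not part of the documented protocol.

```python
import json


def _csv(values):
    """Join values into the comma-separated string form used by the examples."""
    return ", ".join(str(v) for v in values)


def _style_fields(styles, style_weights):
    # Assumption: one weight per style, as in the examples above.
    if len(styles) != len(style_weights):
        raise ValueError("styles and style_weights must have the same length")
    return {"styles": _csv(styles), "style_weights": _csv(style_weights)}


def start_message(styles, style_weights, **params):
    """Start message: settings are nested under "params"."""
    body = _style_fields(styles, style_weights)
    body.update(params)
    return json.dumps(
        {"type": "start", "mode": "rt", "binary_audio": False, "params": body}
    )


def update_message(styles, style_weights, **params):
    """Update message: settings sit at the top level."""
    msg = {"type": "update", **_style_fields(styles, style_weights)}
    msg.update(params)
    return json.dumps(msg)


def stop_message():
    return json.dumps({"type": "stop"})
```

Each function returns a JSON string ready to send as a text frame with any WebSocket client.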

API Endpoints

POST /generate - Generate 4–8 bars of music with input audio
POST /generate_style - Generate music from style prompts only (experimental)
POST /jam/start - Start continuous jamming session
GET /jam/next - Get next audio chunk from session
POST /jam/consume - Mark chunk as consumed
POST /jam/stop - End jamming session
WEBSOCKET /ws/jam - Real-time streaming interface
POST /model/select - Switch between base and fine-tuned models
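For the HTTP jam endpoints, the list above suggests a start, next, consume, stop cycle. The sketch below shows only that call order. The `send(method, path)` callable is a hypothetical transport you would supply, and request or response bodies (session identifiers, chunk indices, audio encoding) are not documented in this section, so they are deliberately left out.

```python
def run_jam(send, chunks):
    """Drive one jam session through the HTTP endpoints in order.

    `send(method, path)` is a hypothetical transport supplied by the caller.
    Payloads are omitted because they are not specified in this section.
    """
    received = []
    send("POST", "/jam/start")
    try:
        for _ in range(chunks):
            received.append(send("GET", "/jam/next"))  # fetch the next chunk
            send("POST", "/jam/consume")               # mark it as consumed
    finally:
        # Always end the session, even if fetching a chunk fails.
        send("POST", "/jam/stop")
    return received
```

Wrapping the loop in try/finally is a design choice made here so that a failed request does not leave a session running on the GPU.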

Custom Fine-Tuning

Train your own MagentaRT models and use them with this API and the iOS app.

1. Train Your Model

Use the official MagentaRT fine-tuning notebook:

🔗 MagentaRT Fine-tuning Colab

This will create checkpoint folders like:

  • checkpoint_1861001/
  • checkpoint_1862001/
  • And steering assets: cluster_centroids.npy, mean_style_embed.npy

2. Package Checkpoints

Checkpoints must be compressed as .tgz files to preserve .zarray files correctly.

Important: Do not download checkpoint folders directly from Google Drive - the .zarray files won't transfer properly.

Checkpoint Packaging Script

Use this in a Colab cell to properly package your checkpoints:

# Mount Drive to access your trained checkpoints
from google.colab import drive
drive.mount('/content/drive')

# Set the path to your checkpoint folder
CKPT_SRC = '/content/drive/MyDrive/thepatch/checkpoint_1862001'  # Adjust path

# Copy folder to local storage (preserves dotfiles)
!rm -rf /content/checkpoint_1862001
!cp -a "$CKPT_SRC" /content/

# Verify .zarray files are present
!find /content/checkpoint_1862001 -name .zarray | wc -l

# Create properly formatted .tgz archive
!tar -C /content -czf /content/checkpoint_1862001.tgz checkpoint_1862001

# Verify critical files are in the archive (-F matches '.zarray' literally, not as a regex)
!tar -tzf /content/checkpoint_1862001.tgz | grep -cF '.zarray'

# Download the .tgz file
from google.colab import files
files.download('/content/checkpoint_1862001.tgz')
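If you prefer to stay in Python rather than shell commands, the packaging and verification steps can be sketched with the standard `tarfile` module. The function names are hypothetical. The archive keeps the checkpoint folder as its top-level entry, matching what the `tar -C` command above produces.

```python
import tarfile
from pathlib import Path


def package_checkpoint(ckpt_dir, out_dir):
    """Write <out_dir>/<folder name>.tgz with the folder as the top-level entry."""
    ckpt_dir = Path(ckpt_dir)
    out_path = Path(out_dir) / f"{ckpt_dir.name}.tgz"
    with tarfile.open(out_path, "w:gz") as tar:
        # add() recurses into the folder and includes dotfiles such as .zarray.
        tar.add(ckpt_dir, arcname=ckpt_dir.name)
    return out_path


def count_zarray(tgz_path):
    """Count .zarray entries in the archive; 0 means the checkpoint is unusable."""
    with tarfile.open(tgz_path, "r:gz") as tar:
        return sum(1 for name in tar.getnames() if Path(name).name == ".zarray")
```

Comparing `count_zarray` on the archive with the count reported by the `find` command is a quick way to confirm nothing was dropped.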

3. Upload to Hugging Face

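Create a model repository and upload your checkpoint .tgz archives together with the .npy steering assets (cluster_centroids.npy, mean_style_embed.npy) to the repository root.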

Example Repository: thepatch/magenta-ft
Shows the correct file structure with .tgz files and .npy steering assets in the root directory.
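The upload can also be scripted. The sketch below assumes the third-party `huggingface_hub` package is installed and that you are already logged in; `plan_uploads` and `upload_all` are hypothetical helper names. Files are placed at the repository root, as in the example repository.

```python
from pathlib import Path

# Steering assets named in the training step above.
STEERING_ASSETS = ("cluster_centroids.npy", "mean_style_embed.npy")


def plan_uploads(local_dir):
    """Return (local_path, path_in_repo) pairs for archives and steering assets."""
    local_dir = Path(local_dir)
    files = sorted(local_dir.glob("checkpoint_*.tgz"))
    files += [local_dir / n for n in STEERING_ASSETS if (local_dir / n).exists()]
    # path_in_repo is the bare file name, so everything lands in the repo root.
    return [(str(p), p.name) for p in files]


def upload_all(repo_id, local_dir):
    """Create the model repo if needed and upload every planned file."""
    from huggingface_hub import HfApi  # third-party; imported lazily

    api = HfApi()
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    for local_path, path_in_repo in plan_uploads(local_dir):
        api.upload_file(
            path_or_fileobj=local_path,
            path_in_repo=path_in_repo,
            repo_id=repo_id,
            repo_type="model",
        )
```

Calling `plan_uploads` on its own first lets you confirm which files would be sent before anything is uploaded.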

4. Use in the App

In the iOS app's model selector, point to your Hugging Face repository URL. The app will automatically discover available checkpoints and allow switching between them.
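How the app discovers checkpoints is not described here. One plausible approach, sketched below as an assumption, is to scan the repository's file names for the `checkpoint_<step>.tgz` pattern. The function name `discover_steps` is hypothetical.

```python
import re

# Matches root-level archives such as "checkpoint_1863001.tgz".
_CKPT_RE = re.compile(r"^checkpoint_(\d+)\.tgz$")


def discover_steps(filenames):
    """Return the sorted checkpoint step numbers found in a list of file names."""
    steps = []
    for name in filenames:
        match = _CKPT_RE.match(name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)
```

The file list could come from `huggingface_hub.list_repo_files(repo_id)`. Each step returned is a candidate value for the "ckpt_step" field of /model/select.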

Technical Specifications

Note: The /generate_style endpoint is experimental and may not adhere to the requested BPM without additional context (a metronome-based context, instead of silence, is under consideration).

Integration with iOS App

This API is designed to work seamlessly with our iOS music generation app.

Deployment

To run your own instance:

  1. Duplicate this Hugging Face Space
  2. Ensure you have access to an L40S GPU
  3. Point your iOS app to the new space URL (e.g., https://your-username-magenta-retry.hf.space)
  4. Upload your fine-tuned models as described above
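The Space URL in step 3 follows the owner-name pattern shown in the example. A small sketch for deriving both the HTTPS base URL and the WebSocket jam URL is given below. The normalization rule (lowercase, runs of non-alphanumeric characters collapsed to a hyphen) is an assumption about how Space subdomains are formed, so check the result against the URL shown on your Space page.

```python
import re


def space_host(owner, space_name):
    """Derive the <owner>-<space>.hf.space host (normalization is an assumption)."""
    slug = re.sub(r"[^a-z0-9]+", "-", f"{owner}-{space_name}".lower()).strip("-")
    return f"{slug}.hf.space"


def space_urls(owner, space_name):
    """Return the HTTPS base URL and the WebSocket jam URL for a Space."""
    host = space_host(owner, space_name)
    return {"http": f"https://{host}", "ws_jam": f"wss://{host}/ws/jam"}
```

For example, `space_urls("your-username", "magenta-retry")["http"]` yields the URL shown in step 3.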

Support & Contact

This is an active research project. For questions, technical support, or collaboration:

Email: kev@thecollabagepatch.com

Research Status: This project is under active development. Features and API may change. We welcome feedback and contributions from the research community.

Licensing

Built on Google's MagentaRT (Apache 2.0 + CC-BY 4.0). Users are responsible for their generated outputs and ensuring compliance with applicable laws and platform policies.

📖 API Reference Documentation