Marigold Model Registry -- Qwen, Mistral, DeepSeek API

depth

facebook/dpt-dinov2-small-kitti

Monocular depth estimation model from Meta. Estimates per-pixel depth from a single RGB image. Trained on KITTI driving data. Good for scene geometry estimation and 3D-aware applications.

image → depth

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "depth",
    "model_name": "facebook/dpt-dinov2-small-kitti",
    "input": "..."
  }'
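The same call can be prepared from Python. The following is a minimal sketch using only the standard library; it builds the request without sending it. The helper name `build_infer_request` is hypothetical, and because the response format of the async API is not shown here, submission and result handling are left out.

```python
import json
import urllib.request

API_URL = "https://api.marigold.run/infer"  # endpoint taken from the curl example


def build_infer_request(api_key: str, model_type: str, model_name: str,
                        payload_input: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = {
        "model_type": model_type,
        "model_name": model_name,
        "input": payload_input,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The "..." input is kept as a placeholder, exactly as in the curl example.
req = build_infer_request("your-api-key", "depth",
                          "facebook/dpt-dinov2-small-kitti", "...")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is omitted because it needs a real key.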

http

HTTP tool worker. Executes outbound HTTP requests as workflow steps. Used for webhook delivery, external API calls, and URL fetching within multi-step pipelines.

json → json

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "http",
    "model_name": "http",
    "input": "..."
  }'

image embedding

google/siglip-base-patch16-224

Vision-language embedding model from Google. Good for image-text similarity, cross-modal retrieval, and visual content classification. Base variant; faster and lighter than the so400m variant. 768-dimensional output.

image → vector/768

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "image-embedding",
    "model_name": "google/siglip-base-patch16-224",
    "input": ""
  }'

google/siglip2-so400m-patch14-224

Large vision-language embedding model from Google. Higher quality than the base variant for image-text alignment and cross-modal retrieval tasks. 1152-dimensional output. Suited for quality-gated image generation pipelines.

image → vector/1152

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "image-embedding",
    "model_name": "google/siglip2-so400m-patch14-224",
    "input": ""
  }'
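Embedding vectors such as these are usually compared with cosine similarity. The sketch below shows that comparison on made-up three-dimensional vectors; real outputs would have 768 or 1152 entries, and how the API returns them is not shown here, so plain Python lists are assumed.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors for illustration only.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0
```

Vectors from different models (for example the 768- and 1152-dimensional variants) live in different spaces and should not be compared with each other.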

image eval

cafeai/cafe_aesthetic

Aesthetic quality scoring model for images. Rates visual appeal on a continuous scale. Used in quality-gated image generation pipelines to filter low-aesthetic outputs.

image → json

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "image-eval",
    "model_name": "cafeai/cafe_aesthetic",
    "input": "..."
  }'

falconsai/nsfw_image_detection

NSFW image classification model. Detects inappropriate or explicit content in images. Used in content moderation pipelines and safety-gated image generation outputs.

image → json

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "image-eval",
    "model_name": "falconsai/nsfw_image_detection",
    "input": "..."
  }'

image text eval

clip-vit-b-32

CLIP ViT-B/32 image-text alignment model. Lightweight variant for computing image-text similarity scores. Used for prompt alignment evaluation in generation quality pipelines.

image+text → json

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "image-text-eval",
    "model_name": "clip-vit-b-32",
    "input": "..."
  }'

openai/clip-vit-large-patch14

CLIP ViT-L/14 from OpenAI. Computes similarity scores between images and text descriptions. Used in quality-gated image generation pipelines to score output against the original prompt.

image+text → scores

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "image-text-eval",
    "model_name": "openai/clip-vit-large-patch14",
    "input": "..."
  }'
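The image eval and image text eval entries above all mention quality-gated generation pipelines. A minimal sketch of such a gate follows, assuming each model's result has already been reduced to a single float. The response schema is not shown here, so the score names, their ranges, and every threshold value are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateThresholds:
    # All cut-offs are hypothetical placeholders; tune them on your own data.
    min_aesthetic: float = 0.5
    max_nsfw: float = 0.2
    min_alignment: float = 0.25


def passes_quality_gate(aesthetic: float, nsfw: float, alignment: float,
                        thresholds: GateThresholds) -> bool:
    """Return True only if an image clears all three checks."""
    return (
        aesthetic >= thresholds.min_aesthetic
        and nsfw <= thresholds.max_nsfw
        and alignment >= thresholds.min_alignment
    )


limits = GateThresholds()
print(passes_quality_gate(0.8, 0.01, 0.31, limits))  # True
print(passes_quality_gate(0.8, 0.90, 0.31, limits))  # False
```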

img2mask

facebook/sam-vit-huge

Segment Anything Model (SAM) ViT-Huge from Meta. Produces high-quality segmentation masks from point or box prompts, or automatically for the whole image. Requires GPU for practical inference speed.

image → mask

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "img2mask",
    "model_name": "facebook/sam-vit-huge",
    "input": "..."
  }'

instruct

deepseek-ai/deepseek-r1-distill-llama-8b

8B reasoning model distilled from DeepSeek-R1 into a Llama 3.1 8B base. Applies extended chain-of-thought before responding. Strong on complex reasoning, coding, and mathematical tasks. Extended KV cache growth during reasoning requires additional memory headroom. MIT licence.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/deepseek-r1-distill-llama-8b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
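Because the endpoint is described as OpenAI-compatible, the request body and the reply can be handled with the usual chat completion layout. The sketch below assumes the response follows the standard `choices[0].message.content` structure; the sample response is made up, and both helper names are hypothetical.

```python
import json
from typing import Dict, List

Message = Dict[str, str]


def build_chat_body(model: str, user_message: str, system_prompt: str = "") -> str:
    """Serialise a chat completion request body as JSON."""
    messages: List[Message] = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_message})
    return json.dumps({"model": model, "messages": messages})


def first_reply(response: dict) -> str:
    """Extract the assistant text, assuming the standard OpenAI response layout."""
    return response["choices"][0]["message"]["content"]


body = build_chat_body("deepseek-ai/deepseek-r1-distill-llama-8b", "Hello")
print(body)

# Made-up response, used only to exercise the parser.
sample_response = {"choices": [{"message": {"role": "assistant", "content": "Hi there"}}]}
print(first_reply(sample_response))  # Hi there
```

The string returned by `build_chat_body` corresponds to the `-d` argument of the curl example.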

deepseek-ai/deepseek-r1-distill-qwen-7b

7B reasoning model distilled from DeepSeek-R1 into the Qwen2.5 base. Applies chain-of-thought reasoning before responding. Strong on multi-step problem solving and mathematical reasoning. MIT licence.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/deepseek-r1-distill-qwen-7b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
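The upstream DeepSeek-R1 distilled models emit their chain-of-thought between `<think>` and `</think>` tags ahead of the final answer. Whether this endpoint returns that reasoning inline, strips it, or puts it in a separate field is not stated here, so the following sketch assumes the inline form. The function name is hypothetical.

```python
from typing import Tuple


def split_reasoning(text: str, open_tag: str = "<think>",
                    close_tag: str = "</think>") -> Tuple[str, str]:
    """Split model output into (reasoning, answer).

    Tolerates a missing opening tag; returns empty reasoning if no
    closing tag is present at all.
    """
    head, separator, tail = text.partition(close_tag)
    if not separator:
        return "", text.strip()
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


raw = "<think>2 + 2 is 4.</think>\nThe answer is 4."
reasoning, answer = split_reasoning(raw)
print(reasoning)  # 2 + 2 is 4.
print(answer)     # The answer is 4.
```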

google/gemma-3-12b-it

Gemma 3 12B instruction model from Google. Strong reasoning, coding, and instruction following. 128K context window and multilingual support. Requires HuggingFace access approval.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-3-12b-it",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

google/gemma-3-27b-it

Gemma 3 27B instruction model from Google. Near-frontier quality at 4-bit quantisation (~14GB weights). 128K context window, multimodal architecture, 140+ languages. Requires HuggingFace access approval.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-3-27b-it",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

google/gemma-3-4b-it

Gemma 3 4B instruction model from Google. Good general-purpose chat and reasoning at a small parameter count. Multimodal-capable architecture. Requires HuggingFace access approval.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-3-4b-it",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

ibm-granite/granite-3.3-8b-instruct

8B instruction model from IBM. Apache 2.0 licence. Designed for enterprise use including RAG, tool use, and structured output. Strong instruction following with focus on reliability and auditability. Good fit for regulated sector workflows.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ibm-granite/granite-3.3-8b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

madeagents/hammer2.1-3b

3B instruction model tuned for agentic tool use and function calling. Good for structured action generation, tool-use pipelines, and lightweight agent tasks.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "madeagents/hammer2.1-3b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

meta-llama/llama-3.1-8b-instruct

Llama 3.1 8B instruction model from Meta. Strong reasoning, instruction following, and tool use capability. Good GPU-tier general-purpose model. Requires HuggingFace access approval.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.1-8b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

meta-llama/llama-3.2-1b-instruct

Llama 3.2 1B instruction model from Meta. Lightweight and fast on CPU. Good for simple chat, classification, and latency-sensitive tasks where the Llama family is preferred. Requires HuggingFace access approval.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.2-1b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

microsoft/phi-4

14B instruction model from Microsoft. MIT licence. Strong on reasoning, STEM tasks, and instruction following at 14B scale. Fits on gpu-sm with 4-bit quantisation (~7GB weights, ~10GB total with KV cache).

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/phi-4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
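The ~7GB weight figure follows from the parameter count and the bit width. The arithmetic is sketched below; it ignores quantisation overhead such as scales and zero points, so real checkpoints are somewhat larger. The function name is hypothetical.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


print(weight_memory_gb(14, 4))   # 7.0  -> matches the ~7GB quoted above
print(weight_memory_gb(14, 16))  # 28.0 -> the same model at 16-bit precision
```

The gap between the ~7GB of weights and the ~10GB total is the KV cache and runtime overhead, which grow with context length rather than with bit width.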

mistralai/mistral-7b-instruct-v0.3

7B instruction model from Mistral AI. Strong English reasoning and instruction following. Widely tested and well understood for general-purpose chat and structured output.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/mistral-7b-instruct-v0.3",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

qwen/qwen2-0.5b-instruct

Compact 0.5B instruction-following model from Alibaba. Good for simple chat, classification, and lightweight text tasks where low latency on CPU matters most.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2-0.5b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

qwen/qwen2-1.5b-instruct

1.5B instruction model from Alibaba. Balances speed and capability for general chat and simple reasoning tasks on CPU.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2-1.5b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

qwen/qwen2.5-7b-instruct

7B instruction model from Alibaba. Strong general-purpose reasoning, instruction following, and multilingual capability. Good CPU-tier workhorse for chat and structured output tasks.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2.5-7b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

qwen/qwen2.5-coder-7b-instruct

7B model specialised for code generation and understanding. Strong on code completion, explanation, debugging, and multi-language programming tasks. Good fit for coding assistant integrations.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2.5-coder-7b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

qwen/qwen3-14b

Qwen3 14B dense instruction model from Alibaba. Strong reasoning, multilingual capability, and instruction following. Fits on gpu-sm with 4-bit quantisation (~7GB weights, ~10GB total with KV cache).

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3-14b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

qwen/qwen3-4b

Qwen3 4B dense instruction model. Improved reasoning and instruction following over Qwen2 at the same parameter count. Good for general chat and structured output on CPU.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3-4b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

qwen/qwen3-4b-thinking-2507

Qwen3 4B model with extended reasoning (thinking) capability. Applies chain-of-thought before responding. KV cache growth during thinking warrants gpu-sm placement over CPU.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3-4b-thinking-2507",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
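KV cache growth can be estimated from the model shape: each generated token stores one key and one value vector per layer and per KV head. The sketch below uses an illustrative configuration (32 layers, 8 KV heads, head dimension 128, 16-bit values); these numbers are assumptions, not the published configuration of this model.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # Factor 2: one key tensor and one value tensor per layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configuration, chosen only for illustration.
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128

for tokens in (1_024, 8_192, 32_768):
    mib = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, tokens) / 2**20
    print(f"{tokens:>6} tokens -> {mib:.0f} MiB")
```

Long reasoning traces push `seq_len` up, which is why thinking models need more memory headroom than their parameter count alone suggests.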

qwen/qwen3-8b

Qwen3 8B dense instruction model. Strong general-purpose reasoning and chat. Efficient on gpu-sm with 4-bit quantisation and good throughput on T4.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3-8b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

qwen/qwen3.5-0.8b

Very compact 0.8B instruction model. Suited for latency-sensitive tasks, edge-like deployments, and high-throughput lightweight classification or routing.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3.5-0.8b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

qwen/qwen3.5-9b

Qwen3.5 9B dense instruction model. Strong reasoning and instruction following. Runs on gpu-sm with 4-bit quantisation.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3.5-9b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

tiiuae/falcon3-7b-instruct

7B instruction model from the Technology Innovation Institute. Trained on 14 trillion tokens. Strong general-purpose reasoning and instruction following for CPU-tier chat and structured output tasks. Apache 2.0 licence.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tiiuae/falcon3-7b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

zai-org/glm-4-9b-0414

9B instruction model from Zhipu AI. Strong multilingual capability with good Chinese and English performance. Suited for bilingual chat and document tasks.

chat → chat

OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zai-org/glm-4-9b-0414",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

text embedding

baai/bge-m3

Dense and sparse text embedding model supporting 100+ languages. Strong on multilingual retrieval, semantic search, and cross-lingual similarity. Good choice for RAG systems with mixed-language content. 1024-dimensional output.

text → vector/1024

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-embedding",
    "model_name": "baai/bge-m3",
    "input": "The text to embed"
  }'

baai/bge-small-en-v1.5

Small and efficient English embedding model from BAAI. Good balance of speed and quality for English-only semantic search and retrieval. 384-dimensional output.

text → vector/384

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-embedding",
    "model_name": "baai/bge-small-en-v1.5",
    "input": "The text to embed"
  }'

intfloat/multilingual-e5-large-instruct

560M multilingual instruction-following embedding model. Covers 94 languages. Takes an instruction prefix describing the embedding task, improving retrieval quality for domain-specific use cases. MIT licence.

text → vector/1024

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-embedding",
    "model_name": "intfloat/multilingual-e5-large-instruct",
    "input": "The text to embed"
  }'
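The instruction prefix mentioned above follows the template given in the upstream model card: queries are wrapped as `Instruct: ...` followed by `Query: ...`, while documents are embedded as-is. Whether this API adds the prefix server-side is not stated here, so the sketch below assumes the caller formats the query. The function name and the example task description are hypothetical.

```python
def format_e5_query(task_description: str, query: str) -> str:
    """Wrap a query with a task instruction for instruct-style E5 models."""
    return f"Instruct: {task_description}\nQuery: {query}"


task = "Given a web search query, retrieve relevant passages that answer the query"
text_to_embed = format_e5_query(task, "how tall is Snowdon")
print(text_to_embed)
```

The resulting string would be sent as the `input` field; documents in the index are embedded without any prefix.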

microsoft/harrier-oss-v1-0.6b

600M parameter text and image embedding model from Microsoft. Supports web search, semantic similarity, and bitext retrieval. Multimodal capable. 1024-dimensional output. Higher quality than the 270M variant.

text → vector/1024

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-embedding",
    "model_name": "microsoft/harrier-oss-v1-0.6b",
    "input": "The text to embed"
  }'

nomic-ai/nomic-embed-text-v1.5

English text embedding model from Nomic AI. 8192 token context. Strong on long document retrieval and semantic search. Matryoshka representation allows truncating to smaller dimensions without retraining. Apache 2.0 licence.

text → vector/768

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-embedding",
    "model_name": "nomic-ai/nomic-embed-text-v1.5",
    "input": "The text to embed"
  }'
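Matryoshka truncation means keeping the leading components of the vector and re-normalising to unit length. A minimal sketch on a toy vector follows; the function name is hypothetical, and the upstream model documentation may prescribe additional normalisation before truncation, which is not reproduced here.

```python
import math
from typing import List, Sequence


def truncate_embedding(vector: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale to unit L2 norm."""
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and the vector length")
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


# Toy 4-dimensional vector; a real output here would have 768 entries.
full = [3.0, 4.0, 12.0, 84.0]
print(truncate_embedding(full, 2))  # [0.6, 0.8]
```

All vectors in one index must be truncated to the same dimension before they are compared.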

sentence-transformers/all-minilm-l6-v2

Compact and fast English sentence embedding model. Good for high-throughput semantic search and clustering where latency matters more than top-tier accuracy. 384-dimensional output.

text → vector/384

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-embedding",
    "model_name": "sentence-transformers/all-minilm-l6-v2",
    "input": "The text to embed"
  }'

sentence-transformers/sentence-t5-large

English sentence embedding model based on T5-Large. Strong on semantic textual similarity and sentence retrieval tasks. Higher quality than smaller MiniLM variants at the cost of inference time.

text → vector/768

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-embedding",
    "model_name": "sentence-transformers/sentence-t5-large",
    "input": "The text to embed"
  }'

thenlper/gte-large

335M English embedding model with strong MTEB benchmark scores. Fast and compact. Good for high-throughput semantic search where bge-m3 is too heavy. MIT licence.

text → vector/1024

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-embedding",
    "model_name": "thenlper/gte-large",
    "input": "The text to embed"
  }'

text eval

dslim/bert-base-ner

BERT-base named entity recognition model. Extracts persons, organisations, locations, and miscellaneous entities from text. Good for document parsing and entity extraction pipelines.

text → json

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-eval",
    "model_name": "dslim/bert-base-ner",
    "input": "..."
  }'
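NER models of this kind label each token with BIO tags such as `B-PER`, `I-PER`, or `O`. The JSON schema returned by this API is not shown here, so the sketch below assumes two parallel lists of tokens and tags and shows how they can be merged into entity spans. The function name and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple


def group_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity text, entity type) pairs."""
    entities: List[Tuple[str, str]] = []
    current_words: List[str] = []
    current_type: Optional[str] = None
    for token, tag in zip(tokens, tags):
        prefix, _, entity_type = tag.partition("-")
        # A new entity starts on B-, or on an I- whose type differs.
        starts_new = prefix == "B" or (prefix == "I" and entity_type != current_type)
        if prefix == "O" or starts_new:
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
        if prefix in ("B", "I"):
            current_words.append(token)
            current_type = entity_type
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities


tokens = ["Angela", "Merkel", "visited", "Paris"]
tags = ["B-PER", "I-PER", "O", "B-LOC"]
print(group_bio(tokens, tags))  # [('Angela Merkel', 'PER'), ('Paris', 'LOC')]
```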

dslim/bert-large-ner

BERT-large named entity recognition model. Higher accuracy than the base variant for entity extraction. Better suited for precision-sensitive document parsing tasks.

text → json

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-eval",
    "model_name": "dslim/bert-large-ner",
    "input": "..."
  }'

openai/privacy-filter

PII detection and privacy filtering model. Identifies personal information in text including names, addresses, and identifiers. Used in data sanitisation pipelines before downstream processing.

text → json

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "text-eval",
    "model_name": "openai/privacy-filter",
    "input": "..."
  }'

tts

facebook/mms-tts-cym

Welsh text-to-speech model from Meta MMS. One of very few open-weight TTS models with Welsh language support.

text → speech

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "tts",
    "model_name": "facebook/mms-tts-cym",
    "text": "Hello world",
    "language_code": "en-gb"
  }'

facebook/mms-tts-deu

German text-to-speech model from Meta MMS.

text → speech

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "tts",
    "model_name": "facebook/mms-tts-deu",
    "text": "Hello world",
    "language_code": "en-gb"
  }'

facebook/mms-tts-eng

English text-to-speech model from Meta MMS. Fast and lightweight for speech synthesis in English.

text → speech

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "tts",
    "model_name": "facebook/mms-tts-eng",
    "text": "Hello world",
    "language_code": "en-gb"
  }'

facebook/mms-tts-fin

Finnish text-to-speech model from Meta MMS.

text → speech

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "tts",
    "model_name": "facebook/mms-tts-fin",
    "text": "Hello world",
    "language_code": "en-gb"
  }'

facebook/mms-tts-fra

French text-to-speech model from Meta MMS.

text → speech

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "tts",
    "model_name": "facebook/mms-tts-fra",
    "text": "Hello world",
    "language_code": "en-gb"
  }'

facebook/mms-tts-nld

Dutch text-to-speech model from Meta MMS.

text → speech

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "tts",
    "model_name": "facebook/mms-tts-nld",
    "text": "Hello world",
    "language_code": "en-gb"
  }'

facebook/mms-tts-spa

Spanish text-to-speech model from Meta MMS.

text → speech

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "tts",
    "model_name": "facebook/mms-tts-spa",
    "text": "Hello world",
    "language_code": "en-gb"
  }'

txt2audio

cvssp/audioldm2

Latent diffusion model for general audio and sound effect generation. Produces diverse audio from text prompts. Good for sound design and non-music audio generation.

text → audio

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "txt2audio",
    "model_name": "cvssp/audioldm2",
    "input": "..."
  }'

facebook/musicgen-small

Small music generation model from Meta. Generates short audio clips from text descriptions. Good for background music generation and audio prototyping.

text → audio

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "txt2audio",
    "model_name": "facebook/musicgen-small",
    "input": "..."
  }'

txt2img

black-forest-labs/flux.1-schnell

FLUX.1 Schnell from Black Forest Labs. Fast high-quality text-to-image generation. Schnell variant optimised for speed with minimal quality trade-off.

text → image

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "txt2img",
    "model_name": "black-forest-labs/flux.1-schnell",
    "input": "..."
  }'

stabilityai/stable-diffusion-3.5-large-turbo

Stable Diffusion 3.5 Large Turbo from Stability AI. High-quality text-to-image generation with strong prompt adherence.

text → image

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "txt2img",
    "model_name": "stabilityai/stable-diffusion-3.5-large-turbo",
    "input": "..."
  }'

tongyi-mai/z-image-turbo

Z-Image Turbo excels in photorealistic image generation, bilingual text rendering (English & Chinese), and robust instruction adherence.

text → image

Native async API

curl https://api.marigold.run/infer \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_type": "txt2img",
    "model_name": "tongyi-mai/z-image-turbo",
    "input": "..."
  }'

Every model. One API.

Open weights. Private infrastructure.
