Open-weight models

Qwen. Mistral. MMS-TTS. Hosted.

Open-weight HuggingFace models via a typed async API. Ready on demand, flat monthly pricing.

The hosted model registry covers instruct, embedding, text-to-speech, image-to-text, depth estimation, segmentation, and eval model types. One API, one auth token, no per-token billing regardless of how many times you call any model.

Instruct (text generation)

Instruction-following and chat completion. OpenAI-compatible endpoint available. Qwen2.5 covers 1.5B, 7B, and 14B parameter variants.

qwen/qwen2.5-1.5b-instruct
qwen/qwen2.5-7b-instruct
qwen/qwen2.5-14b-instruct
mistralai/mistral-7b-instruct-v0.3
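Because the instruct endpoint is OpenAI-compatible, the same request can be built from Python. A minimal sketch: the payload follows the standard chat-completions schema, and the `chat` helper is illustrative, not part of the service itself.

```python
API = "https://api.marigold.run/v1/chat/completions"

def build_chat_payload(prompt: str, model: str = "qwen/qwen2.5-7b-instruct") -> dict:
    """OpenAI-style chat-completions body; swap `model` for any registry instruct ID."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat(prompt: str, api_key: str, model: str = "qwen/qwen2.5-7b-instruct") -> str:
    """Send one chat turn and return the assistant reply (helper is illustrative)."""
    import requests  # same third-party dependency the TTS example uses

    resp = requests.post(
        API,
        headers={"Authorization": f"Bearer {api_key}"},
        json=build_chat_payload(prompt, model),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```

Swapping in `qwen/qwen2.5-14b-instruct` or `mistralai/mistral-7b-instruct-v0.3` changes nothing else about the request.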

Text and image embedding

Dense vector representations for search, clustering, and retrieval. CLIP serves both image and text embedding, enabling cross-modal similarity in a single pipeline.

sentence-transformers/all-MiniLM-L6-v2
openai/clip-vit-large-patch14
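Once a text vector and an image vector come back from the CLIP model, cross-modal similarity is a plain cosine between them. A dependency-free sketch of that comparison (the vectors themselves would come from the embedding endpoint):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# With CLIP, the text vector for "a photo of a dog" and the image vector for
# a dog photo land close together; unrelated pairs score near zero.
```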

Image to text

Vision-language model for captioning, structured description, and visual question answering. Accepts image references as input.

google/paligemma-3b-pt-224
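A sketch of what a captioning request might look like through the native async API, by analogy with the TTS example further down. The `image` field as a storage reference, and the exact field names, are assumptions rather than confirmed API:

```python
def build_caption_request(image_ref: str,
                          prompt: str = "Describe this image.") -> dict:
    # Field names modeled on the /infer TTS example; passing the image as a
    # storage reference is an assumption, consistent with the note that binary
    # data travels by reference rather than inline.
    return {
        "model_type": "image-to-text",
        "model_name": "google/paligemma-3b-pt-224",
        "image": image_ref,
        "text": prompt,
    }
```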

Text to speech

Language-specific synthesis via the Facebook MMS family. Welsh (mms-tts-cym) is a capability most commercial TTS services lack.

facebook/mms-tts-eng
facebook/mms-tts-cym
facebook/mms-tts-fra
facebook/mms-tts-deu
facebook/mms-tts-spa
facebook/mms-tts-fin
facebook/mms-tts-nld

Depth and segmentation

Monocular depth estimation and image segmentation for vision pipelines. Used in change detection, conformance checking, and spatial analysis.

depth-estimation
img2mask

Evaluation models

Text, image, and cross-modal scoring for pipeline quality assessment. Run any model against a labelled dataset using the same API.

text-eval
image-eval
image-text-eval
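By analogy with the other async requests, an eval job might be submitted like this. The field names, and `dataset` as a storage reference to the labelled set, are assumptions, not confirmed API:

```python
def build_eval_request(model_name: str, dataset_ref: str,
                       eval_type: str = "text-eval") -> dict:
    # Illustrative payload shape, modeled on the /infer TTS example; the
    # "dataset" storage-reference field is an assumption.
    return {
        "model_type": eval_type,
        "model_name": model_name,
        "dataset": dataset_ref,
    }
```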

The same API across all model types

Submit via the OpenAI-compatible endpoint for instruct models, or via the native async API for any model type. Binary outputs are returned as storage references, not inlined in the response.

Qwen instruct via OpenAI-compatible endpoint

curl https://api.marigold.run/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen2.5-7b-instruct",
    "messages": [{"role": "user", "content": "Translate to Welsh: Good morning."}]
  }'

TTS via native async API

import requests, time

API = "https://api.marigold.run"
KEY = {"Authorization": "Bearer your-api-key"}

job = requests.post(f"{API}/infer", headers=KEY, json={
    "model_type": "tts",
    "model_name": "facebook/mms-tts-cym",
    "text": "Bore da, sut ydych chi?"
}).json()

while True:
    r = requests.get(f"{API}/infer/{job['job_id']}", headers=KEY).json()
    if r["status"] == "complete":
        print(r["output"])   # storage reference to audio file
        break
    if r["status"] == "failed":   # terminal error status; name assumed
        raise RuntimeError(f"inference failed: {r}")
    time.sleep(0.5)
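The polling loop above generalizes to any model type. A small reusable helper, assuming the same `job_id`/`status`/`output` fields:

```python
import time

def wait_for_job(fetch_status, job_id: str, poll_interval: float = 0.5,
                 timeout: float = 120.0):
    """Poll until a job completes, returning its output reference.

    `fetch_status` maps a job_id to the parsed JSON of GET /infer/{job_id};
    injecting it keeps the helper decoupled from any HTTP client.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(job_id)
        if status["status"] == "complete":
            return status["output"]
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still pending after {timeout}s")

# Usage with the requests session from the example above:
# output = wait_for_job(
#     lambda jid: requests.get(f"{API}/infer/{jid}", headers=KEY).json(),
#     job["job_id"],
# )
```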

Request a model not in the registry

Any HuggingFace model compatible with the standard handler types can be onboarded. Custom model onboarding is available on the Pro tier.

Send the HuggingFace model ID and intended use case. If it fits an existing handler type, onboarding is typically a day's work.

Request a model

Open weights. Private infrastructure.

Leave your email to be notified when access opens. Mention specific models and we will confirm registry availability.

Join the waitlist

No spam. One email when access opens.
