Skip to content

AI Provider System

MIFY provides unified access to 140+ AI providers / 2,600+ models / 35 unique endpoints via the LiteLLM gateway, plus 16 first-class native integrations. (LiteLLM adds providers continuously — see the canonical link above for the live count.)

Your Workflow
→ MIFY selects provider based on pack/tier
→ AI SDK adapter normalizes the request
→ Provider receives the call
→ Response normalized back to MIFY format

You write your workflow once. MIFY handles provider differences — authentication, request format, response parsing, error handling, and cost tracking.

Provider packs group models by optimization target:

TierOptimized ForExample Models
FastLow latencyGPT-4o-mini, Claude Haiku, Gemini Flash
BalancedCost/qualityGPT-4o, Claude Sonnet, Gemini Pro
AccurateBest qualityGPT-4, Claude Opus, Gemini Ultra

Select a tier when running workflows, or configure a default.

Run AI models locally with zero API costs:

  1. Install Ollama: curl -fsSL https://ollama.com/install.sh | sh
  2. Pull a model: ollama pull phi3:mini
  3. In MIFY, select Ollama as your provider — no API key needed

Run models at the edge with Cloudflare:

  • Chat (LLaMA, Phi)
  • Embeddings (BGE)
  • Image Generation (Stable Diffusion XL)
  • Vision, Speech Recognition, Text-to-Speech
  • Translation, Classification, Object Detection

MIFY tracks AI costs per workflow run:

  • Token usage per node
  • Cost estimation based on provider pricing
  • Usage dashboard at /settings/usage
  • Admin usage overview at /admin/usage

Add your own API keys for any provider:

  1. Go to Settings → Credentials
  2. Select the provider
  3. Enter your API key
  4. The key is encrypted at rest and used for your workflows only

Admins can route LiteLLM traffic through a Cloudflare AI Gateway for caching, rate-limiting, and analytics — toggled in /admin/gateway.

If the primary provider is down or over-quota, MIFY can fall back to Cloudflare Workers AI automatically — chat, embeddings, and image generation continue working without operator intervention.

  • Three-stage fallback — flat settings (per-workspace toggle) + kill switch (org-wide off) + routing (per-route fallback model selection)
  • Admin emergency disable — admin-only /api/admin/cf-killswitch/* API engages or clears the global kill switch (no dedicated UI page yet — invoked from /admin/gateway or via API)
  • Configurable per workspace — each workspace can opt in/out and pick which CF Workers AI model to use as the fallback target
  • Auto-wired — when a workspace has CF fallback configured (Option Y), ProviderResolver wires it into every LLM call
  • Embedding support — Cloudflare embeddings work as a fallback for RAG templates whose primary embedding provider is unavailable

Per-Workspace Backends (Backend Capability Registry)

Section titled “Per-Workspace Backends (Backend Capability Registry)”

Beyond LLMs, every runtime backend has a per-workspace setting:

CapabilitySetting PageWhat It Picks
LLM/workspaces/[id]/settings/llmWhich provider key/region/model to use
Document Parser/workspaces/[id]/settings/parserLocal vs Unstructured.io sidecar for PDF/DOCX
Browser/workspaces/[id]/settings/browserBrowser automation backend
Sandbox/workspaces/[id]/settings/sandboxRawHost vs Cloudflare Sandbox for code exec

Each capability also has an admin fallback page (/admin/{llm,parser,browser,sandbox}-fallback) for org/global defaults when no workspace setting exists.