Multimodal Agents

AI agents that process text, images, audio, and video · 48 agents

Top Score: 15
Avg Score: 3.1565656565656562e+69
Verified: 0 of 48
Free / Freemium: 30

by Meta AI

Meta's latest open-source LLM family. Maverick (400B MoE) rivals GPT-5. Scout (17B) runs on consumer hardware. Multimodal with vision. Fully open weights.

Multimodal Agents · 50

by DeepSeek

DeepSeek V3.2 — 685B MoE open-source frontier model. Matches GPT-5 and Claude 4.5 on most benchmarks at near-zero inference cost. Freely downloadable.

Multimodal Agents · 50

by Alibaba Cloud / Tongyi

Alibaba's flagship open-source LLM. 235B MoE (22B active). Multilingual, strong on coding and math. Qwen3-Coder variant matches Claude Code on HumanEval.

Multimodal Agents · 50

by Google DeepMind

Google's open-weight model family for on-device and research use. 2B to 27B parameters. Runs on laptops, phones, and edge devices. Strong safety tuning.

Multimodal Agents · 50

by Meta AI

Meta AI powered by Llama 4. Built into WhatsApp, Instagram, Facebook, and Messenger for 3B+ users. Web search, image generation, and real-time answers.

Multimodal Agents · Free · 50

by Hugging Face

Hugging Face's open-source chat UI for any model. Access Llama 4, DeepSeek, Mistral, Gemma, and 100+ open-weight models. Free, no API key required.

Multimodal Agents · Free · 50

Stable Diffusion / FLUX

by Stability AI / Black Forest Labs

FLUX 1.1 Pro Ultra by Black Forest Labs — current state of the art in open-source image generation. Photorealistic, fast, commercially licensable. 100M+ imag...

Multimodal Agents · Freemium · 50

by Google DeepMind

Google DeepMind's multimodal AI assistant. Gemini 2.5 Pro with native thinking, 1M token context, and tight integration across Google Workspace, Android, and Search.

Multimodal Agents · Freemium · 46

by Anthropic

Anthropic's AI assistant powered by Claude Opus 4.6 and Sonnet 4.6. Extended thinking, 200K context, and 300K output via Batches API. Strong in coding, analysis, and nuanced reasoning.

Multimodal Agents · Freemium · 46

by OpenAI

OpenAI's flagship AI assistant powered by GPT-5 and GPT-5.2 Thinking. Unified system with intelligent routing between fast responses and deep reasoning. The most widely used AI chatbot globally.

Multimodal Agents · Freemium · 46

by Leonardo AI

AI creative suite with 150M+ users. Fine-tuned models for gaming assets, product images, and social media. Real-time canvas, video gen, and 3D asset pipeline.

Multimodal Agents · Freemium · 63

by Google DeepMind

Google DeepMind's latest video generation model. Veo 3.1 creates 4K video with native audio — ambient sounds, dialogue, music — all from a single prompt.

Multimodal Agents · Paid · 58

by Ideogram

Best-in-class AI image generator for text rendering. Ideogram v3 produces accurate, beautiful typography in images — a longstanding AI limitation now solved.

Multimodal Agents · Freemium · 58

by Adobe

Adobe's commercially safe generative AI. Trained on licensed content — zero copyright risk. Integrated into Photoshop, Illustrator, Premiere Pro, and Express.

Multimodal Agents · Freemium · 58

by Kuaishou Technology

Kuaishou's Kling 3.0 — top-ranked AI video generator on LogRocket. Cinematic quality, superior character consistency, and affordable pricing vs Runway.

Multimodal Agents · Freemium · 56

Luma Dream Machine

by Luma AI

Luma AI's video generation model. Photorealistic, physically accurate 5-second clips from text or images. Used by Hollywood VFX studios.

Multimodal Agents · Freemium · 56

by HeyGen

AI video platform for creating talking-avatar videos. Used by 500K+ businesses for training, marketing, and product videos. 175+ AI avatars, 40+ languages.

Multimodal Agents · Paid · 49

by Synthesia

AI video generation platform with human avatars. Create training, marketing, and onboarding videos in 140+ languages without cameras or studios.

Multimodal Agents · Paid · 48

by OpenAI

OpenAI's second-generation video model. Cinema-quality 1080p video up to 60 seconds from text, image, or video. Physics simulation, precise camera control.

Multimodal Agents · Paid · 47

by Midjourney Inc

The leading AI image generator for artistic and commercial work. V7 introduces consistent characters, style references, and improved photorealism. 25M+ users.

Multimodal Agents · Paid · 45

by Runway AI

Hollywood-grade AI video generation. Gen-4 Turbo produces 4K video clips with reference-consistent characters. Used by major studios and content creators.

Multimodal Agents · Freemium · 41

by Quora

Multi-model AI chat by Quora. One subscription accesses Claude, GPT-5, Gemini, Llama 4, and 100+ models. Create and monetize custom bots.

Multimodal Agents · Freemium · 35

Mistral Le Chat

by Mistral AI

Mistral AI's chat powered by Mistral Large 3. Ultra-fast, multilingual, canvas mode, web search, and document analysis. Europe's leading LLM company.

Multimodal Agents · Freemium · 20

by xAI

xAI's AI powered by Grok 4 — four AI agents running in parallel. Real-time X/Twitter data, Aurora image gen, video understanding, and deep reasoning.

Multimodal Agents · Freemium · 16

Inception: Mercury 2

by inception

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and...

Multimodal Agents · Usage · 95

OpenAI: GPT 5.4

by openai

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K ou...

Multimodal Agents · Usage · 95

Google: Lyria 3 Clip Preview

by google

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3,...

Multimodal Agents · Free · 77

Google: Gemma 4 31B (free)

by google

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window...

Multimodal Agents · Free · 77

Google: Gemma 4 26B A4B (free)

by google

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token ...

Multimodal Agents · Free · 77

Google: Gemma 4 26B A4B

by google

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token ...

Multimodal Agents · Usage · 71

Anthropic: Claude Opus Latest

by ~anthropic

This model always redirects to the latest model in the Claude Opus family.

Multimodal Agents · Usage · 71

OpenAI: GPT 5.3 Chat

by openai

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more a...

Multimodal Agents · Usage · 63

OpenAI: GPT 5.4 Pro

by openai

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. ...

Multimodal Agents · Usage · 63

Google: Lyria 3 Pro Preview

by google

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you ca...

Multimodal Agents · Free · 60

Google: Gemma 4 31B

by google

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window...

Multimodal Agents · Usage · 60

Mistral: Mistral Small 4

by mistralai

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system. It ...

Multimodal Agents · Usage · 57

by GitHub Actions

Formula WorkPaper runtime for Node.js services and agent tools with JSON persistence and formula readback.

Multimodal Agents · Free · 52

by isdk

AI Agent Script is a framework for defining AI Agents, their properties, and behaviors for interactive conversations. This document provides an overview of t...

Multimodal Agents · Free · 52

by wfmedia

Cognitive browser automation that thinks like your users—and helps AI agents navigate too. Simulate real user cognition with abandonment detection, constitut...

Multimodal Agents · Free · 52

by djtony707

TITAN — Autonomous AI agent framework with self-improvement, multi-agent orchestration, 36 LLM providers, 16 channel adapters, GPU VRAM management, mesh netw...

Multimodal Agents · Free · 52

by GitHub Actions

MCP server + Excalidraw whiteboard UI for AI-assisted diagramming (Claude Code / Codex).

Multimodal Agents · Free · 52

by GitHub Actions

LangChain.js adapters for Model Context Protocol (MCP)

Multimodal Agents · Free · 52

by mcpcat

Analytics tool for MCP (Model Context Protocol) servers - tracks tool usage patterns and provides insights

Multimodal Agents · Free · 52

by ruvnet

Production-ready AI agent orchestration platform with 66 specialized agents, 213 MCP tools, ReasoningBank learning memory, and autonomous multi-agent swarms....

Multimodal Agents · Free · 52

by GitHub Actions

GitLab MCP server for projects, merge requests, issues, pipelines, wiki, releases, and more

Multimodal Agents · Free · 52

by x-ai

Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, ...

Multimodal Agents · Usage · 52

by alvbln

Alvin Bot — open-source, self-hosted autonomous AI agent on Telegram, Slack, Discord, WhatsApp, Signal, terminal & web. Built on the Claude Agent SDK with a ...

Multimodal Agents · Free · 50

by jkheadley

Persistent autonomy infrastructure for AI agents

Multimodal Agents · Free · 50
