AI Models
Gomus AI comes with pre-configured AI models ready to use. Just sign up, pick a model, and start chatting.
How it works
All AI models in Gomus AI are provided as a managed service. Select your preferred model from the dropdown when creating a Chat Assistant or an Agent, and you're ready to go.
- Free plan — Access to 20 fast models hosted on Groq infrastructure.
- Paid plans (Base and above) — Unlock 22 additional models hosted on AWS Bedrock.
Free plan — Groq models
Free users have access to 20 models across four categories:
Chat & Reasoning
| Model | Notes |
|---|---|
| Llama 3.3 70B | High-quality general-purpose |
| Llama 3.1 8B Instant | Ultra-fast responses |
| Llama 4 Maverick | Latest Llama 4 generation |
| Qwen 3 32B | Multilingual |
| Kimi K2 | Advanced reasoning |
| Kimi K2 (0905) | Advanced reasoning (September update) |
| GPT OSS 120B | Large open-source GPT |
| GPT OSS 20B | Compact open-source GPT |
| Allam 2 7B | Arabic-optimized |
| Groq Compound | Agentic model with tool use |
| Groq Compound Mini | Lightweight agentic model |
Vision
| Model | Notes |
|---|---|
| Llama 4 Scout | Multimodal — image + text understanding |
Safety & Content Moderation
| Model | Notes |
|---|---|
| Llama Guard 4 12B | Input/output safety classification |
| Llama Prompt Guard 2 22M | Prompt injection detection (lightweight) |
| Llama Prompt Guard 2 86M | Prompt injection detection |
| Safety GPT OSS 20B | Content safety analysis |
Speech & Audio
| Model | Type |
|---|---|
| Whisper Large v3 | Speech-to-Text |
| Whisper Large v3 Turbo | Speech-to-Text (faster) |
| Orpheus English | Text-to-Speech |
| Orpheus Arabic Saudi | Text-to-Speech |
Groq models cost 1 credit per 1K tokens, which makes them ideal for exploring Gomus AI on the Free plan.
Paid plans — AWS Bedrock models
Upgrading to a paid plan unlocks premium models hosted on AWS Bedrock:
Chat & Reasoning
| Model | Credit cost (per 1K tokens) |
|---|---|
| Claude Opus 4.6 | 55 input / 275 output |
| Claude Opus 4.5 | 55 input / 275 output |
| Claude Sonnet 4.6 | 33 input / 165 output |
| Claude Sonnet 4.5 | 33 input / 165 output |
| Claude Sonnet 4 | 30 input / 150 output |
| Claude 3.7 Sonnet | 30 input / 150 output |
| Claude 3.5 Sonnet | 30 input / 150 output |
| Claude 3 Sonnet | 30 input / 150 output |
| Claude Haiku 4.5 | 11 input / 55 output |
| Claude 3 Haiku | 3 input / 13 output |
| Amazon Nova Pro | 11 input / 42 output |
| Amazon Nova Lite | 1 input / 4 output |
| Amazon Nova Micro | 1 input / 2 output |
| Amazon Nova 2 Lite | 5 input / 36 output |
| Llama 3.2 3B (Bedrock) | 2 input / 2 output |
| Llama 3.2 1B (Bedrock) | 2 input / 2 output |
| Pixtral Large (Mistral) | 20 input / 60 output |
Embedding
| Model | Credit cost (per 1K tokens) |
|---|---|
| Cohere Embed V4 | 2 |
| Cohere Embed Multilingual V3 | 1 |
| Amazon Titan Embed Text V2 | 2 |
Rerank
| Model | Credit cost (per 1K tokens) |
|---|---|
| Cohere Rerank V3.5 | 20 |
Video
| Model | Credit cost (per 1K tokens) |
|---|---|
| TwelveLabs Pegasus V1.2 | 5 input / 75 output |
All Bedrock models require a Base plan or higher.
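To get a feel for how the listed rates differ in practice, the following Python sketch ranks a few Bedrock chat models by estimated cost for the same request. It is an illustration only: `RATES` copies a subset of the Chat & Reasoning rates, `rank_by_cost` is a hypothetical helper, and the calculation assumes credits scale linearly with token count without rounding, which may differ from how billing is actually applied.

```python
# Credits per 1K tokens as (input, output), copied from the Chat & Reasoning table.
RATES = {
    "Claude Opus 4.6": (55, 275),
    "Claude Sonnet 4.6": (33, 165),
    "Claude Haiku 4.5": (11, 55),
    "Amazon Nova Pro": (11, 42),
    "Amazon Nova Micro": (1, 2),
}


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, estimated credits) pairs, cheapest first.

    Assumption: linear pro-rating of the per-1K rates, no rounding.
    """
    costs = {
        name: input_tokens / 1000 * in_rate + output_tokens / 1000 * out_rate
        for name, (in_rate, out_rate) in RATES.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Example request: 4,000 input tokens and 1,000 output tokens.
for name, credits in rank_by_cost(4000, 1000):
    print(f"{name}: {credits:.0f} credits")
```

Under these assumptions the same request ranges from about 6 credits on Amazon Nova Micro to about 495 credits on Claude Opus 4.6.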
Subscription plans
| Plan | Monthly Credits | Price | Models | Knowledge Bases | Docs per KB | Max File Size |
|---|---|---|---|---|---|---|
| Free | 1,000 | Free | 20 Groq models | 2 | 50 | 10 MB |
| Base | 100,000 | $19.90/mo | 20 Groq + 22 Bedrock | 10 | 500 | 50 MB |
| Premium | 250,000 | $49.90/mo | 20 Groq + 22 Bedrock | 20 | Unlimited | 200 MB |
| Business | 750,000 | $149.90/mo | 20 Groq + 22 Bedrock | Unlimited | Unlimited | 500 MB |
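As a rough planning aid, the monthly credit allowances can be turned into an approximate number of calls. The sketch below is not an official calculator: `calls_per_month` is a hypothetical helper, the per-call costs are example figures, and it assumes credits are consumed exactly in proportion to usage.

```python
import math

# Monthly credits per plan, copied from the subscription table.
PLAN_CREDITS = {
    "Free": 1_000,
    "Base": 100_000,
    "Premium": 250_000,
    "Business": 750_000,
}


def calls_per_month(plan: str, credits_per_call: float) -> int:
    """Whole number of calls a plan's monthly credits would cover."""
    if credits_per_call <= 0:
        raise ValueError("credits_per_call must be positive")
    return math.floor(PLAN_CREDITS[plan] / credits_per_call)


# A Groq call using 2,500 tokens at 1 credit per 1K tokens costs about 2.5 credits.
print(calls_per_month("Free", 2.5))   # 400
# An example Bedrock call assumed to cost about 150 credits.
print(calls_per_month("Base", 150))   # 666
```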
How credits work
- Each model has a credit cost per 1K tokens. Where a model lists two rates, input and output tokens are priced separately.
- Credits are deducted automatically after each AI call.
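The deduction for a single call can be estimated with simple arithmetic. The sketch below is a minimal Python illustration, not Gomus AI's billing code: `estimate_credits` is a hypothetical helper, and it assumes cost scales linearly with token count with no rounding or minimum charge, which this page does not specify.

```python
def estimate_credits(
    input_tokens: int,
    output_tokens: int,
    input_rate: float,
    output_rate: float,
) -> float:
    """Estimate credits for one call; rates are credits per 1K tokens.

    Assumption: linear pro-rating, no rounding or minimum charge.
    """
    return (input_tokens / 1000) * input_rate + (output_tokens / 1000) * output_rate


# Claude Sonnet 4.5 is listed at 33 input / 165 output credits per 1K tokens.
cost = estimate_credits(2000, 500, input_rate=33, output_rate=165)
print(cost)  # 148.5
```

In this example, output tokens account for more than half of the cost even though they are a quarter of the input volume, so capping response length is the more effective way to save credits on models with a high output rate.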
Model selection
When creating a Chat Assistant or an Agent, select which model to use from the model dropdown.
If you need a specific model not currently available, contact us at [email protected].