# LLM Providers
Sentimatrix supports 19 LLM providers for enhanced sentiment analysis, summarization, and insight generation. All providers are fully implemented with streaming, fallback chains, and health monitoring.
## Provider Categories
- :material-cloud: **Cloud Providers**: managed API services with enterprise reliability (OpenAI, Anthropic, Google, Mistral, Cohere, Groq)
- :material-server: **Inference Providers**: cost-effective hosting for open-source models (Together, Fireworks, OpenRouter, Cerebras, DeepSeek)
- :material-desktop-tower: **Local Providers**: run models locally for privacy and cost savings (Ollama, LM Studio, vLLM, llama.cpp, ExLlamaV2, Text-Gen-WebUI)
- :material-office-building: **Enterprise**: enterprise-grade deployments with compliance (Azure OpenAI, AWS Bedrock)
## Quick Comparison
| Provider | Best For | Pricing | Streaming | Vision |
|---|---|---|---|---|
| OpenAI | Quality & features | $$$ | ✅ | ✅ |
| Anthropic | Safety & reasoning | $$$ | ✅ | ✅ |
| Google | Multimodal & long context | $$ | ✅ | ✅ |
| Groq | Speed (free tier) | Free-$ | ✅ | |
| Together | OSS models | $ | ✅ | |
| DeepSeek | Value | $ | ✅ | |
| Ollama | Local/privacy | Free | ✅ | |
## Quick Start

### Cloud Provider (Groq - Free)

```python
from sentimatrix import Sentimatrix
from sentimatrix.config import SentimatrixConfig, LLMConfig

config = SentimatrixConfig(
    llm=LLMConfig(
        provider="groq",
        api_key="gsk_...",  # or use GROQ_API_KEY env var
        model="llama-3.3-70b-versatile",
    )
)

async with Sentimatrix(config) as sm:
    summary = await sm.summarize_reviews(reviews)
```
### Local Provider (Ollama)

```python
config = SentimatrixConfig(
    llm=LLMConfig(
        provider="ollama",
        base_url="http://localhost:11434",
        model="llama3.2",
    )
)
```
## Feature Matrix

| Provider | Streaming | Functions | JSON Mode | Embeddings | Vision |
|---|---|---|---|---|---|
| OpenAI | ✅ | | | | |
| Anthropic | ✅ | | | | |
| Google | ✅ | | | | |
| Groq | ✅ | | | | |
| Mistral | ✅ | | | | |
| Cohere | ✅ | | | | |
| Together | ✅ | | | | |
| Fireworks | ✅ | | | | |
| OpenRouter | ✅ | | | | |
| Cerebras | ✅ | | | | |
| DeepSeek | ✅ | | | | |
| Ollama | ✅ | | | | |
| LM Studio | ✅ | | | | |
| vLLM | ✅ | | | | |
| llama.cpp | ✅ | | | | |
| ExLlamaV2 | ✅ | | | | |
| Text-Gen-WebUI | ✅ | | | | |
| Azure OpenAI | ✅ | | | | |
| AWS Bedrock | ✅ | | | | |
## Pricing Comparison
Approximate costs per 1M tokens:
| Provider | Model | Input | Output |
|---|---|---|---|
| Groq | LLaMA 3.3 70B | Free | Free |
| Google | Gemini 1.5 Flash | Free | Free |
| DeepSeek | DeepSeek V3 | $0.07 | $0.27 |
| Together | LLaMA 3.1 70B | $0.88 | $0.88 |
| Fireworks | LLaMA 3.1 70B | $0.90 | $0.90 |
| OpenAI | GPT-4o-mini | $0.15 | $0.60 |
| Mistral | Mistral Large | $2.00 | $6.00 |
| OpenAI | GPT-4o | $2.50 | $10.00 |
| Anthropic | Claude 3.5 Sonnet | $3.00 | $15.00 |
| Anthropic | Claude 3 Opus | $15.00 | $75.00 |
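Token costs are simple per-million-token arithmetic. As a sanity check on the rates above, here is a throwaway helper (the token volumes are made-up example numbers, and `monthly_cost` is not a Sentimatrix function):

```python
def monthly_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD given token volumes and per-1M-token rates."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# Example: 50M input + 10M output tokens on GPT-4o-mini ($0.15 / $0.60 per 1M):
print(f"${monthly_cost(50_000_000, 10_000_000, 0.15, 0.60):.2f}")  # $13.50
```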
## Selection Guide
### By Use Case
| Use Case | Recommended | Why |
|---|---|---|
| Getting Started | Groq | Free tier, fast, good quality |
| Production | OpenAI / Anthropic | Reliability, support |
| Cost Sensitive | DeepSeek, Together | Low cost, good quality |
| Privacy Required | Ollama, vLLM | Local execution |
| Enterprise | Azure OpenAI, Bedrock | Compliance, SLAs |
| Best Quality | Claude 3.5 Sonnet, GPT-4o | State-of-the-art |
| Speed Critical | Groq, Cerebras | Ultra-low latency |
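If you want these defaults in code, the table can be expressed as a plain lookup. This is purely illustrative — `RECOMMENDED` and `pick_provider` are not part of the Sentimatrix API:

```python
# The recommendations above as a dict, keyed by use case.
RECOMMENDED = {
    "getting_started": ["groq"],
    "production": ["openai", "anthropic"],
    "cost_sensitive": ["deepseek", "together"],
    "privacy": ["ollama", "vllm"],
    "enterprise": ["azure_openai", "bedrock"],
    "speed": ["groq", "cerebras"],
}

def pick_provider(use_case: str) -> str:
    """First recommended provider name for a use case."""
    return RECOMMENDED[use_case][0]

print(pick_provider("privacy"))  # ollama
```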
### By Budget
- Groq - Free tier with generous limits
- Google Gemini - Free tier available
- Ollama - Free (local hardware required)
- DeepSeek - Very low cost
- Together AI - $0.88/1M tokens
- Fireworks - $0.90/1M tokens
- OpenAI GPT-4o-mini - $0.15-0.60/1M tokens
- OpenAI GPT-4o - Best balance of quality/cost
- Anthropic Claude 3.5 Sonnet - Best for reasoning
- Google Gemini 1.5 Pro - Best for long context
## Configuration
### Environment Variables
```bash
# Cloud providers
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."
export GROQ_API_KEY="gsk_..."
export MISTRAL_API_KEY="..."
export COHERE_API_KEY="..."

# Inference providers
export TOGETHER_API_KEY="..."
export FIREWORKS_API_KEY="..."
export OPENROUTER_API_KEY="..."
export DEEPSEEK_API_KEY="..."

# Local providers
export OLLAMA_HOST="http://localhost:11434"
export LMSTUDIO_HOST="http://localhost:1234"
```
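The Quick Start comment above notes that an explicit `api_key` wins, with the provider's environment variable as the fallback. A minimal sketch of that convention — the `ENV_VARS` mapping and `resolve_api_key` helper are hypothetical, not Sentimatrix internals:

```python
import os

# Explicit key wins; otherwise consult the provider's env var.
ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
}

def resolve_api_key(provider, api_key=None):
    return api_key or os.environ.get(ENV_VARS.get(provider, ""))

os.environ["GROQ_API_KEY"] = "gsk_example"   # demo value only
print(resolve_api_key("groq"))               # gsk_example
print(resolve_api_key("groq", "explicit"))   # explicit
```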
### YAML Configuration

```yaml title="sentimatrix.yaml"
llm:
  provider: groq
  model: llama-3.3-70b-versatile

  # Optional settings
  temperature: 0.7
  max_tokens: 4096
  timeout: 30

  # Fallback providers
  fallback:
    - provider: together
      model: meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
    - provider: ollama
      model: llama3.2
```
## Provider Documentation
Detailed documentation for each provider:
### Cloud Providers
- OpenAI - GPT-4o, GPT-4o-mini, o1
- Anthropic - Claude 3.5 Sonnet, Claude 3
- Google Gemini - Gemini 2.0 Flash, 1.5 Pro
- Mistral - Mistral 7B, 8x7B, Large
- Cohere - Command R, Command R+
- Groq - LLaMA 3.3, Mixtral
### Inference Providers
- Together AI - 200+ models
- Fireworks - Fast OSS inference
- OpenRouter - Model aggregator
- Cerebras - Ultra-fast inference
- DeepSeek - DeepSeek V3, R1
### Local Providers
- Ollama - Local LLM server
- LM Studio - Desktop GUI
- vLLM - Production server
- llama.cpp - Portable inference
- ExLlamaV2 - Quantized models
- Text-Gen-WebUI - Web-based UI
### Enterprise
- Azure OpenAI - Microsoft Azure
- AWS Bedrock - Amazon Web Services
## Provider Manager Features
Sentimatrix includes a sophisticated provider manager with:
- Fallback Chains: Automatically switch to backup providers on failure
- Health Monitoring: Track provider availability and response times
- Rate Limit Handling: Automatic backoff and retry on rate limits
- Load Balancing: Distribute requests across multiple providers
- Lazy Loading: Providers initialized only when needed
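A fallback chain with backoff of this kind can be sketched in a few lines. The names below (`complete_with_fallback`, `ProviderError`, the stub providers) are invented for illustration — Sentimatrix's actual provider manager is internal:

```python
import time

class ProviderError(Exception):
    """Raised on provider failure (rate limit, timeout, outage)."""

def complete_with_fallback(prompt, providers, retries=3, base_delay=0.01):
    """Try each provider in order; retry with exponential backoff, then fall back."""
    for provider in providers:                       # configured priority order
        for attempt in range(retries):
            try:
                return provider(prompt)              # first success wins
            except ProviderError:
                time.sleep(base_delay * 2 ** attempt)  # back off before retrying
    raise ProviderError("all providers in the chain failed")

def flaky(prompt):       # stands in for a rate-limited primary provider
    raise ProviderError("429 rate limited")

def backup(prompt):      # stands in for a healthy fallback provider
    return f"summary of: {prompt}"

print(complete_with_fallback("reviews", [flaky, backup]))  # summary of: reviews
```

A real implementation would also track per-provider health and skip providers that are known to be down, rather than retrying them on every call.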