This guide provides comprehensive information about configuring and using large language models (LLMs) in the SmythOS SDK.
SmythOS supports a wide range of language model providers, from cloud-based services like OpenAI and Anthropic to locally-hosted models via Ollama. The SDK provides flexible configuration options to suit different use cases, from simple model selection to fine-grained parameter control.
## Configuration Methods

There are three main ways to configure models in SmythOS:

### 1. Simple String Notation

The quickest way to get started is to specify the model name as a string:
```typescript
import { Agent } from '@smythos/sdk';

const agent = new Agent({
    name: 'My Agent',
    behavior: 'You are a helpful assistant',
    model: 'gpt-4o', // Simple and straightforward
});
```
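If you want to sanity-check the configuration, you can send the agent a quick prompt. This sketch assumes the `agent.prompt()` call from the SDK's quickstart examples:

```typescript
// Send a one-off prompt and print the model's reply
const reply = await agent.prompt('Say hello in one sentence.');
console.log(reply);
```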
**Benefits:**
- Minimal configuration
- The provider is resolved automatically for well-known model names
- Sensible default parameters

**Common Model Names:**
- **OpenAI:** `gpt-4o`, `gpt-4-turbo`, `gpt-4`, `gpt-3.5-turbo`
- **Anthropic:** `claude-4-sonnet`, `claude-3.5-sonnet`, `claude-3-opus`, `claude-3-sonnet`, `claude-3-haiku`
- **Google:** `gemini-pro`, `gemini-1.5-pro`, `gemini-1.5-flash`
- **DeepSeek:** `deepseek-chat`, `deepseek-coder`

### 2. Provider-Specific Builder

When you need to specify a provider explicitly or use a model not in the default list:
```typescript
import { Model } from '@smythos/sdk';

const agent = new Agent({
    name: 'My Agent',
    behavior: 'You are a helpful assistant',
    model: Model.OpenAI('gpt-4o'),
});
```
**Benefits:**
- Explicit provider selection, with no reliance on name-based inference
- Works with models that are not in the default list

### 3. Full Parameter Control

For full control over model behavior and parameters:
```typescript
import { Model } from '@smythos/sdk';

const agent = new Agent({
    name: 'My Agent',
    behavior: 'You are a helpful assistant',
    model: Model.OpenAI('gpt-4o', {
        temperature: 0.7, // Control randomness (0.0 - 2.0)
        maxTokens: 2000, // Maximum response length
        topP: 0.9, // Nucleus sampling parameter
        frequencyPenalty: 0.0, // Reduce repetition (0.0 - 2.0)
        presencePenalty: 0.0, // Encourage topic diversity (0.0 - 2.0)
    }),
});
```
**Benefits:**
- Fine-grained control over generation behavior (temperature, token limits, sampling)
- Settings are explicit and visible at the call site

## Supported Providers

SmythOS supports the following LLM providers:

### OpenAI

Access GPT-4, GPT-3.5, and other OpenAI models:
```typescript
model: Model.OpenAI('gpt-4o', {
    temperature: 0.7,
    maxTokens: 4000,
    topP: 1.0,
    frequencyPenalty: 0.0,
    presencePenalty: 0.0,
});
```
**Popular Models:**
- `gpt-4o` - Latest optimized GPT-4 (multimodal)
- `gpt-4-turbo` - Fast GPT-4 variant
- `gpt-4` - Standard GPT-4
- `gpt-3.5-turbo` - Fast and cost-effective

**API Key:** Store in vault as `"openai": "sk-..."` (see API Key Management below).

### Anthropic

Access Claude models from Anthropic:
```typescript
model: Model.Anthropic('claude-4-sonnet', {
    temperature: 1.0,
    maxTokens: 8192,
    topP: 0.9,
});
```
**Popular Models:**
- `claude-4-sonnet` - Latest Claude model
- `claude-3.5-sonnet` - Balanced performance
- `claude-3-opus` - Most capable model
- `claude-3-sonnet` - Fast and balanced
- `claude-3-haiku` - Fastest, most cost-effective

**API Key:** Store in vault as `"anthropic": "sk-ant-..."` (see API Key Management below).

### Google AI

Access Google's Gemini models:
```typescript
model: Model.GoogleAI('gemini-1.5-pro', {
    temperature: 0.8,
    maxTokens: 2048,
    topP: 0.95,
});
```
**Popular Models:**
- `gemini-1.5-pro` - Advanced reasoning and long context
- `gemini-1.5-flash` - Fast and efficient
- `gemini-pro` - Standard model

**API Key:** Store in vault as `"googleai": "..."` (see API Key Management below).

### Groq

Ultra-fast inference with Groq's LPU™ technology:
```typescript
model: Model.Groq('llama-3.1-70b-versatile', {
    temperature: 0.5,
    maxTokens: 1024,
});
```
**Popular Models:**
- `llama-3.1-70b-versatile` - Meta's Llama 3.1 (70B)
- `llama-3.1-8b-instant` - Smaller, faster Llama 3.1
- `mixtral-8x7b-32768` - Mixtral MoE model

**API Key:** Store in vault as `"groq": "gsk_..."` (see API Key Management below).

### Ollama

Run models locally with Ollama:
```typescript
model: Model.Ollama('llama3.2', {
    temperature: 0.7,
    numCtx: 4096, // Context window size
});
```
**Setup:**

```bash
ollama pull llama3.2
```

**Popular Models:**
- `llama3.2` - Latest Llama model
- `mistral` - Mistral 7B
- `codellama` - Code-specialized Llama
- `phi3` - Microsoft's Phi-3

**Configuration:** No API key is needed for local Ollama. Configure the host URL in the SRE config if you are not using the default `http://localhost:11434`.
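The common-parameters table later in this guide lists `baseURL` as a widely supported model setting. Assuming the Ollama connector honors it (verify against your SDK version), pointing at a non-default Ollama host might look like this; the host address below is a placeholder:

```typescript
// Assumption: the Ollama connector accepts baseURL like other providers
model: Model.Ollama('llama3.2', {
    temperature: 0.7,
    numCtx: 4096,
    baseURL: 'http://192.168.1.50:11434', // placeholder host address
});
```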
### DeepSeek

Access DeepSeek models:
```typescript
model: Model.DeepSeek('deepseek-chat', {
    temperature: 0.7,
    maxTokens: 2000,
});
```
**Popular Models:**
- `deepseek-chat` - General conversation model
- `deepseek-coder` - Specialized for coding tasks

**API Key:** Store in vault as `"deepseek": "..."` (see API Key Management below).

### TogetherAI

Access various open-source models via TogetherAI:
```typescript
model: Model.TogetherAI('meta-llama/Llama-3-70b-chat-hf', {
    temperature: 0.7,
    maxTokens: 2000,
});
```
**API Key:** Store in vault as `"togetherai": "..."` (see API Key Management below).

### xAI

Access xAI's Grok models:

```typescript
model: Model.xAI('grok-beta', {
    temperature: 0.7,
    maxTokens: 2000,
});
```
**API Key:** Store in vault as `"xai": "..."` (see API Key Management below).

### Perplexity

Access Perplexity's Sonar models:

```typescript
model: Model.Perplexity('llama-3.1-sonar-large-128k-online', {
    temperature: 0.7,
    maxTokens: 2000,
});
```
**API Key:** Store in vault as `"perplexity": "..."` (see API Key Management below).

## Common Model Parameters

Most providers support these common parameters. Here's a quick reference table:
| Parameter | Type | Range/Format | Description | Supported By |
|---|---|---|---|---|
| `temperature` | number | 0.0 - 2.0 | Controls randomness in responses. Lower = more deterministic, higher = more creative | All providers |
| `maxTokens` | number | 1 - model limit | Maximum number of tokens in the response | All providers |
| `topP` | number | 0.0 - 1.0 | Nucleus sampling parameter. Controls diversity via cumulative probability | Most providers |
| `topK` | number | 0 - ∞ | Limits token selection to the top K most likely tokens. 0 = disabled | Ollama, some providers |
| `frequencyPenalty` | number | 0.0 - 2.0 | Reduces repetition of token sequences | OpenAI, compatible providers |
| `presencePenalty` | number | 0.0 - 2.0 | Encourages talking about new topics | OpenAI, compatible providers |
| `stopSequences` | string[] | Array of strings | Sequences where the model will stop generating | Most providers |
| `inputTokens` | number | 1 - model limit | Maximum context window size (input tokens). Should be ≤ the model's official context window | All providers |
| `outputTokens` | number | 1 - model limit | Maximum tokens the model can generate in a single response | All providers |
| `maxThinkingTokens` | number | 1 - model limit | Maximum tokens for reasoning/thinking (reasoning models only) | OpenAI o1, compatible models |
| `baseURL` | string | Valid URL | Custom API endpoint URL for model inference | Most providers |
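To make the table concrete, here is a sketch that combines several of these parameters in one configuration. The parameter names come from the table above; the specific values are illustrative only, and `stopSequences` support varies by provider:

```typescript
// Illustrative combination of common parameters from the table above
model: Model.OpenAI('gpt-4o', {
    temperature: 0.4,             // fairly deterministic output
    maxTokens: 1500,              // cap response length
    topP: 0.9,                    // nucleus sampling
    frequencyPenalty: 0.3,        // discourage verbatim repetition
    presencePenalty: 0.2,         // nudge toward new topics
    stopSequences: ['\n\nUser:'], // stop before generating a new user turn
});
```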
### temperature

Controls randomness in responses (typically 0.0 - 2.0):
- `0.0` - Deterministic, focused responses
- `0.5` - Balanced creativity and consistency
- `1.0` - More creative and varied
- `2.0` - Highly random and creative

```typescript
model: Model.OpenAI('gpt-4o', { temperature: 0.7 });
```
### maxTokens

Maximum number of tokens in the response:

```typescript
model: Model.OpenAI('gpt-4o', { maxTokens: 2000 });
```
**Note:** Tokens are not the same as words. Roughly 1 token ≈ 0.75 words.
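Using that rough ratio, a back-of-the-envelope helper can translate a word budget into a `maxTokens` value. This is a heuristic sketch only; real token counts depend on the provider's tokenizer:

```typescript
// Heuristic only: ~0.75 words per token, so tokens ≈ words / 0.75
function estimateMaxTokens(targetWords: number): number {
    return Math.ceil(targetWords / 0.75);
}

estimateMaxTokens(300); // ≈ 400 tokens for a ~300-word response
```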
### topP

Nucleus sampling parameter (0.0 - 1.0):

```typescript
model: Model.Anthropic('claude-3-sonnet', { topP: 0.9 });
```
### frequencyPenalty

Reduces repetition of token sequences (0.0 - 2.0):

```typescript
model: Model.OpenAI('gpt-4o', { frequencyPenalty: 0.5 });
```
### presencePenalty

Encourages talking about new topics (0.0 - 2.0):

```typescript
model: Model.OpenAI('gpt-4o', { presencePenalty: 0.5 });
```

## API Key Management

SmythOS stores API keys securely in a Vault system. By default, the SDK uses a JSON file-based vault located at `.smyth/vault.json`.
### Default JSON Vault

The SDK automatically initializes with a JSON vault at `.smyth/vault.json`. This file stores API keys for different providers:
```json
{
    "default": {
        "openai": "sk-...",
        "anthropic": "sk-ant-...",
        "googleai": "...",
        "groq": "gsk_...",
        "deepseek": "...",
        "togetherai": "...",
        "xai": "...",
        "perplexity": "..."
    }
}
```
**Location Options:**
- `.smyth/vault.json` in your project directory (recommended)
- `~/.smyth/vault.json` in your home directory (applies to all projects)

### Environment Variables in the Vault

You can reference environment variables within the vault file using the `$env()` syntax:
```json
{
    "default": {
        "openai": "$env(OPENAI_API_KEY)",
        "anthropic": "$env(ANTHROPIC_API_KEY)",
        "googleai": "$env(GOOGLE_API_KEY)"
    }
}
```
Then set your environment variables:
```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."
```
### Custom Vault Configuration

For advanced use cases, you can configure a different vault system:
```typescript
import { SRE } from '@smythos/sdk/core';

SRE.init({
    Vault: {
        Connector: 'JSONFileVault',
        Settings: {
            file: './custom-vault.json',
        },
    },
});
```
Or use cloud-based secret management systems like AWS Secrets Manager for production environments.
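For illustration only, a production setup might swap the JSON file connector for a managed secrets backend. The connector name `'AWSSecretsManager'` and its settings below are hypothetical, not confirmed SDK API; check the SRE documentation for the connectors your version actually ships:

```typescript
import { SRE } from '@smythos/sdk/core';

// Hypothetical connector name and settings - verify against the SRE docs
SRE.init({
    Vault: {
        Connector: 'AWSSecretsManager', // assumed connector name
        Settings: {
            region: 'us-east-1',            // assumed setting
            secretName: 'smythos/llm-keys', // assumed setting
        },
    },
});
```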
### Multi-Team Vaults

The vault supports multiple teams with isolated API keys:
```json
{
    "default": {
        "openai": "sk-default-key..."
    },
    "team-production": {
        "openai": "sk-prod-key...",
        "anthropic": "sk-ant-prod-..."
    },
    "team-development": {
        "openai": "sk-dev-key..."
    }
}
```
Agents automatically use keys from their assigned team's vault section.
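As a hedged sketch of how that assignment might look in code, assuming the `Agent` constructor accepts a `teamId` option (verify this against your SDK version), an agent could be bound to the production team's keys like this:

```typescript
// Assumption: Agent settings accept a teamId - verify in your SDK version
const prodAgent = new Agent({
    name: 'Production Agent',
    behavior: 'You are a helpful assistant',
    teamId: 'team-production', // resolves keys from the "team-production" vault section
    model: 'gpt-4o',
});
```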
## Best Practices

- **Match `temperature` to the task:**
  - Factual, deterministic tasks: `temperature: 0.0 - 0.3`
  - General-purpose tasks: `temperature: 0.5 - 0.7`
  - Creative tasks: `temperature: 0.8 - 1.2`
- **Control costs:** set appropriate `maxTokens` limits to prevent unexpectedly long (and costly) responses.
- **Test multiple providers:** different providers excel at different tasks. Test multiple options to find the best fit for your specific use case.

## Examples

### Cost-Effective Agent

```typescript
import { Agent, Model } from '@smythos/sdk';

const agent = new Agent({
    name: 'Budget Assistant',
    behavior: 'You are a helpful assistant',
    // Use GPT-3.5 for cost efficiency
    model: Model.OpenAI('gpt-3.5-turbo', {
        temperature: 0.5,
        maxTokens: 500, // Keep responses concise
    }),
});
```
### Research Agent

```typescript
const agent = new Agent({
    name: 'Research Assistant',
    behavior: 'You are an expert researcher',
    // Use Claude Opus for complex reasoning
    model: Model.Anthropic('claude-3-opus', {
        temperature: 0.3, // More focused
        maxTokens: 4096, // Allow detailed responses
    }),
});
```
### Local Development Agent

```typescript
const agent = new Agent({
    name: 'Dev Assistant',
    behavior: 'You are a coding assistant',
    // Use Ollama for local development
    model: Model.Ollama('llama3.2', {
        temperature: 0.7,
        numCtx: 4096,
    }),
});
```
### Fast-Response Agent

```typescript
const agent = new Agent({
    name: 'Quick Responder',
    behavior: 'You provide quick answers',
    // Use Groq for ultra-fast inference
    model: Model.Groq('llama-3.1-70b-versatile', {
        temperature: 0.5,
        maxTokens: 1024,
    }),
});
```
### Overriding an Imported Workflow's Model

```typescript
import { Agent, Model } from '@smythos/sdk';
import path from 'path';

const agentPath = path.resolve(__dirname, './workflow-agent.smyth');

// Import the workflow but override the model
const agent = Agent.import(agentPath, {
    model: Model.Anthropic('claude-3.5-sonnet', {
        temperature: 0.8,
    }),
});
```
## Next Steps

Now that you understand model configuration, explore the other guides in this SDK documentation.