# OpenWebUI Discord Bot - Upgrade Project

## Project Overview

This Discord bot currently interfaces with OpenWebUI to provide AI-powered responses. The goal is to upgrade it to:

1. **Switch from OpenWebUI to LiteLLM Proxy** as the backend
2. **Add MCP (Model Context Protocol) Tool Support**
3. **Implement system prompt management within the application**

## Current Architecture

### Files Structure

- **Main bot**: [v2/bot.py](v2/bot.py) - Current implementation
- **Legacy bot**: [scripts/discordbot.py](scripts/discordbot.py) - Older version with slightly different approach
- **Dependencies**: [v2/requirements.txt](v2/requirements.txt)
- **Config**: [v2/.env.example](v2/.env.example)

### Current Implementation Details

#### Bot Features (v2/bot.py)

- **Discord Integration**: Uses discord.py with message intents
- **Trigger Methods**:
  - Bot mentions (@bot)
  - Direct messages (DMs)
- **Message History**: Retrieves last 100 messages for context using `get_chat_history()`
- **Image Support**: Downloads and encodes images as base64, sends to API
- **API Client**: Uses OpenAI Python SDK pointing to OpenWebUI endpoint
- **Message Format**: Embeds chat history in user message context

#### Current Message Flow

1. User mentions bot or DMs it
2. Bot fetches channel history (last 100 messages)
3. Formats history as: `"AuthorName: message content"`
4. Sends to OpenWebUI with format:
```python
{
    "role": "user",
    "content": [
        {"type": "text", "text": "##CONTEXT##\n{history}\n##ENDCONTEXT##\n\n{user_message}"},
        {"type": "image_url", "image_url": {...}}  # if images present
    ]
}
```
5. Returns AI response and replies to user

#### Current Limitations

- **No system prompt**: Context is embedded in user messages
- **No tool calling**: Cannot execute functions or use MCPs
- **OpenWebUI dependency**: Tightly coupled to OpenWebUI API structure
- **Simple history**: Just text concatenation, no proper conversation threading
- **Synchronous image download**: Uses `requests.get()` in async context (should use aiohttp; see the sketch below)
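
As a quick illustration of that last fix, here is a minimal async download sketch; the function name and the base64 return format are assumptions for illustration, not the current code:

```python
import base64

import aiohttp

async def download_image_b64(url: str) -> str:
    """Download an image and return it base64-encoded (illustrative sketch)."""
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            resp.raise_for_status()
            data = await resp.read()
    return base64.b64encode(data).decode("utf-8")
```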

## Target Architecture: LiteLLM + MCP Tools

### Why LiteLLM?

LiteLLM is a unified proxy that offers:

- **Standardized API calls** across 100+ LLM providers (OpenAI, Anthropic, Google, etc.)
- **Native tool/function calling support** via an OpenAI-compatible API
- **Built-in MCP support** for Model Context Protocol tools
- **Load balancing** and fallback between models
- **Cost tracking** and usage analytics
- **Streaming support** for real-time responses

### LiteLLM Tool Calling

LiteLLM supports the OpenAI tools format:
```python
response = client.chat.completions.create(
    model="gpt-4",
    messages=[...],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {...}
        }
    }],
    tool_choice="auto"
)
```

### MCP (Model Context Protocol) Overview

MCP is a standard protocol for:

- **Exposing tools** to LLMs (functions they can call)
- **Providing resources** (files, APIs, databases)
- **Prompts/templates** for consistent interactions
- **Sampling** for multi-step agentic behavior

**MCP Server Examples**:

- `filesystem`: Read/write files
- `github`: Access repos, create PRs
- `postgres`: Query databases
- `brave-search`: Web search
- `slack`: Send messages, read channels

## Upgrade Plan

### Phase 1: Switch to LiteLLM Proxy

#### Configuration Changes

1. Update environment variables:
```env
DISCORD_TOKEN=your_discord_bot_token
LITELLM_API_KEY=your_litellm_api_key
LITELLM_API_BASE=http://localhost:4000 # or your LiteLLM proxy URL
MODEL_NAME=gpt-4-turbo-preview # or any LiteLLM-supported model
SYSTEM_PROMPT=your_default_system_prompt # New!
```

2. Keep using the OpenAI SDK (LiteLLM is OpenAI-compatible):
```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv('LITELLM_API_KEY'),
    base_url=os.getenv('LITELLM_API_BASE')
)
```

#### Message Format Refactor

**Current approach** (embedding context in user message):
```python
text_content = f"##CONTEXT##\n{context}\n##ENDCONTEXT##\n\n{user_message}"
messages = [{"role": "user", "content": text_content}]
```

**New approach** (proper conversation history):
```python
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    # ... previous conversation messages with proper roles ...
    {"role": "user", "content": user_message}
]
```
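
As a sketch of how the bot might assemble this structure from Discord channel history — the function name, the oldest-first ordering, and the author-prefix convention are assumptions, not existing code:

```python
def build_messages(system_prompt, history, bot_user, user_message):
    """Map Discord messages onto OpenAI-style roles (illustrative sketch)."""
    messages = [{"role": "system", "content": system_prompt}]
    for msg in history:  # assumed oldest first
        if msg.author == bot_user:
            messages.append({"role": "assistant", "content": msg.content})
        else:
            # Prefix the author name so the model can tell speakers apart
            messages.append(
                {"role": "user", "content": f"{msg.author.display_name}: {msg.content}"}
            )
    messages.append({"role": "user", "content": user_message})
    return messages
```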

#### Benefits

- Better model understanding of conversation structure
- Separate system instructions from conversation
- Proper role attribution (user vs assistant)
- More efficient token usage

### Phase 2: Add System Prompt Management

#### Implementation Options

**Option A: Simple Environment Variable**

- Store in `.env` file
- Good for: Single, static system prompt
- Example: `SYSTEM_PROMPT="You are a helpful Discord assistant..."`

**Option B: File-Based System Prompt**

- Store in separate file (e.g., `system_prompt.txt`)
- Good for: Long, complex prompts that need version control
- Hot-reload capability (see the sketch below)
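
A minimal hot-reload sketch based on the file's modification time; the file name and the cache variables are assumptions:

```python
import os

_PROMPT_FILE = "system_prompt.txt"
_cached_prompt = None
_cached_mtime = 0.0

def get_system_prompt() -> str:
    """Reload the prompt file only when it changes on disk."""
    global _cached_prompt, _cached_mtime
    mtime = os.path.getmtime(_PROMPT_FILE)
    if _cached_prompt is None or mtime > _cached_mtime:
        with open(_PROMPT_FILE, "r", encoding="utf-8") as f:
            _cached_prompt = f.read().strip()
        _cached_mtime = mtime
    return _cached_prompt
```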

**Option C: Per-Channel/Per-Guild Prompts**

- Store in JSON/database mapping channel_id → system_prompt
- Good for: Multi-tenant bot with different personalities per server
- Example:
```json
{
  "123456789": "You are a coding assistant...",
  "987654321": "You are a gaming buddy..."
}
```

**Option D: User-Configurable Prompts**

- Discord slash commands to set/view system prompt
- Store in SQLite/JSON
- Commands: `/setprompt`, `/viewprompt`, `/resetprompt`

**Recommended**: Start with Option B (file-based), add Option D later for flexibility.
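
When Option D comes up, a slash command along these lines could back it; `PROMPTS_FILE` and the command wiring are assumptions for illustration:

```python
import json

import discord
from discord import app_commands

PROMPTS_FILE = "system_prompts.json"  # hypothetical JSON store

@app_commands.command(name="setprompt", description="Set this server's system prompt")
async def setprompt(interaction: discord.Interaction, prompt: str):
    try:
        with open(PROMPTS_FILE, "r", encoding="utf-8") as f:
            prompts = json.load(f)
    except FileNotFoundError:
        prompts = {}
    prompts[str(interaction.guild_id)] = prompt
    with open(PROMPTS_FILE, "w", encoding="utf-8") as f:
        json.dump(prompts, f, indent=2)
    await interaction.response.send_message("System prompt updated.", ephemeral=True)

# Registration (assumed): bot.tree.add_command(setprompt); await bot.tree.sync()
```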

#### System Prompt Best Practices

1. **Define bot personality**: Tone, style, formality
2. **Set boundaries**: What bot should/shouldn't do
3. **Provide context**: "You are in a Discord server, users will mention you"
4. **Handle images**: "When users attach images, describe them..."
5. **Tool usage guidance**: "Use available tools when appropriate"

Example system prompt:
```
You are a helpful AI assistant integrated into Discord. Users will interact with you by mentioning you or sending direct messages.

Key behaviors:
- Be concise and friendly
- Use Discord markdown formatting when helpful (code blocks, bold, etc.)
- When users attach images, analyze them and provide relevant insights
- You have access to various tools - use them when they would help answer the user's question
- If you're unsure about something, say so
- Keep track of conversation context

You are not a human, and you should not pretend to be one. Be honest about your capabilities and limitations.
```

### Phase 3: Implement MCP Tool Support

#### LiteLLM MCP Integration

LiteLLM can connect to MCP servers in two ways:

**1. Via LiteLLM Proxy Configuration**

Configure in `litellm_config.yaml`:
```yaml
model_list:
  - model_name: gpt-4-with-tools
    litellm_params:
      model: gpt-4-turbo-preview
      api_key: os.environ/OPENAI_API_KEY

mcp_servers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/allowed/path"]
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_TOKEN: ${GITHUB_TOKEN}
```

**2. Via Direct Tool Definitions in Bot**

Define tools manually in the bot code:
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    }
                },
                "required": ["query"]
            }
        }
    }
]

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=messages,
    tools=tools,
    tool_choice="auto"
)
```

#### Tool Execution Flow

1. **Send message with tools available**:
```python
response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=messages,
    tools=available_tools
)
```

2. **Check if model wants to use a tool**:
```python
if response.choices[0].message.tool_calls:
    for tool_call in response.choices[0].message.tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)
        # Execute the function
        result = execute_tool(function_name, arguments)
```

3. **Send tool results back to model**:
```python
messages.append({
    "role": "assistant",
    "content": None,
    "tool_calls": response.choices[0].message.tool_calls
})
messages.append({
    "role": "tool",
    "content": json.dumps(result),
    "tool_call_id": tool_call.id
})

# Get final response
final_response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=messages,
    tools=available_tools
)
```

4. **Return final response to user**
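
Folded together, the whole flow becomes a single loop. This sketch assumes the async `execute_tool()` registry from Pattern 1 below, adds a round cap as a safety net, and relies on the openai>=1.x SDK accepting its own message objects when appended back — treat it as an outline, not a final implementation:

```python
import json

async def run_with_tools(client, messages, max_rounds=5):
    """Run the request/tool-call/response cycle until the model answers."""
    for _ in range(max_rounds):
        response = client.chat.completions.create(
            model=MODEL_NAME,
            messages=messages,
            tools=available_tools,
        )
        message = response.choices[0].message
        if not message.tool_calls:
            return message.content
        messages.append(message)  # echo the assistant's tool-call turn back
        for tool_call in message.tool_calls:
            result = await execute_tool(
                tool_call.function.name,
                json.loads(tool_call.function.arguments),
            )
            messages.append({
                "role": "tool",
                "content": json.dumps(result),
                "tool_call_id": tool_call.id,
            })
    return "Stopped after too many tool rounds."
```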

#### Tool Implementation Patterns

**Pattern 1: Bot-Managed Tools**

Implement tools directly in the bot:
```python
async def search_web(query: str) -> str:
    """Execute web search"""
    # Use requests/aiohttp to call search API
    pass

async def get_weather(location: str) -> str:
    """Get weather for location"""
    # Call weather API
    pass

AVAILABLE_TOOLS = {
    "search_web": search_web,
    "get_weather": get_weather,
}

async def execute_tool(name: str, arguments: dict) -> str:
    if name in AVAILABLE_TOOLS:
        return await AVAILABLE_TOOLS[name](**arguments)
    return "Tool not found"
```

**Pattern 2: MCP Server Proxy**

Let LiteLLM proxy handle MCP servers (recommended):

- Configure MCP servers in LiteLLM config
- LiteLLM automatically exposes them as tools
- Bot just passes tool calls through
- Simpler bot code, more scalable

**Pattern 3: Hybrid**

- Common tools via LiteLLM proxy MCP
- Discord-specific tools in bot (e.g., "get_server_info", "list_channels")

#### Recommended Starter Tools

1. **Web Search** (via Brave/Google MCP server)
   - Let bot search for current information

2. **File Operations** (via filesystem MCP server - with restrictions!)
   - Read documentation, configs
   - Useful in developer-focused servers

3. **Wikipedia** (via wikipedia MCP server)
   - Factual information lookup

4. **Time/Date** (custom function)
   - Simple, no external dependency

5. **Discord Server Info** (custom function)
   - Get channel list, member count, server info
   - Discord-specific utility
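
The two custom functions (4 and 5) are small enough to sketch outright; the names and return formats here are illustrative, not decided:

```python
from datetime import datetime, timezone

import discord

async def get_time() -> str:
    """Current UTC time - no external dependency."""
    return datetime.now(timezone.utc).isoformat()

async def get_server_info(guild: discord.Guild) -> str:
    """Basic stats for the server the conversation is in."""
    return (
        f"Server: {guild.name}, members: {guild.member_count}, "
        f"text channels: {len(guild.text_channels)}"
    )
```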

### Phase 4: Improve Message History Management

#### Current Issues

- Fetches all messages every time (inefficient)
- No conversation threading (treats all channel messages as one context)
- No token limit awareness
- Channel history might contain irrelevant conversations

#### Improvements

**1. Per-Conversation Threading**
```python
# Track conversations by thread or by user
conversation_storage = {
    "channel_id:user_id": [
        {"role": "user", "content": "..."},
        {"role": "assistant", "content": "..."},
    ]
}
```
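
A capped variant of the same idea, using `deque(maxlen=...)` so old turns fall off automatically (the 40-message cap is an arbitrary choice):

```python
from collections import defaultdict, deque

conversation_storage = defaultdict(lambda: deque(maxlen=40))

def remember(channel_id: int, user_id: int, role: str, content: str):
    """Append one turn to the per-channel, per-user conversation."""
    conversation_storage[f"{channel_id}:{user_id}"].append(
        {"role": role, "content": content}
    )
```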

**2. Token-Aware History Truncation**
```python
def trim_history(messages, max_tokens=4000):
    """Keep only recent messages that fit in token budget"""
    # Use tiktoken to count tokens
    # Remove oldest messages until under limit
    pass
```
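
One way to fill in that stub with tiktoken; the `cl100k_base` encoding and the flat ~4-token-per-message overhead are rough assumptions, and messages are assumed to be plain dicts:

```python
import tiktoken

def trim_history(messages, max_tokens=4000):
    """Drop the oldest non-system messages until the history fits the budget."""
    enc = tiktoken.get_encoding("cl100k_base")

    def count(msg):
        return len(enc.encode(str(msg.get("content") or ""))) + 4  # rough overhead

    trimmed = list(messages)
    total = sum(count(m) for m in trimmed)
    while total > max_tokens and len(trimmed) > 1:
        # Preserve the system prompt at index 0 if present
        drop_at = 1 if trimmed[0].get("role") == "system" else 0
        total -= count(trimmed.pop(drop_at))
    return trimmed
```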

**3. Message Deduplication**

Only include messages directly related to bot conversations:

- Messages mentioning bot
- Bot's responses
- Optionally: X messages before each bot mention for context

**4. Caching & Persistence**

- Cache conversation history in memory
- Optional: Persist to SQLite/Redis to survive bot restarts
- Clear old conversations after inactivity (see the sketch below)
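
A sketch of the inactivity cleanup, assuming the `conversation_storage` mapping from the threading sketch above plus a last-used timestamp per conversation; the one-hour TTL is arbitrary:

```python
import time

last_used: dict = {}  # conversation key -> last activity timestamp

def purge_stale_conversations(ttl_seconds: float = 3600.0):
    """Drop conversations that have been idle longer than the TTL."""
    cutoff = time.time() - ttl_seconds
    for key in [k for k, t in last_used.items() if t < cutoff]:
        conversation_storage.pop(key, None)
        last_used.pop(key, None)
```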

## Implementation Checklist

### Preparation

- [ ] Set up LiteLLM proxy locally or remotely
- [ ] Configure LiteLLM with desired model(s)
- [ ] Decide on MCP servers to enable
- [ ] Design system prompt strategy
- [ ] Review token limits for target models

### Code Changes

#### File: v2/bot.py

- [ ] Update imports (add `json`, improve `aiohttp` usage)
- [ ] Change environment variables:
  - [ ] `OPENWEBUI_API_BASE` → `LITELLM_API_BASE`
  - [ ] Add `SYSTEM_PROMPT` or `SYSTEM_PROMPT_FILE`
- [ ] Update OpenAI client initialization
- [ ] Refactor `get_ai_response()`:
  - [ ] Add system message
  - [ ] Convert history to proper message format (alternating user/assistant)
  - [ ] Add tool support parameters
  - [ ] Implement tool execution loop
- [ ] Refactor `get_chat_history()`:
  - [ ] Return structured messages instead of text concatenation
  - [ ] Filter for bot-relevant messages
  - [ ] Add token counting/truncation
- [ ] Fix `download_image()` to use aiohttp instead of requests
- [ ] Add tool definition functions
- [ ] Add tool execution handler
- [ ] Add error handling for tool failures

#### New File: v2/tools.py (optional)

- [ ] Define tool schemas
- [ ] Implement tool execution functions
- [ ] Export tool registry

#### New File: v2/system_prompt.txt or system_prompts.json

- [ ] Write default system prompt
- [ ] Optional: Add per-guild prompts

#### File: v2/requirements.txt

- [ ] Keep: `discord.py`, `openai`, `python-dotenv`
- [ ] Add: `aiohttp` (if not using requests), `tiktoken` (for token counting)
- [ ] Optional: `anthropic` (if using Claude directly), `litellm` (if using the SDK directly)

#### File: v2/.env.example

- [ ] Update variable names
- [ ] Add system prompt variables
- [ ] Document new configuration options

### Testing

- [ ] Test basic message responses (no tools)
- [ ] Test with images attached
- [ ] Test tool calling with a simple tool (e.g., get_time)
- [ ] Test tool calling with an external MCP server
- [ ] Test conversation threading
- [ ] Test token limit handling
- [ ] Test error scenarios (API down, tool failure, etc.)
- [ ] Test in multiple Discord servers/channels

### Documentation

- [ ] Update README.md with new setup instructions
- [ ] Document LiteLLM proxy setup
- [ ] Document MCP server configuration
- [ ] Add example system prompts
- [ ] Document available tools
- [ ] Add troubleshooting section

## Technical Considerations

### Token Management

- Most models have 4k-128k token context windows
- Message history can quickly consume tokens
- Reserve tokens for:
  - System prompt: ~500-1000 tokens
  - Tool definitions: ~100-500 tokens per tool
  - Response: ~1000-2000 tokens
  - History: remaining tokens

### Rate Limiting

- Discord: 5 requests per 5 seconds per channel
- LLM APIs: Varies by provider, model, and account tier
- Implement queuing if needed (a simple sketch follows)
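
A per-channel semaphore is a lightweight way to serialize replies so one busy channel cannot burst past Discord's limit; this is an illustrative starting point rather than a full queue:

```python
import asyncio
from collections import defaultdict

channel_locks = defaultdict(asyncio.Semaphore)  # one permit per channel

async def answer_in_channel(channel_id: int, make_reply):
    """Run the make_reply() coroutine while holding this channel's permit."""
    async with channel_locks[channel_id]:
        return await make_reply()
```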

### Error Handling

- API timeouts: Retry with exponential backoff
- Tool execution failures: Return error message to model
- Discord API errors: Log and notify user
- Invalid tool calls: Validate before execution
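
A backoff sketch for the first bullet; the exception types assume the openai>=1.x SDK, and the attempt count and delays are arbitrary choices:

```python
import asyncio
import random

import openai

async def call_with_retry(make_request, attempts: int = 3):
    """Retry an LLM call with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return make_request()
        except (openai.APITimeoutError, openai.APIConnectionError):
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(2 ** attempt + random.random())
```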

### Security Considerations

- **Tool access control**: Don't expose dangerous tools (file delete, system commands)
- **Input validation**: Sanitize tool arguments
- **Rate limiting**: Prevent abuse of expensive tools (web search)
- **API key security**: Never log or expose API keys
- **MCP filesystem access**: Restrict to safe directories only

### Cost Optimization

- Use smaller models for simple queries (gpt-3.5-turbo)
- Implement streaming for better UX
- Cache common queries
- Trim history aggressively
- Consider LiteLLM's caching features

## Future Enhancements

### Short Term

- [ ] Add slash commands for bot configuration
- [ ] Implement conversation reset command
- [ ] Add support for Discord threads
- [ ] Stream responses for long outputs
- [ ] Add reaction-based tool approval (user confirms before execution)

### Medium Term

- [ ] Multi-modal support (voice, more image formats)
- [ ] Per-user conversation isolation
- [ ] Tool usage analytics and logging
- [ ] Custom MCP server for Discord-specific tools
- [ ] Web dashboard for bot management

### Long Term

- [ ] Agentic workflows (multi-step tool usage)
- [ ] Memory/RAG for long-term context
- [ ] Multiple bot personalities per server
- [ ] Integration with Discord's scheduled events
- [ ] Voice channel integration (TTS/STT)

## Resources

### Documentation

- **LiteLLM Docs**: https://docs.litellm.ai/
- **LiteLLM Tools/Functions**: https://docs.litellm.ai/docs/completion/function_call
- **MCP Specification**: https://modelcontextprotocol.io/
- **MCP Server Examples**: https://github.com/modelcontextprotocol/servers
- **Discord.py Docs**: https://discordpy.readthedocs.io/
- **OpenAI API Docs**: https://platform.openai.com/docs/guides/function-calling

### Example MCP Servers

- `@modelcontextprotocol/server-filesystem`: File operations
- `@modelcontextprotocol/server-github`: GitHub integration
- `@modelcontextprotocol/server-postgres`: Database queries
- `@modelcontextprotocol/server-brave-search`: Web search
- `@modelcontextprotocol/server-slack`: Slack integration
- `@modelcontextprotocol/server-memory`: Persistent memory

### Tools for Development

- **tiktoken**: Token counting (OpenAI tokenizer)
- **litellm CLI**: `litellm --model gpt-4 --drop_params` for testing
- **Postman**: Test LiteLLM API endpoints
- **Docker**: Containerize the LiteLLM proxy

## Questions to Resolve

1. **Which LiteLLM deployment?**
   - Self-hosted proxy (more control, more maintenance)
   - Hosted service (easier, potential cost)

2. **Which models to support?**
   - Single model (simpler)
   - Multiple models with fallback (more robust)
   - User-selectable models (more flexible)

3. **MCP server hosting?**
   - Same machine as bot
   - Separate server
   - Cloud functions

4. **System prompt strategy?**
   - Single global prompt
   - Per-guild prompts
   - User-configurable

5. **Tool approval flow?**
   - Automatic execution (faster but riskier)
   - User confirmation for sensitive tools (safer but slower)

6. **Conversation persistence?**
   - In-memory only (simple, lost on restart)
   - SQLite (persistent, moderate complexity)
   - Redis (distributed, more setup)

## Current Code Analysis

### v2/bot.py Strengths

- Clean, simple structure
- Proper async/await usage
- Good image handling
- Type hints in newer version

### v2/bot.py Issues to Fix

- Line 44: Uses synchronous `requests.get()` in an async function
- Lines 62-77: Embeds history in the user message instead of using a proper conversation format
- Line 41: `channel_history` dict declared but never used
- No error handling for OpenAI API errors beyond a generic try/except
- No rate limiting
- No conversation threading
- History includes ALL channel messages, not just bot-relevant ones
- No system prompt support

### scripts/discordbot.py Differences

- Has a system message (line 67) - the better approach!
- Slightly different message structure
- Otherwise similar implementation

## Recommended Migration Path

**Step 1**: Quick wins (minimal changes)

1. Add system prompt support using `scripts/discordbot.py` pattern
2. Fix async image download (use aiohttp)
3. Update env vars and client to point to LiteLLM

**Step 2**: Core refactor (moderate changes)

1. Refactor message history to proper conversation format
2. Implement token-aware history truncation
3. Add basic tool support infrastructure

**Step 3**: Tool integration (significant changes)

1. Define initial tool set
2. Implement tool execution loop
3. Add error handling for tool failures

**Step 4**: Polish (incremental improvements)

1. Add slash commands for configuration
2. Improve conversation management
3. Add monitoring and logging

This approach allows you to test at each step and provides incremental value.

---

## Getting Started

When you're ready to begin implementation:

1. **Set up LiteLLM proxy**:
```bash
pip install litellm
litellm --model gpt-4 --drop_params
# Or use Docker: docker run -p 4000:4000 ghcr.io/berriai/litellm:main
```

2. **Test the LiteLLM endpoint**:
```bash
curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}'
```

3. **Start with the system prompt**: Implement system prompt support first as a low-risk improvement

4. **Iterate on tools**: Start with one simple tool, then expand

Let me know which phase you'd like to tackle first!