Add MCP tools integration for Discord bot

Major improvements to the LiteLLM Discord bot, adding MCP (Model Context Protocol) tools support:

Features added:
- MCP tools discovery and integration with LiteLLM proxy
- Fetch and convert 40+ GitHub MCP tools to OpenAI format
- Tool calling flow with placeholder execution (pending MCP endpoint confirmation)
- Dynamic tool injection based on LiteLLM MCP server configuration
- Enhanced system prompt with tool usage guidance
- Added ENABLE_TOOLS environment variable for easy toggle
- Comprehensive debug logging for troubleshooting

Technical changes:
- Added httpx>=0.25.0 dependency for async MCP API calls
- Implemented get_available_mcp_tools() to query /v1/mcp/server and /v1/mcp/tools endpoints
- Convert MCP tool schemas to OpenAI function calling format
- Detect and handle tool_calls in model responses
- Added system_prompt.txt for customizable bot behavior
- Updated README with better documentation and setup instructions
- Created claude.md with detailed development notes and upgrade roadmap

Configuration:
- New ENABLE_TOOLS flag in .env to control MCP integration
- DEBUG_LOGGING for detailed execution logs
- System prompt file support for easy customization

Known limitations:
- Tool execution currently uses placeholders (MCP execution endpoint needs verification)
- Limited to 50 tools to avoid overwhelming the model
- Requires LiteLLM proxy with MCP server configured

Next steps:
- Verify correct LiteLLM MCP tool execution endpoint
- Implement actual tool execution via MCP proxy
- Test end-to-end GitHub operations through Discord

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-10 11:26:01 -08:00
parent 82fc9ea5f9
commit 408028c36e
9 changed files with 1323 additions and 87 deletions

Dockerfile

@@ -1,5 +1,5 @@
# Use Python 3 base image
FROM python:3.9-slim
FROM python:3.11-slim
# Set working directory
WORKDIR /app
@@ -10,11 +10,15 @@ RUN apt-get update && apt-get install -y \
libopus0 \
&& rm -rf /var/lib/apt/lists/*
# Install required packages
RUN pip install --no-cache-dir discord.py python-dotenv openai requests asyncio
# Copy requirements file
COPY /scripts/requirements.txt .
# Copy the bot script
# Install required packages from requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Copy the bot script and system prompt
COPY /scripts/discordbot.py .
COPY /scripts/system_prompt.txt .
# Run the bot on container start
CMD ["python", "discordbot.py"]

README.md

@@ -1,23 +1,214 @@
# OpenWebUI-Discordbot
# LiteLLM Discord Bot
A Discord bot that interfaces with an OpenWebUI instance to provide AI-powered responses in your Discord server.
A Discord bot that interfaces with LiteLLM proxy to provide AI-powered responses in your Discord server. Supports multiple LLM providers through LiteLLM, conversation history management, image analysis, and configurable system prompts.
## Features
- 🤖 **LiteLLM Integration**: Use any LLM provider supported by LiteLLM (OpenAI, Anthropic, Google, local models, etc.)
- 💬 **Conversation History**: Intelligent message history with token-aware truncation
- 🖼️ **Image Support**: Analyze images attached to messages (for vision-capable models)
- ⚙️ **Configurable System Prompts**: Customize bot behavior via file-based prompts
- 🔄 **Async Architecture**: Efficient async/await design for responsive interactions
- 🐳 **Docker Support**: Easy deployment with Docker
## Prerequisites
- Docker (for containerized deployment)
- Python 3.8 or higher (for local development)
- A Discord Bot Token ([How to create a Discord Bot Token](https://www.writebots.com/discord-bot-token/))
- Access to an OpenWebUI instance
- **Python 3.11+** (for local development) or **Docker** (for containerized deployment)
- **Discord Bot Token** ([How to create one](https://www.writebots.com/discord-bot-token/))
- **LiteLLM Proxy** instance running ([LiteLLM setup guide](https://docs.litellm.ai/docs/proxy/quick_start))
## Installation
## Quick Start
##### Running locally
### Option 1: Running with Docker (Recommended)
1. Clone the repository
2. Copy `.env.sample` to `.env` and configure your environment variables:
```env
DISCORD_TOKEN=your_discord_bot_token
OPENAI_API_KEY=your_openwebui_api_key
OPENWEBUI_API_BASE=http://your_openwebui_instance:port/api
MODEL_NAME=your_model_name
1. Clone the repository:
```bash
git clone <repository-url>
cd OpenWebUI-Discordbot
```
2. Configure environment variables:
```bash
cd scripts
cp .env.sample .env
# Edit .env with your actual values
```
3. Build and run with Docker:
```bash
docker build -t discord-bot .
docker run --env-file scripts/.env discord-bot
```
### Option 2: Running Locally
1. Clone the repository and navigate to scripts directory:
```bash
git clone <repository-url>
cd OpenWebUI-Discordbot/scripts
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Copy and configure environment variables:
```bash
cp .env.sample .env
# Edit .env with your configuration
```
4. Run the bot:
```bash
python discordbot.py
```
## Configuration
### Environment Variables
Create a `.env` file in the `scripts/` directory with the following variables:
```env
# Discord Bot Token - Get from https://discord.com/developers/applications
DISCORD_TOKEN=your_discord_bot_token
# LiteLLM API Configuration
LITELLM_API_KEY=sk-1234
LITELLM_API_BASE=http://localhost:4000
# Model name (any model supported by your LiteLLM proxy)
MODEL_NAME=gpt-4-turbo-preview
# System Prompt Configuration (optional)
SYSTEM_PROMPT_FILE=./system_prompt.txt
# Maximum tokens to use for conversation history (optional, default: 3000)
MAX_HISTORY_TOKENS=3000
```
### System Prompt Customization
The bot's behavior is controlled by a system prompt file. Edit `scripts/system_prompt.txt` to customize how the bot responds:
```txt
You are a helpful AI assistant integrated into Discord. Users will interact with you by mentioning you or sending direct messages.
Key behaviors:
- Be concise and friendly in your responses
- Use Discord markdown formatting when helpful (code blocks, bold, italics, etc.)
- When users attach images, analyze them and provide relevant insights
...
```
## Setting Up LiteLLM Proxy
### Quick Setup (Local)
1. Install LiteLLM:
```bash
pip install litellm
```
2. Run the proxy:
```bash
litellm --model gpt-4-turbo-preview --api_key YOUR_OPENAI_KEY
# Or for local models:
litellm --model ollama/llama3.2-vision
```
### Production Setup (Docker)
```bash
docker run -p 4000:4000 \
-e OPENAI_API_KEY=your_key \
ghcr.io/berriai/litellm:main-latest
```
For advanced configuration, create a `litellm_config.yaml`:
```yaml
model_list:
  - model_name: gpt-4-turbo
    litellm_params:
      model: gpt-4-turbo-preview
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude
    litellm_params:
      model: claude-3-sonnet-20240229
      api_key: os.environ/ANTHROPIC_API_KEY
```
Then run:
```bash
litellm --config litellm_config.yaml
```
See [LiteLLM documentation](https://docs.litellm.ai/) for more details.
## Usage
### Triggering the Bot
The bot responds to:
- **@mentions** in any channel where it has read access
- **Direct messages (DMs)**
Example:
```
User: @BotName what's the weather like?
Bot: I don't have access to real-time weather data, but I can help you with other questions!
```
### Image Analysis
Attach images to your message (requires vision-capable model):
```
User: @BotName what's in this image? [image.png]
Bot: The image shows a beautiful sunset over the ocean with...
```
### Message History
The bot automatically maintains conversation context:
- Retrieves recent relevant messages from the channel
- Limits history based on token count (configurable via `MAX_HISTORY_TOKENS`)
- Only includes messages where the bot was mentioned or bot's own responses
## Architecture Overview
### Key Improvements from OpenWebUI Version
1. **LiteLLM Integration**: Switched from OpenWebUI to LiteLLM for broader model support
2. **Proper Conversation Format**: Messages use correct role attribution (system/user/assistant)
3. **Token-Aware History**: Intelligent truncation to stay within model context limits
4. **Async Image Downloads**: Uses `aiohttp` instead of synchronous `requests`
5. **File-Based System Prompts**: Easy customization without code changes
6. **Better Error Handling**: Improved error messages and validation
### Project Structure
```
OpenWebUI-Discordbot/
├── scripts/
│   ├── discordbot.py       # Main bot code (production)
│   ├── system_prompt.txt   # System prompt configuration
│   ├── requirements.txt    # Python dependencies
│   └── .env.sample         # Environment variable template
├── v2/
│   └── bot.py              # Development/experimental version
├── Dockerfile              # Docker containerization
├── README.md               # This file
└── claude.md               # Development roadmap & upgrade notes
```
## Upgrading from OpenWebUI
If you're upgrading from the previous OpenWebUI version:
1. **Update environment variables**: Rename `OPENWEBUI_API_BASE` → `LITELLM_API_BASE` and `OPENAI_API_KEY` → `LITELLM_API_KEY`
2. **Set up LiteLLM proxy**: Follow setup instructions above
3. **Install new dependencies**: Run `pip install -r requirements.txt`
4. **Optional**: Customize `system_prompt.txt` for your use case
See `claude.md` for detailed upgrade documentation and future roadmap (MCP tools support, etc.).

claude.md (new file)

@@ -0,0 +1,649 @@
# OpenWebUI Discord Bot - Upgrade Project
## Project Overview
This Discord bot currently interfaces with OpenWebUI to provide AI-powered responses. The goal is to upgrade it to:
1. **Switch from OpenWebUI to LiteLLM Proxy** as the backend
2. **Add MCP (Model Context Protocol) Tool Support**
3. **Implement system prompt management within the application**
## Current Architecture
### Files Structure
- **Main bot**: [v2/bot.py](v2/bot.py) - Current implementation
- **Legacy bot**: [scripts/discordbot.py](scripts/discordbot.py) - Older version with slightly different approach
- **Dependencies**: [v2/requirements.txt](v2/requirements.txt)
- **Config**: [v2/.env.example](v2/.env.example)
### Current Implementation Details
#### Bot Features (v2/bot.py)
- **Discord Integration**: Uses discord.py with message intents
- **Trigger Methods**:
- Bot mentions (@bot)
- Direct messages (DMs)
- **Message History**: Retrieves last 100 messages for context using `get_chat_history()`
- **Image Support**: Downloads and encodes images as base64, sends to API
- **API Client**: Uses OpenAI Python SDK pointing to OpenWebUI endpoint
- **Message Format**: Embeds chat history in user message context
#### Current Message Flow
1. User mentions bot or DMs it
2. Bot fetches channel history (last 100 messages)
3. Formats history as: `"AuthorName: message content"`
4. Sends to OpenWebUI with format:
```python
{
    "role": "user",
    "content": [
        {"type": "text", "text": "##CONTEXT##\n{history}\n##ENDCONTEXT##\n\n{user_message}"},
        {"type": "image_url", "image_url": {...}}  # if images present
    ]
}
```
5. Returns AI response and replies to user
#### Current Limitations
- **No system prompt**: Context is embedded in user messages
- **No tool calling**: Cannot execute functions or use MCPs
- **OpenWebUI dependency**: Tightly coupled to OpenWebUI API structure
- **Simple history**: Just text concatenation, no proper conversation threading
- **Synchronous image download**: Uses `requests.get()` in async context (should use aiohttp)
## Target Architecture: LiteLLM + MCP Tools
### Why LiteLLM?
LiteLLM is a unified proxy that:
- **Standardizes API calls** across 100+ LLM providers (OpenAI, Anthropic, Google, etc.)
- **Native tool/function calling support** via OpenAI-compatible API
- **Built-in MCP support** for Model Context Protocol tools
- **Load balancing** and fallback between models
- **Cost tracking** and usage analytics
- **Streaming support** for real-time responses
### LiteLLM Tool Calling
LiteLLM supports the OpenAI tools format:
```python
response = client.chat.completions.create(
    model="gpt-4",
    messages=[...],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {...}
        }
    }],
    tool_choice="auto"
)
```
### MCP (Model Context Protocol) Overview
MCP is a standard protocol for:
- **Exposing tools** to LLMs (functions they can call)
- **Providing resources** (files, APIs, databases)
- **Prompts/templates** for consistent interactions
- **Sampling** for multi-step agentic behavior
**MCP Server Examples**:
- `filesystem`: Read/write files
- `github`: Access repos, create PRs
- `postgres`: Query databases
- `brave-search`: Web search
- `slack`: Send messages, read channels
## Upgrade Plan
### Phase 1: Switch to LiteLLM Proxy
#### Configuration Changes
1. Update environment variables:
```env
DISCORD_TOKEN=your_discord_bot_token
LITELLM_API_KEY=your_litellm_api_key
LITELLM_API_BASE=http://localhost:4000 # or your LiteLLM proxy URL
MODEL_NAME=gpt-4-turbo-preview # or any LiteLLM-supported model
SYSTEM_PROMPT=your_default_system_prompt # New!
```
2. Keep using OpenAI SDK (LiteLLM is OpenAI-compatible):
```python
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv('LITELLM_API_KEY'),
    base_url=os.getenv('LITELLM_API_BASE')
)
```
#### Message Format Refactor
**Current approach** (embedding context in user message):
```python
text_content = f"##CONTEXT##\n{context}\n##ENDCONTEXT##\n\n{user_message}"
messages = [{"role": "user", "content": text_content}]
```
**New approach** (proper conversation history):
```python
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    # ... previous conversation messages with proper roles ...
    {"role": "user", "content": user_message}
]
```
#### Benefits
- Better model understanding of conversation structure
- Separate system instructions from conversation
- Proper role attribution (user vs assistant)
- More efficient token usage
### Phase 2: Add System Prompt Management
#### Implementation Options
**Option A: Simple Environment Variable**
- Store in `.env` file
- Good for: Single, static system prompt
- Example: `SYSTEM_PROMPT="You are a helpful Discord assistant..."`
**Option B: File-Based System Prompt**
- Store in separate file (e.g., `system_prompt.txt`)
- Good for: Long, complex prompts that need version control
- Hot-reload capability
**Option C: Per-Channel/Per-Guild Prompts**
- Store in JSON/database mapping channel_id → system_prompt
- Good for: Multi-tenant bot with different personalities per server
- Example:
```json
{
    "123456789": "You are a coding assistant...",
    "987654321": "You are a gaming buddy..."
}
```
**Option D: User-Configurable Prompts**
- Discord slash commands to set/view system prompt
- Store in SQLite/JSON
- Commands: `/setprompt`, `/viewprompt`, `/resetprompt`
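A sketch of how Option D's `/setprompt` could look with discord.py app commands; the JSON storage file and registration details are illustrative, not part of the current code:
```python
import json
import discord
from discord import app_commands

PROMPTS_FILE = "guild_prompts.json"  # hypothetical storage location

def load_prompts() -> dict:
    try:
        with open(PROMPTS_FILE, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

@app_commands.command(name="setprompt", description="Set this server's system prompt")
async def setprompt(interaction: discord.Interaction, prompt: str):
    prompts = load_prompts()
    prompts[str(interaction.guild_id)] = prompt
    with open(PROMPTS_FILE, "w", encoding="utf-8") as f:
        json.dump(prompts, f, indent=2)
    await interaction.response.send_message("System prompt updated.", ephemeral=True)

# Registration (in bot setup): bot.tree.add_command(setprompt),
# then `await bot.tree.sync()` once the bot is ready.
```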
**Recommended**: Start with Option B (file-based), add Option D later for flexibility.
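A minimal sketch of the recommended file-based approach (Option B) with hot-reload via a modification-time check; the path and fallback text are illustrative:
```python
import os

PROMPT_FILE = "system_prompt.txt"  # illustrative path
_cached_prompt = None
_cached_mtime = 0.0

def get_system_prompt() -> str:
    """Return the system prompt, re-reading the file if it changed on disk."""
    global _cached_prompt, _cached_mtime
    try:
        mtime = os.path.getmtime(PROMPT_FILE)
        if _cached_prompt is None or mtime > _cached_mtime:
            with open(PROMPT_FILE, "r", encoding="utf-8") as f:
                _cached_prompt = f.read().strip()
            _cached_mtime = mtime
    except FileNotFoundError:
        _cached_prompt = "You are a helpful AI assistant integrated into Discord."
    return _cached_prompt
```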
#### System Prompt Best Practices
1. **Define bot personality**: Tone, style, formality
2. **Set boundaries**: What bot should/shouldn't do
3. **Provide context**: "You are in a Discord server, users will mention you"
4. **Handle images**: "When users attach images, describe them..."
5. **Tool usage guidance**: "Use available tools when appropriate"
Example system prompt:
```
You are a helpful AI assistant integrated into Discord. Users will interact with you by mentioning you or sending direct messages.
Key behaviors:
- Be concise and friendly
- Use Discord markdown formatting when helpful (code blocks, bold, etc.)
- When users attach images, analyze them and provide relevant insights
- You have access to various tools - use them when they would help answer the user's question
- If you're unsure about something, say so
- Keep track of conversation context
You are not a human, and you should not pretend to be one. Be honest about your capabilities and limitations.
```
### Phase 3: Implement MCP Tool Support
#### LiteLLM MCP Integration
LiteLLM can connect to MCP servers in two ways:
**1. Via LiteLLM Proxy Configuration**
Configure in `litellm_config.yaml`:
```yaml
model_list:
  - model_name: gpt-4-with-tools
    litellm_params:
      model: gpt-4-turbo-preview
      api_key: os.environ/OPENAI_API_KEY

mcp_servers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/allowed/path"]
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_TOKEN: ${GITHUB_TOKEN}
```
**2. Via Direct Tool Definitions in Bot**
Define tools manually in the bot code:
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    }
                },
                "required": ["query"]
            }
        }
    }
]

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=messages,
    tools=tools,
    tool_choice="auto"
)
```
#### Tool Execution Flow
1. **Send message with tools available**:
```python
response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=messages,
    tools=available_tools
)
```
2. **Check if model wants to use a tool**:
```python
if response.choices[0].message.tool_calls:
    for tool_call in response.choices[0].message.tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)
        # Execute the function
        result = execute_tool(function_name, arguments)
```
3. **Send tool results back to model**:
```python
messages.append({
    "role": "assistant",
    "content": None,
    "tool_calls": response.choices[0].message.tool_calls
})
messages.append({
    "role": "tool",
    "content": json.dumps(result),
    "tool_call_id": tool_call.id
})

# Get final response
final_response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=messages,
    tools=available_tools
)
```
4. **Return final response to user**
#### Tool Implementation Patterns
**Pattern 1: Bot-Managed Tools**
Implement tools directly in the bot:
```python
async def search_web(query: str) -> str:
    """Execute web search"""
    # Use requests/aiohttp to call search API
    pass

async def get_weather(location: str) -> str:
    """Get weather for location"""
    # Call weather API
    pass

AVAILABLE_TOOLS = {
    "search_web": search_web,
    "get_weather": get_weather,
}

async def execute_tool(name: str, arguments: dict) -> str:
    if name in AVAILABLE_TOOLS:
        return await AVAILABLE_TOOLS[name](**arguments)
    return "Tool not found"
```
**Pattern 2: MCP Server Proxy**
Let LiteLLM proxy handle MCP servers (recommended):
- Configure MCP servers in LiteLLM config
- LiteLLM automatically exposes them as tools
- Bot just passes tool calls through
- Simpler bot code, more scalable
**Pattern 3: Hybrid**
- Common tools via LiteLLM proxy MCP
- Discord-specific tools in bot (e.g., "get_server_info", "list_channels")
#### Recommended Starter Tools
1. **Web Search** (via Brave/Google MCP server)
- Let bot search for current information
2. **File Operations** (via filesystem MCP server - with restrictions!)
- Read documentation, configs
- Useful in developer-focused servers
3. **Wikipedia** (via wikipedia MCP server)
- Factual information lookup
4. **Time/Date** (custom function)
- Simple, no external dependency
5. **Discord Server Info** (custom function)
- Get channel list, member count, server info
- Discord-specific utility
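For example, the zero-dependency time/date tool (item 4 above) could pair an OpenAI-format schema with a local async implementation; names here are illustrative:
```python
import datetime

# OpenAI function-calling schema for a simple time/date tool
GET_TIME_TOOL = {
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": "Get the current UTC date and time",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
}

async def get_current_time() -> str:
    """Return the current UTC time as an ISO-8601 string."""
    return datetime.datetime.now(datetime.timezone.utc).isoformat()
```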
### Phase 4: Improve Message History Management
#### Current Issues
- Fetches all messages every time (inefficient)
- No conversation threading (treats all channel messages as one context)
- No token limit awareness
- Channel history might contain irrelevant conversations
#### Improvements
**1. Per-Conversation Threading**
```python
# Track conversations by thread or by user
conversation_storage = {
    "channel_id:user_id": [
        {"role": "user", "content": "..."},
        {"role": "assistant", "content": "..."},
    ]
}
```
**2. Token-Aware History Truncation**
```python
def trim_history(messages, max_tokens=4000):
    """Keep only recent messages that fit in token budget"""
    encoding = tiktoken.get_encoding("cl100k_base")  # tiktoken required
    kept, total = [], 0
    for msg in reversed(messages):  # walk newest-to-oldest
        total += len(encoding.encode(msg["content"]))
        if total > max_tokens:
            break  # drop this and all older messages
        kept.append(msg)
    return list(reversed(kept))  # restore chronological order
```
**3. Message Deduplication**
Only include messages directly related to bot conversations:
- Messages mentioning bot
- Bot's responses
- Optionally: X messages before each bot mention for context
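A sketch of that filter using discord.py message objects (the helper name is illustrative):
```python
import discord

def is_bot_relevant(message: discord.Message, bot_user: discord.ClientUser) -> bool:
    """Keep only messages tied to bot conversations:
    the bot's own replies, or user messages that mention the bot."""
    if message.author.id == bot_user.id:
        return True
    return any(mention.id == bot_user.id for mention in message.mentions)
```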
**4. Caching & Persistence**
- Cache conversation history in memory
- Optional: Persist to SQLite/Redis for bot restarts
- Clear old conversations after inactivity
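For the SQLite option, persistence could be as small as this sketch (the database filename and schema are assumptions):
```python
import json
import sqlite3

conn = sqlite3.connect("conversations.db")  # hypothetical database file
conn.execute(
    "CREATE TABLE IF NOT EXISTS conversations (key TEXT PRIMARY KEY, messages TEXT)"
)

def save_conversation(key: str, messages: list) -> None:
    """Persist one conversation (key = 'channel_id:user_id') as a JSON blob."""
    conn.execute("INSERT OR REPLACE INTO conversations VALUES (?, ?)",
                 (key, json.dumps(messages)))
    conn.commit()

def load_conversation(key: str) -> list:
    """Load a conversation, or return an empty history if none is stored."""
    row = conn.execute("SELECT messages FROM conversations WHERE key = ?",
                       (key,)).fetchone()
    return json.loads(row[0]) if row else []
```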
## Implementation Checklist
### Preparation
- [ ] Set up LiteLLM proxy locally or remotely
- [ ] Configure LiteLLM with desired model(s)
- [ ] Decide on MCP servers to enable
- [ ] Design system prompt strategy
- [ ] Review token limits for target models
### Code Changes
#### File: v2/bot.py
- [ ] Update imports (add `json`, improve `aiohttp` usage)
- [ ] Change environment variables:
- [ ] `OPENWEBUI_API_BASE` → `LITELLM_API_BASE`
- [ ] Add `SYSTEM_PROMPT` or `SYSTEM_PROMPT_FILE`
- [ ] Update OpenAI client initialization
- [ ] Refactor `get_ai_response()`:
- [ ] Add system message
- [ ] Convert history to proper message format (alternating user/assistant)
- [ ] Add tool support parameters
- [ ] Implement tool execution loop
- [ ] Refactor `get_chat_history()`:
- [ ] Return structured messages instead of text concatenation
- [ ] Filter for bot-relevant messages
- [ ] Add token counting/truncation
- [ ] Fix `download_image()` to use aiohttp instead of requests
- [ ] Add tool definition functions
- [ ] Add tool execution handler
- [ ] Add error handling for tool failures
#### New File: v2/tools.py (optional)
- [ ] Define tool schemas
- [ ] Implement tool execution functions
- [ ] Export tool registry
#### New File: v2/system_prompt.txt or system_prompts.json
- [ ] Write default system prompt
- [ ] Optional: Add per-guild prompts
#### File: v2/requirements.txt
- [ ] Keep: `discord.py`, `openai`, `python-dotenv`
- [ ] Add: `aiohttp` (if not using requests), `tiktoken` (for token counting)
- [ ] Optional: `anthropic` (if using Claude directly), `litellm` (if using SDK directly)
#### File: v2/.env.example
- [ ] Update variable names
- [ ] Add system prompt variables
- [ ] Document new configuration options
### Testing
- [ ] Test basic message responses (no tools)
- [ ] Test with images attached
- [ ] Test tool calling with simple tool (e.g., get_time)
- [ ] Test tool calling with external MCP server
- [ ] Test conversation threading
- [ ] Test token limit handling
- [ ] Test error scenarios (API down, tool failure, etc.)
- [ ] Test in multiple Discord servers/channels
### Documentation
- [ ] Update README.md with new setup instructions
- [ ] Document LiteLLM proxy setup
- [ ] Document MCP server configuration
- [ ] Add example system prompts
- [ ] Document available tools
- [ ] Add troubleshooting section
## Technical Considerations
### Token Management
- Most models have 4k-128k token context windows
- Message history can quickly consume tokens
- Reserve tokens for:
- System prompt: ~500-1000 tokens
- Tool definitions: ~100-500 tokens per tool
- Response: ~1000-2000 tokens
- History: remaining tokens
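Using mid-range figures from the reservations above, the arithmetic for an 8k-context model works out as follows (all numbers illustrative):
```python
CONTEXT_WINDOW = 8192   # example: an 8k-context model
SYSTEM_PROMPT = 800     # reserved for the system prompt
TOOL_DEFINITIONS = 500  # reserved for tool schemas
RESPONSE = 1500         # reserved for the model's reply

HISTORY_BUDGET = CONTEXT_WINDOW - SYSTEM_PROMPT - TOOL_DEFINITIONS - RESPONSE
print(HISTORY_BUDGET)   # 5392 tokens left for conversation history
```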
### Rate Limiting
- Discord: 5 requests per 5 seconds per channel
- LLM APIs: Varies by provider (OpenAI: ~3500 RPM for GPT-4)
- Implement queuing if needed
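One lightweight form of that queuing is a per-channel lock plus a minimum send interval; a sketch (the interval value is illustrative):
```python
import asyncio
import time
from collections import defaultdict

MIN_INTERVAL = 1.0  # seconds between sends per channel (~5 per 5s limit)
_locks: dict[int, asyncio.Lock] = defaultdict(asyncio.Lock)
_last_sent: dict[int, float] = defaultdict(float)

async def send_throttled(channel, content: str) -> None:
    """Serialize sends per channel and enforce a minimum spacing."""
    async with _locks[channel.id]:
        wait = MIN_INTERVAL - (time.monotonic() - _last_sent[channel.id])
        if wait > 0:
            await asyncio.sleep(wait)
        await channel.send(content)
        _last_sent[channel.id] = time.monotonic()
```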
### Error Handling
- API timeouts: Retry with exponential backoff
- Tool execution failures: Return error message to model
- Discord API errors: Log and notify user
- Invalid tool calls: Validate before execution
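A generic helper for the backoff strategy above might look like this (attempt count and base delay are illustrative):
```python
import asyncio
import random

async def with_retries(coro_factory, max_attempts: int = 3):
    """Retry an async call with exponential backoff plus jitter.

    coro_factory is a zero-argument callable returning a fresh coroutine,
    e.g. lambda: fetch_completion(...). Re-raises after the final attempt.
    """
    for attempt in range(max_attempts):
        try:
            return await coro_factory()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Delays of 1s, 2s, 4s, ... plus jitter to avoid thundering herds
            await asyncio.sleep(2 ** attempt + random.random())
```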
### Security Considerations
- **Tool access control**: Don't expose dangerous tools (file delete, system commands)
- **Input validation**: Sanitize tool arguments
- **Rate limiting**: Prevent abuse of expensive tools (web search)
- **API key security**: Never log or expose API keys
- **MCP filesystem access**: Restrict to safe directories only
### Cost Optimization
- Use smaller models for simple queries (gpt-3.5-turbo)
- Implement streaming for better UX
- Cache common queries
- Trim history aggressively
- Consider LiteLLM's caching features
## Future Enhancements
### Short Term
- [ ] Add slash commands for bot configuration
- [ ] Implement conversation reset command
- [ ] Add support for Discord threads
- [ ] Stream responses for long outputs
- [ ] Add reaction-based tool approval (user confirms before execution)
### Medium Term
- [ ] Multi-modal support (voice, more image formats)
- [ ] Per-user conversation isolation
- [ ] Tool usage analytics and logging
- [ ] Custom MCP server for Discord-specific tools
- [ ] Web dashboard for bot management
### Long Term
- [ ] Agentic workflows (multi-step tool usage)
- [ ] Memory/RAG for long-term context
- [ ] Multiple bot personalities per server
- [ ] Integration with Discord's scheduled events
- [ ] Voice channel integration (TTS/STT)
## Resources
### Documentation
- **LiteLLM Docs**: https://docs.litellm.ai/
- **LiteLLM Tools/Functions**: https://docs.litellm.ai/docs/completion/function_call
- **MCP Specification**: https://modelcontextprotocol.io/
- **MCP Server Examples**: https://github.com/modelcontextprotocol/servers
- **Discord.py Docs**: https://discordpy.readthedocs.io/
- **OpenAI API Docs**: https://platform.openai.com/docs/guides/function-calling
### Example MCP Servers
- `@modelcontextprotocol/server-filesystem`: File operations
- `@modelcontextprotocol/server-github`: GitHub integration
- `@modelcontextprotocol/server-postgres`: Database queries
- `@modelcontextprotocol/server-brave-search`: Web search
- `@modelcontextprotocol/server-slack`: Slack integration
- `@modelcontextprotocol/server-memory`: Persistent memory
### Tools for Development
- **tiktoken**: Token counting (OpenAI tokenizer)
- **litellm CLI**: `litellm --model gpt-4 --drop_params` for testing
- **Postman**: Test LiteLLM API endpoints
- **Docker**: Containerize LiteLLM proxy
## Questions to Resolve
1. **Which LiteLLM deployment?**
- Self-hosted proxy (more control, more maintenance)
- Hosted service (easier, potential cost)
2. **Which models to support?**
- Single model (simpler)
- Multiple models with fallback (more robust)
- User-selectable models (more flexible)
3. **MCP server hosting?**
- Same machine as bot
- Separate server
- Cloud functions
4. **System prompt strategy?**
- Single global prompt
- Per-guild prompts
- User-configurable
5. **Tool approval flow?**
- Automatic execution (faster but riskier)
- User confirmation for sensitive tools (safer but slower)
6. **Conversation persistence?**
- In-memory only (simple, lost on restart)
- SQLite (persistent, moderate complexity)
- Redis (distributed, more setup)
## Current Code Analysis
### v2/bot.py Strengths
- Clean, simple structure
- Proper async/await usage
- Good image handling
- Type hints in newer version
### v2/bot.py Issues to Fix
- Line 44: Using synchronous `requests.get()` in async function
- Lines 62-77: Embedding history in user message instead of proper conversation format
- Line 41: `channel_history` dict declared but never used
- No error handling for OpenAI API errors besides generic try/catch
- No rate limiting
- No conversation threading
- History includes ALL channel messages, not just bot-relevant ones
- No system prompt support
### scripts/discordbot.py Differences
- Has system message (line 67) - better approach!
- Slightly different message structure
- Otherwise similar implementation
## Recommended Migration Path
**Step 1**: Quick wins (minimal changes)
1. Add system prompt support using `scripts/discordbot.py` pattern
2. Fix async image download (use aiohttp)
3. Update env vars and client to point to LiteLLM
**Step 2**: Core refactor (moderate changes)
1. Refactor message history to proper conversation format
2. Implement token-aware history truncation
3. Add basic tool support infrastructure
**Step 3**: Tool integration (significant changes)
1. Define initial tool set
2. Implement tool execution loop
3. Add error handling for tool failures
**Step 4**: Polish (incremental improvements)
1. Add slash commands for configuration
2. Improve conversation management
3. Add monitoring and logging
This approach allows you to test at each step and provides incremental value.
---
## Getting Started
When you're ready to begin implementation:
1. **Set up LiteLLM proxy**:
```bash
pip install litellm
litellm --model gpt-4 --drop_params
# Or use Docker: docker run -p 4000:4000 ghcr.io/berriai/litellm:main
```
2. **Test LiteLLM endpoint**:
```bash
curl -X POST http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}'
```
3. **Start with system prompt**: Implement system prompt support first as low-risk improvement
4. **Iterate on tools**: Start with one simple tool, then expand
Let me know which phase you'd like to tackle first!

scripts/.env.sample

@@ -1,4 +1,24 @@
# Discord Bot Token - Get from https://discord.com/developers/applications
DISCORD_TOKEN=your_discord_bot_token
OPENAI_API_KEY=your_openwebui_api_key
OPENWEBUI_API_BASE=http://your_openwebui_instance:port/api
MODEL_NAME="Your_Model_Name"
# LiteLLM API Configuration
LITELLM_API_KEY=sk-1234
LITELLM_API_BASE=http://localhost:4000
# Model name (any model supported by your LiteLLM proxy)
MODEL_NAME=gpt-4-turbo-preview
# System Prompt Configuration (optional)
SYSTEM_PROMPT_FILE=./system_prompt.txt
# Maximum tokens to use for conversation history (optional, default: 3000)
MAX_HISTORY_TOKENS=3000
# Enable debug logging (optional, default: false)
# Set to 'true' to see detailed logs for troubleshooting
DEBUG_LOGGING=false
# Enable MCP tools integration (optional, default: false)
# Set to 'true' to allow the bot to use tools configured in your LiteLLM proxy
# Tools are auto-executed without user confirmation
ENABLE_TOOLS=false

scripts/discordbot.py

@@ -3,33 +3,52 @@ import discord
from discord.ext import commands
from openai import OpenAI
import base64
import requests
from io import BytesIO
from collections import deque
from dotenv import load_dotenv
import json
import datetime
import aiohttp
from typing import Dict, Any, List
import tiktoken
import httpx
# Load environment variables
load_dotenv()
# Get environment variables
DISCORD_TOKEN = os.getenv('DISCORD_TOKEN')
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
OPENWEBUI_API_BASE = os.getenv('OPENWEBUI_API_BASE')
LITELLM_API_KEY = os.getenv('LITELLM_API_KEY')
LITELLM_API_BASE = os.getenv('LITELLM_API_BASE')
MODEL_NAME = os.getenv('MODEL_NAME')
SYSTEM_PROMPT_FILE = os.getenv('SYSTEM_PROMPT_FILE', './system_prompt.txt')
MAX_HISTORY_TOKENS = int(os.getenv('MAX_HISTORY_TOKENS', '3000'))
DEBUG_LOGGING = os.getenv('DEBUG_LOGGING', 'false').lower() == 'true'
ENABLE_TOOLS = os.getenv('ENABLE_TOOLS', 'false').lower() == 'true'
# Configure OpenAI client to point to OpenWebUI
def debug_log(message: str):
"""Print debug message if DEBUG_LOGGING is enabled"""
if DEBUG_LOGGING:
print(f"[DEBUG] {message}")
# Load system prompt from file
def load_system_prompt():
"""Load system prompt from file, with fallback to default"""
try:
with open(SYSTEM_PROMPT_FILE, 'r', encoding='utf-8') as f:
return f.read().strip()
except FileNotFoundError:
return "You are a helpful AI assistant integrated into Discord."
SYSTEM_PROMPT = load_system_prompt()
# Configure OpenAI client to point to LiteLLM
client = OpenAI(
api_key=os.getenv('OPENAI_API_KEY'),
base_url=os.getenv('OPENWEBUI_API_BASE') # e.g., "http://localhost:8080/v1"
api_key=LITELLM_API_KEY,
base_url=LITELLM_API_BASE # e.g., "http://localhost:4000"
)
# Configure OpenAI
# TODO: The 'openai.api_base' option isn't read in the client API. You will need to pass it when you instantiate the client, e.g. 'OpenAI(base_url=OPENWEBUI_API_BASE)'
# openai.api_base = OPENWEBUI_API_BASE
# Initialize tokenizer for token counting
try:
encoding = tiktoken.encoding_for_model("gpt-4")
except KeyError:
encoding = tiktoken.get_encoding("cl100k_base")
# Initialize Discord bot
intents = discord.Intents.default()
@@ -37,42 +56,255 @@ intents.message_content = True
intents.messages = True
bot = commands.Bot(command_prefix='!', intents=intents)
# Message history cache
channel_history = {}
# Message history cache - stores recent conversations per channel
channel_history: Dict[int, List[Dict[str, Any]]] = {}
async def download_image(url):
response = requests.get(url)
if response.status_code == 200:
image_data = BytesIO(response.content)
base64_image = base64.b64encode(image_data.read()).decode('utf-8')
def count_tokens(text: str) -> int:
"""Count tokens in a text string"""
try:
return len(encoding.encode(text))
except Exception:
# Fallback: rough estimate (1 token ≈ 4 characters)
return len(text) // 4
async def download_image(url: str) -> str | None:
"""Download image and convert to base64 using async aiohttp"""
try:
async with aiohttp.ClientSession() as session:
async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as response:
if response.status == 200:
image_data = await response.read()
base64_image = base64.b64encode(image_data).decode('utf-8')
return base64_image
except Exception as e:
print(f"Error downloading image from {url}: {e}")
return None
async def get_chat_history(channel, limit=100):
async def get_available_mcp_tools():
"""Query LiteLLM for available MCP servers and tools, convert to OpenAI format"""
try:
base_url = LITELLM_API_BASE.rstrip('/')
headers = {"x-litellm-api-key": LITELLM_API_KEY}
async with httpx.AsyncClient(timeout=30.0) as http_client:
# Get MCP server configuration
server_response = await http_client.get(
f"{base_url}/v1/mcp/server",
headers=headers
)
if server_response.status_code == 200:
server_info = server_response.json()
debug_log(f"MCP server info: found {len(server_info) if isinstance(server_info, list) else 0} servers")
# Get available MCP tools
tools_response = await http_client.get(
f"{base_url}/v1/mcp/tools",
headers=headers
)
if tools_response.status_code == 200:
tools_data = tools_response.json()
# Tools come in format: {"tools": [...]}
mcp_tools = tools_data.get("tools", []) if isinstance(tools_data, dict) else tools_data
debug_log(f"Found {len(mcp_tools) if isinstance(mcp_tools, list) else 0} MCP tools")
# Convert MCP tools to OpenAI function calling format
openai_tools = []
for tool in mcp_tools[:50]: # Limit to first 50 tools to avoid overwhelming the model
if isinstance(tool, dict) and "name" in tool:
openai_tool = {
"type": "function",
"function": {
"name": tool["name"],
"description": tool.get("description", ""),
"parameters": tool.get("inputSchema", {})
}
}
openai_tools.append(openai_tool)
debug_log(f"Converted {len(openai_tools)} tools to OpenAI format")
# Return both server info and converted tools
return {
"server": server_info,
"tools": openai_tools,
"tool_count": len(openai_tools)
}
else:
debug_log(f"MCP tools endpoint returned {tools_response.status_code}: {tools_response.text}")
else:
debug_log(f"MCP server endpoint returned {server_response.status_code}: {server_response.text}")
except Exception as e:
debug_log(f"Error fetching MCP tools: {e}")
return None
async def get_chat_history(channel, bot_user_id: int, limit: int = 50) -> List[Dict[str, Any]]:
"""
Retrieve chat history and format as proper conversation messages.
Only includes messages relevant to bot conversations.
Returns list of message dicts with proper role attribution.
Supports both regular channels and threads.
"""
messages = []
total_tokens = 0
# Check if this is a thread
is_thread = isinstance(channel, discord.Thread)
debug_log(f"Fetching history - is_thread: {is_thread}, channel: {channel.name if hasattr(channel, 'name') else 'DM'}")
# For threads, we want ALL messages in the thread (not just bot-related)
# For channels, we only want bot-related messages
message_count = 0
skipped_system = 0
# For threads, fetch the context including parent message if it exists
if is_thread:
try:
# Get the starter message (first message in thread)
if channel.starter_message:
starter = channel.starter_message
else:
starter = await channel.fetch_message(channel.id)
# If the starter message is replying to another message, fetch that parent
if starter and starter.reference and starter.reference.message_id:
try:
parent_message = await channel.parent.fetch_message(starter.reference.message_id)
if parent_message and (parent_message.type == discord.MessageType.default or parent_message.type == discord.MessageType.reply):
is_bot_parent = parent_message.author.id == bot_user_id
role = "assistant" if is_bot_parent else "user"
content = f"{parent_message.author.display_name}: {parent_message.content}" if not is_bot_parent else parent_message.content
# Remove bot mention if present
if not is_bot_parent and bot_user_id:
content = content.replace(f'<@{bot_user_id}>', '').strip()
msg = {"role": role, "content": content}
msg_tokens = count_tokens(content)
if msg_tokens <= MAX_HISTORY_TOKENS:
messages.append(msg)
total_tokens += msg_tokens
message_count += 1
debug_log(f"Added parent message: role={role}, content_preview={content[:50]}...")
except Exception as e:
debug_log(f"Could not fetch parent message: {e}")
# Add the starter message itself
if starter and (starter.type == discord.MessageType.default or starter.type == discord.MessageType.reply):
is_bot_starter = starter.author.id == bot_user_id
role = "assistant" if is_bot_starter else "user"
content = f"{starter.author.display_name}: {starter.content}" if not is_bot_starter else starter.content
# Remove bot mention if present
if not is_bot_starter and bot_user_id:
content = content.replace(f'<@{bot_user_id}>', '').strip()
msg = {"role": role, "content": content}
msg_tokens = count_tokens(content)
if total_tokens + msg_tokens <= MAX_HISTORY_TOKENS:
messages.append(msg)
total_tokens += msg_tokens
message_count += 1
debug_log(f"Added thread starter: role={role}, content_preview={content[:50]}...")
except Exception as e:
debug_log(f"Could not fetch thread messages: {e}")
# Fetch history from the channel/thread
async for message in channel.history(limit=limit):
content = f"{message.author.name}: {message.content}"
message_count += 1
# Handle attachments (images)
for attachment in message.attachments:
if any(attachment.filename.lower().endswith(ext) for ext in ['.png', '.jpg', '.jpeg', '.gif', '.webp']):
content += f" [Image: {attachment.url}]"
# Skip system messages (thread starters, pins, etc.)
if message.type != discord.MessageType.default and message.type != discord.MessageType.reply:
skipped_system += 1
debug_log(f"Skipping system message type: {message.type}")
continue
messages.append(content)
return "\n".join(reversed(messages))
# Determine if we should include this message
is_bot_message = message.author.id == bot_user_id
is_bot_mentioned = any(mention.id == bot_user_id for mention in message.mentions)
is_dm = isinstance(channel, discord.DMChannel)
# In threads: include ALL messages for full context
# In regular channels: only include bot-related messages
# In DMs: include all messages
if is_thread or is_dm:
should_include = True
else:
should_include = is_bot_message or is_bot_mentioned
if not should_include:
continue
# Determine role
role = "assistant" if is_bot_message else "user"
# Build content with author name in threads for multi-user context
if is_thread and not is_bot_message:
# Include username in threads for clarity
content = f"{message.author.display_name}: {message.content}"
else:
content = message.content
# Remove bot mention from user messages
if not is_bot_message and is_bot_mentioned:
content = content.replace(f'<@{bot_user_id}>', '').strip()
# Note: We'll handle images separately in the main flow
# For history, we just note that images were present
if message.attachments:
image_count = sum(1 for att in message.attachments
if any(att.filename.lower().endswith(ext)
for ext in ['.png', '.jpg', '.jpeg', '.gif', '.webp']))
if image_count > 0:
content += f" [attached {image_count} image(s)]"
# Add to messages with token counting
msg = {"role": role, "content": content}
msg_tokens = count_tokens(content)
# Check if adding this message would exceed token limit
if total_tokens + msg_tokens > MAX_HISTORY_TOKENS:
break
messages.append(msg)
total_tokens += msg_tokens
debug_log(f"Added message: role={role}, content_preview={content[:50]}...")
# Reverse to get chronological order (oldest first)
debug_log(f"Processed {message_count} messages, skipped {skipped_system} system messages")
debug_log(f"Total messages collected: {len(messages)}, total tokens: {total_tokens}")
return list(reversed(messages))
async def get_ai_response(context, user_message, image_urls=None):
async def get_ai_response(history_messages: List[Dict[str, Any]], user_message: str, image_urls: List[str] = None) -> str:
"""
Get AI response using LiteLLM with proper conversation history and tool calling support.
system_message = f"""Previous conversation context:{context}"""
Args:
history_messages: List of previous conversation messages with roles
user_message: Current user message
image_urls: Optional list of image URLs to include
messages = [
{"role": "system", "content": system_message},
{"role": "user", "content": [] if image_urls else user_message}
]
Returns:
AI response string
"""
# Start with system prompt
messages = [{"role": "system", "content": SYSTEM_PROMPT}]
# Handle messages with images differently
# Add conversation history
messages.extend(history_messages)
# Build current user message
if image_urls:
# Multi-modal message with text and images
content_parts = [{"type": "text", "text": user_message}]
for url in image_urls:
@@ -84,16 +316,77 @@ async def get_ai_response(context, user_message, image_urls=None):
"url": f"data:image/jpeg;base64,{base64_image}"
}
})
messages[1]["content"] = content_parts
messages.append({"role": "user", "content": content_parts})
else:
# Text-only message
messages.append({"role": "user", "content": user_message})
try:
response = client.chat.completions.create(
model=MODEL_NAME,
messages=messages
)
# Build request parameters
request_params = {
"model": MODEL_NAME,
"messages": messages,
"temperature": 0.7,
}
# Add MCP tools if enabled
if ENABLE_TOOLS:
debug_log("Tools enabled - fetching and converting MCP tools")
# Query and convert MCP tools to OpenAI format
mcp_info = await get_available_mcp_tools()
if mcp_info and isinstance(mcp_info, dict):
openai_tools = mcp_info.get("tools", [])
if openai_tools and isinstance(openai_tools, list) and len(openai_tools) > 0:
request_params["tools"] = openai_tools
request_params["tool_choice"] = "auto"
debug_log(f"Added {len(openai_tools)} tools to request")
else:
debug_log("No tools available to add to request")
else:
debug_log("Failed to fetch MCP tools")
debug_log(f"Calling chat completions with {len(request_params.get('tools', []))} tools")
response = client.chat.completions.create(**request_params)
# Handle tool calls if present
response_message = response.choices[0].message
tool_calls = getattr(response_message, 'tool_calls', None)
if tool_calls and len(tool_calls) > 0:
debug_log(f"Model requested {len(tool_calls)} tool calls")
# Add assistant's response with tool calls to messages
messages.append(response_message)
# Execute each tool call - add placeholder responses
# TODO: Implement actual MCP tool execution via LiteLLM proxy
for tool_call in tool_calls:
function_name = tool_call.function.name
function_args = tool_call.function.arguments
debug_log(f"Tool call requested: {function_name} with args: {function_args}")
# Placeholder response - in production this would execute via MCP
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"name": function_name,
"content": f"Tool execution via MCP is being set up. Tool {function_name} was called with arguments: {function_args}"
})
# Get final response from model after tool execution
debug_log("Getting final response after tool execution")
final_response = client.chat.completions.create(**request_params)
return final_response.choices[0].message.content
return response.choices[0].message.content
except Exception as e:
return f"Error: {str(e)}"
error_msg = f"Error calling LiteLLM API: {str(e)}"
print(error_msg)
debug_log(f"Exception details: {e}")
return error_msg
@bot.event
async def on_message(message):
@@ -101,6 +394,10 @@ async def on_message(message):
if message.author == bot.user:
return
# Ignore system messages (thread starter, pins, etc.)
if message.type != discord.MessageType.default and message.type != discord.MessageType.reply:
return
should_respond = False
# Check if bot was mentioned
@@ -111,10 +408,32 @@ async def on_message(message):
if isinstance(message.channel, discord.DMChannel):
should_respond = True
# Check if message is in a thread
if isinstance(message.channel, discord.Thread):
# Check if thread was started from a bot message
try:
starter = message.channel.starter_message
if not starter:
starter = await message.channel.fetch_message(message.channel.id)
# If thread was started from bot's message, auto-respond
if starter and starter.author.id == bot.user.id:
should_respond = True
debug_log("Thread started by bot - auto-responding")
# If thread started from user message, only respond if mentioned
elif bot.user in message.mentions:
should_respond = True
debug_log("Thread started by user - responding due to mention")
except Exception as e:
debug_log(f"Could not determine thread starter: {e}")
# Default: only respond if mentioned
if bot.user in message.mentions:
should_respond = True
if should_respond:
async with message.channel.typing():
# Get chat history
history = await get_chat_history(message.channel)
# Get chat history with proper conversation format
history_messages = await get_chat_history(message.channel, bot.user.id)
# Remove bot mention from the message
user_message = message.content.replace(f'<@{bot.user.id}>', '').strip()
@@ -125,10 +444,16 @@ async def on_message(message):
if any(attachment.filename.lower().endswith(ext) for ext in ['.png', '.jpg', '.jpeg', '.gif', '.webp']):
image_urls.append(attachment.url)
# Get AI response
response = await get_ai_response(history, user_message, image_urls)
# Get AI response with proper conversation history
response = await get_ai_response(history_messages, user_message, image_urls if image_urls else None)
# Send response
# Send response (split if too long for Discord's 2000 char limit)
if len(response) > 2000:
# Split into chunks
chunks = [response[i:i+2000] for i in range(0, len(response), 2000)]
for chunk in chunks:
await message.reply(chunk)
else:
await message.reply(response)
await bot.process_commands(message)
@@ -139,10 +464,16 @@ async def on_ready():
def main():
if not all([DISCORD_TOKEN, OPENAI_API_KEY, OPENWEBUI_API_BASE, MODEL_NAME]):
if not all([DISCORD_TOKEN, LITELLM_API_KEY, LITELLM_API_BASE, MODEL_NAME]):
print("Error: Missing required environment variables")
print(f"DISCORD_TOKEN: {'' if DISCORD_TOKEN else ''}")
print(f"LITELLM_API_KEY: {'' if LITELLM_API_KEY else ''}")
print(f"LITELLM_API_BASE: {'' if LITELLM_API_BASE else ''}")
print(f"MODEL_NAME: {'' if MODEL_NAME else ''}")
return
print(f"System Prompt loaded from: {SYSTEM_PROMPT_FILE}")
print(f"Max history tokens: {MAX_HISTORY_TOKENS}")
bot.run(DISCORD_TOKEN)
if __name__ == "__main__":

scripts/requirements.txt

@@ -1,4 +1,6 @@
discord.py
openai
python-dotenv
requests
discord.py>=2.0.0
openai>=1.0.0
python-dotenv>=1.0.0
aiohttp>=3.8.0
tiktoken>=0.5.0
httpx>=0.25.0

scripts/system_prompt.txt (new file)

@@ -0,0 +1,18 @@
You are a helpful AI assistant integrated into Discord. Users will interact with you by mentioning you, sending direct messages, or chatting in threads.
Key behaviors:
- Be concise and friendly in your responses
- Use Discord markdown formatting when helpful (code blocks, bold, italics, etc.)
- When users attach images, analyze them and provide relevant insights
- Keep track of conversation context from the chat history provided
- In threads, you have access to the full conversation context - reference previous messages when relevant
- In regular channels, you only see messages where you were mentioned
- If you're unsure about something, acknowledge it honestly
- Provide helpful and accurate information
Tool capabilities:
- You have access to various tools and integrations (like GitHub, file systems, etc.) that can help you accomplish tasks
- When appropriate, use available tools to provide more accurate and helpful responses
- If you use a tool, explain what you're doing so users understand the process
You are an AI assistant, not a human. Be transparent about your capabilities and limitations.

v2/.env.example

@@ -1,4 +1,24 @@
# Discord Bot Token - Get from https://discord.com/developers/applications
DISCORD_TOKEN=your_discord_bot_token
OPENAI_API_KEY=your_openai_api_key
OPENWEBUI_API_BASE=http://your.api.endpoint/v1
MODEL_NAME="Your_Model_Name"
# LiteLLM API Configuration
LITELLM_API_KEY=sk-1234
LITELLM_API_BASE=http://localhost:4000
# Model name (any model supported by your LiteLLM proxy)
MODEL_NAME=gpt-4-turbo-preview
# System Prompt Configuration (optional)
SYSTEM_PROMPT_FILE=./system_prompt.txt
# Maximum tokens to use for conversation history (optional, default: 3000)
MAX_HISTORY_TOKENS=3000
# Enable debug logging (optional, default: false)
# Set to 'true' to see detailed logs for troubleshooting
DEBUG_LOGGING=false
# Enable MCP tools integration (optional, default: false)
# Set to 'true' to allow the bot to use tools configured in your LiteLLM proxy
# Tools are auto-executed without user confirmation
ENABLE_TOOLS=false

v2/requirements.txt

@@ -1,4 +1,5 @@
discord.py
openai
python-dotenv
requests
discord.py>=2.0.0
openai>=1.0.0
python-dotenv>=1.0.0
aiohttp>=3.8.0
tiktoken>=0.5.0