Major refactoring to properly integrate with LiteLLM's Responses API, which handles
MCP tool execution automatically instead of requiring manual tool call loops.
Key changes:
- Switched from chat.completions.create() to client.responses.create()
- Use "server_url": "litellm_proxy" to leverage LiteLLM as MCP gateway
- Set "require_approval": "never" for fully automatic tool execution
- Simplified get_available_mcp_tools() to get_available_mcp_servers()
- Removed manual OpenAI tool format conversion (LiteLLM handles this)
- Updated response extraction to use output[0].content[0].text format
- Convert system prompts to user role for Responses API compatibility
Technical improvements:
- LiteLLM now handles the complete tool calling loop automatically
- No more placeholder responses - actual MCP tools will execute
- Cleaner code with ~100 fewer lines
- Better separation between tools-enabled and tools-disabled paths
- Proper error handling for Responses API format
Responses API benefits:
- Single API call returns final response with tool results integrated
- Automatic tool discovery, execution, and result formatting
- No manual tracking of tool_call_ids or conversation state
- Native MCP support via server_label configuration
Documentation:
- Added comprehensive litellm-mcp-research.md with API examples
- Documented Responses API vs chat.completions differences
- Included Discord bot migration patterns
- Covered authentication, streaming, and tool restrictions
Next steps:
- Test with actual Discord interactions
- Verify GitHub MCP tools execute correctly
- Monitor response extraction for edge cases
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>