This guide walks you through setting up the AI Proxy plugin with Mistral.
Mistral is a self-hosted model, so it requires the model option upstream_url to be set, pointing to the absolute HTTP(S) endpoint of your model implementation.
There are several hosting and format options for running this LLM. Mistral offers a cloud-hosted service for consuming the LLM, available at Mistral.ai, and a number of self-hosted options are also available.
Upstream Formats
The “upstream” request and response formats differ between the various implementations of Mistral and their accompanying web servers.
For this provider, set the config.model.options.mistral_format parameter according to the following table:
| Mistral Hosting | mistral_format Config Value | Auth Header |
|---|---|---|
| Mistral.ai | openai | Authorization |
| OLLAMA | ollama | Not required by default |
| Self-Hosted GGUF | openai | Not required by default |
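Each hosting option also implies a different upstream_url. The endpoint paths below are illustrative defaults, not values from this guide — the Ollama and GGUF-server entries in particular depend on how your instance is deployed, so verify them against your own setup:

```shell
# Illustrative upstream_url values per hosting option (assumptions; check your deployment):
#
# Mistral.ai cloud (mistral_format: openai):
#   https://api.mistral.ai/v1/chat/completions
#
# Ollama (mistral_format: ollama), default local port 11434:
#   http://localhost:11434/api/chat
#
# Self-hosted GGUF server exposing an OpenAI-compatible API (mistral_format: openai):
#   http://localhost:8080/v1/chat/completions
```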
OpenAI Format
The openai format option follows the same upstream formats as the equivalent OpenAI route type operation (that is, llm/v1/chat or llm/v1/completions).
This format should be used when configuring the plugin for the cloud-hosted https://mistral.ai/ service.
It transforms both llm/v1/chat and llm/v1/completions route type requests into the same format.
Ollama Format
The ollama format option adheres to the chat and chat-completion request formats, as defined in the Ollama API documentation.
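As a sketch of the difference: the client always sends the llm/v1/chat (OpenAI-style) body to Kong, and with the ollama format the plugin translates it into the shape of Ollama's /api/chat endpoint upstream. Both bodies below are hand-written illustrations, not captured plugin output:

```shell
# What the client sends to Kong for an llm/v1/chat route (OpenAI-style):
OPENAI_BODY='{"messages":[{"role":"user","content":"What is 1+1?"}]}'

# Roughly the Ollama /api/chat shape targeted upstream
# (model name is injected from config.model.name; illustrative):
OLLAMA_BODY='{"model":"mistral","messages":[{"role":"user","content":"What is 1+1?"}],"stream":false}'

# Both formats carry the same messages array; print the user content from each:
echo "$OPENAI_BODY" | python3 -c 'import json,sys; print(json.load(sys.stdin)["messages"][0]["content"])'
echo "$OLLAMA_BODY" | python3 -c 'import json,sys; print(json.load(sys.stdin)["messages"][0]["content"])'
```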
Using the plugin with Mistral
For all providers, the Kong AI Proxy plugin attaches to route entities.
It can be installed into one route per operation, for example:
- OpenAI chat route
- Cohere chat route
- Cohere completions route
Each of these AI-enabled routes must point to a null service. This service doesn’t need to map to any real upstream URL; it can point somewhere empty (for example, http://localhost:32000), because the AI Proxy plugin overwrites the upstream URL. This requirement will be removed in a later Kong revision.
Prerequisites
You need a service to contain the route for the LLM provider. Create a service first:
curl -X POST http://localhost:8001/services \
--data "name=ai-proxy" \
--data "url=http://localhost:32000"
Remember that the upstream URL can point anywhere empty, as it won’t be used by the plugin.
Set up route and plugin
After installing and starting your Mistral instance, you can then create an AI Proxy route and plugin configuration.
You can create the route and plugin configuration using either the Kong Admin API or declarative YAML.
Create the route:
curl -X POST http://localhost:8001/services/ai-proxy/routes \
--data "name=mistral-chat" \
--data "paths[]=~/mistral-chat$"
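In the paths value, the leading ~ marks this as a regex route path in Kong 3.x, and the trailing $ anchors the match so sub-paths are excluded. As a quick local sanity check of the pattern itself (Kong applies its own route-matching rules on top of this):

```shell
# Which request paths does the pattern /mistral-chat$ match?
for p in /mistral-chat /mistral-chat/extra; do
  if printf '%s' "$p" | grep -qE '/mistral-chat$'; then
    echo "$p: matches"
  else
    echo "$p: no match"
  fi
done
# → /mistral-chat: matches
# → /mistral-chat/extra: no match
```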
Enable and configure the AI Proxy plugin for Mistral (using the openai format in this example):
curl -X POST http://localhost:8001/routes/mistral-chat/plugins \
--data "name=ai-proxy" \
--data "config.route_type=llm/v1/chat" \
--data "config.auth.header_name=Authorization" \
--data "config.auth.header_value=Bearer <MISTRAL_AI_KEY>" \
--data "config.model.provider=mistral" \
--data "config.model.name=mistral-tiny" \
--data "config.model.options.mistral_format=openai" \
--data "config.model.options.upstream_url=https://api.mistral.ai/v1/chat/completions"
The equivalent declarative YAML configuration:

name: mistral-chat
paths:
  - "~/mistral-chat$"
methods:
  - POST
plugins:
  - name: ai-proxy
    config:
      route_type: "llm/v1/chat"
      auth:
        header_name: "Authorization"
        header_value: "Bearer <MISTRAL_AI_KEY>"
      model:
        provider: "mistral"
        name: "mistral-tiny"
        options:
          mistral_format: "openai"
          upstream_url: "https://api.mistral.ai/v1/chat/completions"
Test the configuration
Make an llm/v1/chat type request to test your new endpoint:
curl -X POST http://localhost:8000/mistral-chat \
-H 'Content-Type: application/json' \
--data-raw '{ "messages": [ { "role": "system", "content": "You are a mathematician" }, { "role": "user", "content": "What is 1+1?"} ] }'
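A successful response comes back in the OpenAI-style chat completion format. As a sketch of handling it, the snippet below extracts the assistant reply; the embedded response is a hand-written sample for illustration, not real plugin output:

```shell
# Hand-written sample of an OpenAI-style chat completion response (illustrative):
RESPONSE='{"model":"mistral-tiny","choices":[{"index":0,"message":{"role":"assistant","content":"1+1 equals 2."}}]}'

# Extract the assistant reply from choices[0].message.content:
echo "$RESPONSE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])'
# → 1+1 equals 2.
```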