This guide walks you through setting up the AI Proxy plugin with Cohere.
For all providers, the Kong AI Proxy plugin attaches to route entities.
It can be installed into one route per operation, for example:
- OpenAI chat route
- Cohere chat route
- Cohere completions route
Each of these AI-enabled routes must point to a null service. This service doesn’t need to map to any real upstream URL; it can point somewhere empty (for example, http://localhost:32000), because the AI Proxy plugin overwrites the upstream URL. This requirement will be removed in a later Kong revision.
Prerequisites
- Cohere account and subscription
- You need a service to contain the route for the LLM provider. Create a service first:
curl -X POST http://localhost:8001/services \
--data "name=ai-proxy" \
--data "url=http://localhost:32000"
Remember that the upstream URL can point anywhere empty, as it won’t be used by the plugin.
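To confirm that the service was created, you can query it back from the Admin API. The sketch below assumes the Admin API is listening on http://localhost:8001, as in the command above. The `KONG_ADMIN` variable and the `service_status` function are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: print the HTTP status code the Admin API returns
# for a named service (200 if it exists, 404 if it does not).
KONG_ADMIN="${KONG_ADMIN:-http://localhost:8001}"

service_status() {
  curl -s -o /dev/null -w '%{http_code}' "$KONG_ADMIN/services/$1"
}
```

Running `service_status ai-proxy` should print 200 once the service exists.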
Provider configuration
After creating a Cohere account and purchasing a subscription, you can then create an AI Proxy route and plugin configuration.
Set up route and plugin
Kong Admin API
YAML
Create the route:
curl -X POST http://localhost:8001/services/ai-proxy/routes \
--data "name=cohere-chat" \
--data "paths[]=~/cohere-chat$"
Enable and configure the AI Proxy plugin for Cohere, replacing <cohere_key> with your own API key:
curl -X POST http://localhost:8001/routes/cohere-chat/plugins \
--data "name=ai-proxy" \
--data "config.route_type=llm/v1/chat" \
--data "config.auth.header_name=Authorization" \
--data "config.auth.header_value=Bearer <cohere_key>" \
--data "config.model.provider=cohere" \
--data "config.model.name=command" \
--data "config.model.options.max_tokens=512" \
--data "config.model.options.temperature=1.0"
The equivalent YAML configuration for the route and plugin:
name: cohere-chat
paths:
  - "~/cohere-chat$"
methods:
  - POST
plugins:
  - name: ai-proxy
    config:
      route_type: "llm/v1/chat"
      auth:
        header_name: "Authorization"
        header_value: "Bearer <cohere_key>" # add your own Cohere API key
      model:
        provider: "cohere"
        name: "command"
        options:
          max_tokens: 512
          temperature: 1.0
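Because the plugin is installed into one route per operation, a Cohere completions endpoint would need its own route with its own plugin instance. The following is a minimal sketch of that pattern, not a tested configuration: the route name `cohere-completions`, its path, the `KONG_ADMIN` and `COHERE_KEY` variables, and the function name are all illustrative, and reusing the `command` model for completions is an assumption.

```shell
# Sketch: a second route for Cohere completions on the same "ai-proxy" service.
# Names and paths are illustrative, not taken from the guide.
KONG_ADMIN="${KONG_ADMIN:-http://localhost:8001}"

create_cohere_completions_route() {
  # Create the route on the existing null service.
  curl -X POST "$KONG_ADMIN/services/ai-proxy/routes" \
    --data 'name=cohere-completions' \
    --data 'paths[]=~/cohere-completions$'

  # Attach AI Proxy with the completions route type.
  # COHERE_KEY must hold your own API key; the call aborts if it is unset.
  curl -X POST "$KONG_ADMIN/routes/cohere-completions/plugins" \
    --data 'name=ai-proxy' \
    --data 'config.route_type=llm/v1/completions' \
    --data 'config.auth.header_name=Authorization' \
    --data "config.auth.header_value=Bearer ${COHERE_KEY:?set COHERE_KEY first}" \
    --data 'config.model.provider=cohere' \
    --data 'config.model.name=command'
}
```

Keeping the chat and completions routes separate means each can carry its own model options without affecting the other.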
Test the configuration
Make an llm/v1/chat type request to test your new endpoint:
curl -X POST http://localhost:8000/cohere-chat \
-H 'Content-Type: application/json' \
--data-raw '{ "messages": [ { "role": "system", "content": "You are a mathematician" }, { "role": "user", "content": "What is 1+1?"} ] }'
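For scripted checks, the request above can be wrapped so that only the model's reply is printed. This sketch assumes that `jq` is installed and that the response carries the reply at `.choices[0].message.content` (an OpenAI-style shape). Verify this against an actual response before relying on it. `KONG_PROXY`, `chat_body`, and `ask_cohere` are hypothetical names.

```shell
# Hypothetical helpers for scripting the test request; requires jq.
KONG_PROXY="${KONG_PROXY:-http://localhost:8000}"

# Build an llm/v1/chat request body; jq handles JSON escaping of the prompt.
chat_body() {
  jq -cn --arg prompt "$1" \
    '{messages: [{role: "system", content: "You are a mathematician"},
                 {role: "user", content: $prompt}]}'
}

# Send the prompt through the cohere-chat route and print only the reply text.
# Assumes the reply is found at .choices[0].message.content in the response.
ask_cohere() {
  curl -s -X POST "$KONG_PROXY/cohere-chat" \
    -H 'Content-Type: application/json' \
    --data-raw "$(chat_body "$1")" |
    jq -r '.choices[0].message.content'
}
```

For example, `ask_cohere 'What is 1+1?'` sends the same two-message conversation as the curl command above.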