Azure OpenAI binding spec
Detailed documentation on the Azure OpenAI binding component
Component format
To set up an Azure OpenAI binding, create a component of type `bindings.azure.openai`. See this guide on how to create and apply a binding configuration. See the Azure OpenAI Service documentation for details on the service itself.
```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: <NAME>
spec:
  type: bindings.azure.openai
  version: v1
  metadata:
  - name: apiKey # Required
    value: "1234567890abcdef"
  - name: endpoint # Required
    value: "https://myopenai.openai.azure.com"
```
Warning
The above example uses `apiKey` as a plain string. It is recommended to use a secret store for secrets, as described here.
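For instance, the `apiKey` can be pulled from a Dapr secret store instead of being inlined. The sketch below assumes a secret store component named `azure-secret-store` holding a secret named `openai-api-key`; both names are hypothetical placeholders:

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: <NAME>
spec:
  type: bindings.azure.openai
  version: v1
  metadata:
  - name: endpoint
    value: "https://myopenai.openai.azure.com"
  - name: apiKey
    secretKeyRef:
      name: openai-api-key   # hypothetical secret name
      key: openai-api-key
auth:
  secretStore: azure-secret-store  # hypothetical secret store component name
```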
Spec metadata fields
| Field | Required | Binding support | Details | Example |
|---|---|---|---|---|
| `endpoint` | Y | Output | Azure OpenAI service endpoint URL. | `"https://myopenai.openai.azure.com"` |
| `apiKey` | Y | Output | The access key of the Azure OpenAI service. Only required when not using Microsoft Entra ID authentication. | `"1234567890abcdef"` |
| `azureTenantId` | Y | Output | The tenant ID of the Azure OpenAI resource. Only required when `apiKey` is not provided. | `"tenantId"` |
| `azureClientId` | Y | Output | The client ID (application ID) of the Microsoft Entra ID application used to authenticate. Only required when `apiKey` is not provided. | `"clientId"` |
| `azureClientSecret` | Y | Output | The client secret of the Microsoft Entra ID application used to authenticate. Only required when `apiKey` is not provided. | `"clientSecret"` |
Microsoft Entra ID authentication
The Azure OpenAI binding component supports authentication using all Microsoft Entra ID mechanisms. For further information and the relevant component metadata fields to provide depending on the choice of Microsoft Entra ID authentication mechanism, see the docs for authenticating to Azure.
Example Configuration
```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: <NAME>
spec:
  type: bindings.azure.openai
  version: v1
  metadata:
  - name: endpoint
    value: "https://myopenai.openai.azure.com"
  - name: azureTenantId
    value: "***"
  - name: azureClientId
    value: "***"
  - name: azureClientSecret
    value: "***"
```
Binding support
This component supports output binding with the following operations:

- `completion`: Completion API
- `chat-completion`: Chat Completion API
- `get-embedding`: Embedding API
Completion API
To call the completion API with a prompt, invoke the Azure OpenAI binding with a POST
method and the following JSON body:
```json
{
  "operation": "completion",
  "data": {
    "deploymentId": "my-model",
    "prompt": "A dog is",
    "maxTokens": 5
  }
}
```
The data parameters are:

- `deploymentId` - string that specifies the model deployment ID to use.
- `prompt` - string that specifies the prompt to generate completions for.
- `maxTokens` - (optional) defines the max number of tokens to generate. Defaults to 16 for the completion API.
- `temperature` - (optional) defines the sampling temperature between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. Defaults to 1.0 for the completion API.
- `topP` - (optional) defines nucleus sampling, where the model considers only the tokens making up the top `topP` probability mass. Defaults to 1.0 for the completion API.
- `n` - (optional) defines the number of completions to generate. Defaults to 1 for the completion API.
- `presencePenalty` - (optional) number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. Defaults to 0.0 for the completion API.
- `frequencyPenalty` - (optional) number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. Defaults to 0.0 for the completion API.
Read more about the importance and usage of these parameters in the Azure OpenAI API documentation.
Example

```sh
curl -d '{ "data": {"deploymentId": "my-model", "prompt": "A dog is ", "maxTokens": 15}, "operation": "completion" }' \
  http://localhost:<dapr-port>/v1.0/bindings/<binding-name>
```
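The same request can be issued from code. The sketch below builds the completion request body and posts it to the Dapr sidecar's HTTP binding endpoint; the port and binding name are placeholders to substitute with your own values, and the helper names are illustrative, not part of the binding:

```python
import json
import urllib.request


def build_completion_request(deployment_id: str, prompt: str, max_tokens: int = 16) -> dict:
    """Build the JSON body expected by the binding's completion operation."""
    return {
        "operation": "completion",
        "data": {
            "deploymentId": deployment_id,
            "prompt": prompt,
            "maxTokens": max_tokens,
        },
    }


def invoke_binding(dapr_port: int, binding_name: str, body: dict):
    """POST the request body to the Dapr sidecar's binding endpoint."""
    req = urllib.request.Request(
        f"http://localhost:{dapr_port}/v1.0/bindings/{binding_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Requires a running Dapr sidecar, e.g.:
# choices = invoke_binding(3500, "my-openai-binding",
#                          build_completion_request("my-model", "A dog is", 15))
```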
Response
The response body contains the following JSON:
```json
[
  {
    "finish_reason": "length",
    "index": 0,
    "text": " a pig in a dress.\n\nSun, Oct 20, 2013"
  },
  {
    "finish_reason": "length",
    "index": 1,
    "text": " the only thing on earth that loves you\n\nmore than he loves himself.\"\n\n"
  }
]
```
Chat Completion API
To perform a chat-completion operation, invoke the Azure OpenAI binding with a POST
method and the following JSON body:
```json
{
  "operation": "chat-completion",
  "data": {
    "deploymentId": "my-model",
    "messages": [
      {
        "role": "system",
        "message": "You are a bot that gives really short replies"
      },
      {
        "role": "user",
        "message": "Tell me a joke"
      }
    ],
    "n": 2,
    "maxTokens": 30,
    "temperature": 1.2
  }
}
```
The data parameters are:

- `deploymentId` - string that specifies the model deployment ID to use.
- `messages` - array of messages that will be used to generate chat completions. Each message is of the form:
  - `role` - string that specifies the role of the message. Can be either `user`, `system` or `assistant`.
  - `message` - string that specifies the conversation message for the role.
- `maxTokens` - (optional) defines the max number of tokens to generate. Defaults to 16 for the chat completion API.
- `temperature` - (optional) defines the sampling temperature between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. Defaults to 1.0 for the chat completion API.
- `topP` - (optional) defines nucleus sampling, where the model considers only the tokens making up the top `topP` probability mass. Defaults to 1.0 for the chat completion API.
- `n` - (optional) defines the number of completions to generate. Defaults to 1 for the chat completion API.
- `presencePenalty` - (optional) number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. Defaults to 0.0 for the chat completion API.
- `frequencyPenalty` - (optional) number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. Defaults to 0.0 for the chat completion API.
Example
```sh
curl -d '{
  "data": {
    "deploymentId": "my-model",
    "messages": [
      {
        "role": "system",
        "message": "You are a bot that gives really short replies"
      },
      {
        "role": "user",
        "message": "Tell me a joke"
      }
    ],
    "n": 2,
    "maxTokens": 30,
    "temperature": 1.2
  },
  "operation": "chat-completion"
}' \
  http://localhost:<dapr-port>/v1.0/bindings/<binding-name>
```
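Note that the binding itself is stateless: for a multi-turn conversation, the caller carries the history forward by appending each assistant reply to the `messages` array before the next request. A minimal sketch of that bookkeeping (the helper names are illustrative, not part of the binding):

```python
def build_chat_request(deployment_id: str, messages: list, max_tokens: int = 16, n: int = 1) -> dict:
    """Build the JSON body expected by the binding's chat-completion operation."""
    return {
        "operation": "chat-completion",
        "data": {
            "deploymentId": deployment_id,
            "messages": messages,
            "maxTokens": max_tokens,
            "n": n,
        },
    }


def append_turn(messages: list, role: str, message: str) -> list:
    """Return a new history with one more turn; roles follow the binding's schema."""
    assert role in ("user", "system", "assistant")
    return messages + [{"role": role, "message": message}]


# Build up the conversation, one turn at a time.
history = append_turn([], "system", "You are a bot that gives really short replies")
history = append_turn(history, "user", "Tell me a joke")
body = build_chat_request("my-model", history, max_tokens=30)
# After receiving a reply, append it before the next request:
# history = append_turn(history, "assistant", reply_text)
```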
Response
The response body contains the following JSON:
```json
[
  {
    "finish_reason": "stop",
    "index": 0,
    "message": {
      "content": "Why was the math book sad? Because it had too many problems.",
      "role": "assistant"
    }
  },
  {
    "finish_reason": "stop",
    "index": 1,
    "message": {
      "content": "Why did the tomato turn red? Because it saw the salad dressing!",
      "role": "assistant"
    }
  }
]
```
Get Embedding API
The `get-embedding` operation returns a vector representation of a given input that can be easily consumed by machine learning models and other algorithms. To perform a `get-embedding` operation, invoke the Azure OpenAI binding with a `POST` method and the following JSON body:
```json
{
  "operation": "get-embedding",
  "data": {
    "deploymentId": "my-model",
    "message": "The capital of France is Paris."
  }
}
```
The data parameters are:

- `deploymentId` - string that specifies the model deployment ID to use.
- `message` - string that specifies the text to embed.
Example
```sh
curl -d '{
  "data": {
    "deploymentId": "embeddings",
    "message": "The capital of France is Paris."
  },
  "operation": "get-embedding"
}' \
  http://localhost:<dapr-port>/v1.0/bindings/<binding-name>
```
Response
The response body contains the following JSON:
```json
[0.018574921, -0.00023652936, -0.0057790717, .... (1536 floats total for ada)]
```
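Embedding vectors like the one above are typically compared with cosine similarity, e.g. to rank documents against a query. A small self-contained sketch, independent of the binding itself:

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # → 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # → 0.0
```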
Learn more about the Azure OpenAI output binding
Watch the following Community Call presentation to learn more about the Azure OpenAI output binding.