Function Description
Implements retrieval-augmented generation (RAG) for LLM requests by integrating with Alibaba Cloud Vector Search Service, as shown in the figure below:
Running Attributes
Plugin execution phase: Default Phase
Plugin execution priority: 400
Configuration Description
Name | Data Type | Requirement | Default Value | Description |
---|---|---|---|---|
dashscope.apiKey | string | Required | - | Token used for authentication when accessing Tongyi Qianwen service. |
dashscope.serviceFQDN | string | Required | - | Tongyi Qianwen service name |
dashscope.servicePort | int | Required | - | Tongyi Qianwen service port |
dashscope.serviceHost | string | Required | - | Domain name for accessing Tongyi Qianwen service |
dashvector.apiKey | string | Required | - | Token used for authentication when accessing Alibaba Cloud Vector Search Service. |
dashvector.serviceFQDN | string | Required | - | Alibaba Cloud Vector Search service name |
dashvector.servicePort | int | Required | - | Alibaba Cloud Vector Search service port |
dashvector.serviceHost | string | Required | - | Domain name for accessing Alibaba Cloud Vector Search service |
dashvector.topk | int | Required | - | Number of vectors to retrieve from Alibaba Cloud Vector Search |
dashvector.threshold | float | Required | - | Vector distance threshold; retrieved documents whose distance exceeds this threshold are filtered out |
dashvector.field | string | Required | - | Field name where documents are stored in Alibaba Cloud Vector Search |
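
To illustrate how `dashvector.topk`, `dashvector.threshold`, and `dashvector.field` might interact, here is a minimal Python sketch. It is not the plugin's implementation: the `Hit` class, the `select_documents` function, and the sample data are hypothetical. It assumes each search hit carries a distance score where smaller means closer, and that a hit exactly at the threshold is kept.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical search hit; not a type from any real SDK."""
    doc_id: str
    score: float   # assumed to be a vector distance: smaller means closer
    fields: dict   # stored fields, e.g. {"raw": "<document text>"}


def select_documents(hits, topk, threshold, field):
    """Keep at most `topk` nearest hits, then drop those above `threshold`.

    Returns the text stored under `field` for each surviving hit.
    """
    nearest = sorted(hits, key=lambda h: h.score)[:topk]
    return [h.fields[field] for h in nearest if h.score <= threshold]


# Made-up sample data for illustration only.
hits = [
    Hit("doc-1", 0.12, {"raw": "first report"}),
    Hit("doc-2", 0.35, {"raw": "second report"}),
    Hit("doc-3", 0.58, {"raw": "third report"}),
]

# With topk=2 and threshold=0.4, the two nearest hits are both kept.
print(select_documents(hits, topk=2, threshold=0.4, field="raw"))
```

In this sketch, lowering the threshold shrinks the context passed to the LLM even when `topk` is large, and an empty result means the request would go to the LLM without any retrieved documents.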
When the plugin is enabled and tracing is in use, the IDs of the documents retrieved by RAG are added to the span’s attributes to aid troubleshooting.
Example
The CEC-Corpus dataset contains 332 news reports on emergency events, along with annotation data. The original news text is extracted, vectorized, and then added to Alibaba Cloud Vector Search Service. For a tutorial on text vectorization, see “Implementing Semantic Search Based on Vector Search Service and Lingji”.
Below is an example of a response enhanced with RAG. The original request was:
Without the RAG plugin, the LLM returned:
With the RAG plugin, the LLM returned: