LLM


The LLM pipeline runs prompts through a large language model (LLM). This pipeline autodetects the LLM framework based on the model path.

Example

The following shows a simple example using this pipeline.

    from txtai import LLM

    # Create LLM pipeline
    llm = LLM()

    # Run prompt
    llm("""
    Answer the following question using the provided context.

    Question:
    What are the applications of txtai?

    Context:
    txtai is an open-source platform for semantic search and
    workflows powered by language models.
    """)

    # Chat messages are also supported
    llm([
        {"role": "system", "content": "You are a friendly assistant."},
        {"role": "user", "content": "Answer the following question..."}
    ])
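Responses can also be streamed incrementally by passing stream=True, as documented in the __call__ method below. A minimal sketch, assuming the default model:

    from txtai import LLM

    llm = LLM()

    # With stream=True, the response is yielded in chunks instead of returned all at once
    for chunk in llm("Write a short sentence about txtai.", stream=True):
        print(chunk, end="")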

The LLM pipeline automatically detects the underlying LLM framework. The framework can also be set manually with the method parameter.

See the LiteLLM documentation for the options available with LiteLLM models. llama.cpp models support both local and remote GGUF paths on the HF Hub.

    from txtai import LLM

    # Transformers
    llm = LLM("meta-llama/Meta-Llama-3.1-8B-Instruct")
    llm = LLM("meta-llama/Meta-Llama-3.1-8B-Instruct", method="transformers")

    # llama.cpp
    llm = LLM("microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-gguf")
    llm = LLM("microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-gguf",
              method="llama.cpp")

    # LiteLLM
    llm = LLM("ollama/llama3.1")
    llm = LLM("ollama/llama3.1", method="litellm")

    # Custom Ollama endpoint
    llm = LLM("ollama/llama3.1", api_base="http://localhost:11434")

    # Custom OpenAI-compatible endpoint
    llm = LLM("openai/llama3.1", api_base="http://localhost:4000")

    # LLM APIs - must also set API key via environment variable
    llm = LLM("gpt-4o")
    llm = LLM("claude-3-5-sonnet-20240620")
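Generation arguments can be passed at call time. maxlength is documented in the method reference below; any additional keyword arguments are forwarded to the underlying framework, so which ones are accepted depends on the selected backend. A minimal sketch:

    from txtai import LLM

    llm = LLM("meta-llama/Meta-Llama-3.1-8B-Instruct")

    # maxlength caps the length of the generated response (documented default is 512)
    llm("Summarize txtai in one sentence.", maxlength=256)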

Models can be externally loaded and passed to pipelines. This is useful for models that are not yet supported by Transformers and/or need special initialization.

    import torch

    from transformers import AutoModelForCausalLM, AutoTokenizer

    from txtai import LLM

    # Load Phi 3.5-mini
    path = "microsoft/Phi-3.5-mini-instruct"
    model = AutoModelForCausalLM.from_pretrained(
        path,
        torch_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(path)

    llm = LLM((model, tokenizer))
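The externally loaded model then runs through the same interface as a path-based LLM. The prompt below is only illustrative.

    # Run a prompt with the externally loaded model
    llm("What are the applications of txtai? Answer in one sentence.")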

See the links below for more detailed examples.

Notebook | Description
Prompt-driven search with LLMs | Embeddings-guided and Prompt-driven search with Large Language Models (LLMs)
Prompt templates and task chains | Build model prompts and connect tasks together with workflows
Build RAG pipelines with txtai | Guide on retrieval augmented generation including how to create citations
Integrate LLM frameworks | Integrate llama.cpp, LiteLLM and custom generation frameworks
Generate knowledge with Semantic Graphs and RAG | Knowledge exploration and discovery with Semantic Graphs and RAG
Build knowledge graphs with LLMs | Build knowledge graphs with LLM-driven entity extraction
Advanced RAG with graph path traversal | Graph path traversal to collect complex sets of data for advanced RAG
Advanced RAG with guided generation | Retrieval Augmented and Guided Generation
RAG with llama.cpp and external API services | RAG with additional vector and LLM frameworks
How RAG with txtai works | Create RAG processes, API services and Docker instances
Speech to Speech RAG ▶️ | Full cycle speech to speech workflow with RAG
Generative Audio | Storytelling with generative audio workflows

Configuration-driven example

Pipelines are run with Python or configuration. Pipelines can be instantiated in configuration using the lower case name of the pipeline. Configuration-driven pipelines are run with workflows or the API.

config.yml

    # Create pipeline using lower case class name
    llm:

    # Run pipeline with workflow
    workflow:
      llm:
        tasks:
          - action: llm
Similar to the Python example above, the underlying Hugging Face pipeline parameters and model parameters can be set in pipeline configuration.

    llm:
      path: microsoft/Phi-3.5-mini-instruct
      torch_dtype: torch.bfloat16

Run with Workflows

    from txtai import Application

    # Create and run pipeline with workflow
    app = Application("config.yml")
    list(app.workflow("llm", [
        """
        Answer the following question using the provided context.

        Question:
        What are the applications of txtai?

        Context:
        txtai is an open-source platform for semantic search and
        workflows powered by language models.
        """
    ]))

Run with API

    CONFIG=config.yml uvicorn "txtai.api:app" &

    curl \
      -X POST "http://localhost:8000/workflow" \
      -H "Content-Type: application/json" \
      -d '{"name":"llm", "elements": ["Answer the following question..."]}'
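The same workflow endpoint can be called from Python. A minimal sketch using the requests library, assuming the server started above is running on localhost:8000:

    import requests

    # POST to the workflow endpoint with the workflow name and input elements
    response = requests.post(
        "http://localhost:8000/workflow",
        json={"name": "llm", "elements": ["Answer the following question..."]}
    )
    print(response.json())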

Methods

Python documentation for the pipeline.

__init__(path=None, method=None, **kwargs)

Creates a new LLM.

Parameters:

Name | Description | Default
path | model path | None
method | llm model framework, infers from path if not provided | None
kwargs | model keyword arguments | {}

Source code in txtai/pipeline/llm/llm.py

    def __init__(self, path=None, method=None, **kwargs):
        """
        Creates a new LLM.

        Args:
            path: model path
            method: llm model framework, infers from path if not provided
            kwargs: model keyword arguments
        """

        # Default LLM if not provided
        path = path if path else "google/flan-t5-base"

        # Generation instance
        self.generator = GenerationFactory.create(path, method, **kwargs)

__call__(text, maxlength=512, stream=False, **kwargs)

Generates text. Supports the following input formats:

  • String or list of strings
  • List of dictionaries with role and content key-values or lists of lists

Parameters:

Name | Description | Default
text | input text or list of inputs | required
maxlength | maximum sequence length | 512
stream | stream response if True, defaults to False | False
kwargs | additional generation keyword arguments | {}

Returns:

generated text
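For example, a single prompt returns a single response, while a list of prompts returns a list of responses. A minimal sketch, with the batch behavior assumed from the supported input formats above:

    from txtai import LLM

    llm = LLM()

    # Single prompt -> single generated response
    answer = llm("Describe txtai in one sentence.", maxlength=128)

    # List of prompts -> list of generated responses (assumed from the input formats above)
    answers = llm([
        "Describe semantic search in one sentence.",
        "Describe retrieval augmented generation in one sentence."
    ], maxlength=128)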

Source code in txtai/pipeline/llm/llm.py

    def __call__(self, text, maxlength=512, stream=False, **kwargs):
        """
        Generates text. Supports the following input formats:

          - String or list of strings
          - List of dictionaries with role and content key-values or lists of lists

        Args:
            text: text|list
            maxlength: maximum sequence length
            stream: stream response if True, defaults to False
            kwargs: additional generation keyword arguments

        Returns:
            generated text
        """

        # Debug logging
        logger.debug(text)

        # Run LLM generation
        return self.generator(text, maxlength, stream, **kwargs)